## Abstract

Light microscopy on dynamic samples, for example neural activity in the brain, often requires imaging volumes that extend over several hundred micrometers in the axial direction at a rate of at least several tens of Hertz. Here, we develop a tomography approach for scanning fluorescence microscopy which allows recording a volume image in a single frame scan. Volumes are imaged by simultaneously recording four independent projections at different angles using temporally multiplexed, tilted Bessel beams. From the resulting projections, three-dimensional images are reconstructed using inverse Radon transforms combined with convolutional neural networks (U-net).

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

In many fluorescence microscopy experiments samples show dynamics that evolve not in a single focal plane but in a volume. For example, imaging neural activity in the brain requires monitoring multiple cells that are distributed in three dimensions [1, 2]. In this situation, the temporal resolution of the microscope needs to match the time scale of the sample dynamics, which, for neural activity monitored with calcium indicators, lies in the range of several tens of Hertz [1, 3, 4].

While such imaging rates are standard for scanning a single plane, they are more difficult to reach for volumes. The scan rate is ultimately limited by the requirement of integrating a sufficient number of photons per scanned pixel. To overcome this limit, solutions for volume imaging have been developed that scan multiple pixels in parallel [3, 4]. Multifocal microscopy, for example, images several spots at the same time, but requires detectors with multiple pixels, making this approach susceptible to scattering (for example when imaging in tissue) [3, 5]. Multiple focal spots can also be imaged in parallel with single pixel detectors: by introducing temporal offsets between different (typically pulsed) beams, fluorescence emission can be sorted into different temporal channels [6, 7].

To speed up volume imaging rates, extended volumes (instead of multiple focal spots) can be excited at the same time, for example using Bessel beams, Airy beams, line foci or light sheets [4, 8, 9]. Combined with camera detectors at an angle to the axis of excitation, as for example in Bessel beam light sheet microscopy [8], these approaches work best for weakly scattering samples [3, 4], but Bessel beams or line foci [5] can also be used for imaging with non-descanned PMT detection. This is more suitable for imaging in scattering tissue and results in a projection of the entire imaging volume along the beam axis [10–14]. While for Bessel beams this approach offers the advantage of recording from an entire volume in a single frame scan, it comes at the cost of losing all axial resolution.

Axial resolution was restored in experiments that used two Bessel beams which imaged the sample at different angles [11, 12, 14] for generating two different projections. These projections were either recorded sequentially [11, 12] or simultaneously (as required for moving samples such as brain tissue *in vivo*) in a single channel by averaging over two different projections [14]. From these projections axial information was recovered based on the distance between prominent, sparse features such as fluorescent beads or cell bodies [12, 14].

Recording multiple projections at different viewing angles generally forms the basis for reconstructing volumes in tomographic imaging [15]. While such tomographic approaches have been implemented for fluorescence microscopy by rotating the samples with respect to the illumination direction [16, 17], this is not compatible with most *in vivo* imaging preparations. A tomography approach compatible with fluorescence microscopy for *in vivo* imaging has recently been developed with Gaussian line foci generating projections in the focal plane and was combined with scanning the objective along the axial direction to record volumes [5].

We here develop a tomography method based on Bessel beams. Volume information is obtained from four independent projections recorded at the same time, in a single frame, from four different angles by scanning temporally multiplexed, tilted Bessel beams. For volume reconstruction we combine inverse Radon transforms adapted for Bessel beam scanning with machine learning [15, 18–22]. Machine learning has been shown for example in optical phase imaging to allow high resolution reconstruction from sparse projections at shallow angles similar to the ones used here [20]. We show that three-dimensional convolutional neural networks (U-net) trained on suitable datasets similarly improve reconstructions in Bessel beam tomography. This scanning microscopy method allows volumetric imaging with non-descanned detection in a single frame scan.

## 2. Results

#### 2.1. Setup

The setup is based on a custom-built two-photon scanning microscope controlled by Scanimage (Vidrio) in a resonant scanning configuration with photon counting. The microscope optics were adapted to scan four tilted Bessel beams across the sample as described in the following. Four different projections were generated by using four different Bessel beams, tilted with a mean angle of 166.75 degrees (standard deviation 2.75 degrees) with respect to the optical axis and with an azimuth separation of approximately 90 degrees between neighboring beams (see Table 1 and Fig. 8 for more details and for quantification). The setup is shown in Fig. 1(a) and pictures of the setup are shown in Figs. 9(a) and 9(b). A laser beam (Coherent Discovery, 120 fs pulse width) is split into four beams with non-polarizing plate beam splitters and the (nevertheless polarization dependent) splitting ratio is additionally balanced using a half-wave plate for each beam (not shown). The beams are separated with a delay of 3 ns (corresponding to a 1 m additional path length per beam) [6,7], which allows distinguishing the different projections with a single detector using photon counting at a time resolution faster than the pulse repetition rate, provided that the fluorescence lifetime is shorter than the pulse separation [6]. The beam width and collimation are adjusted for each beam with a telescope (two achromatic lenses, one mounted on a translation stage).

Each beam is sent through an axicon (AX252-B, all optical components were from Thorlabs unless otherwise noted) for generating a Bessel beam. The beam is collimated after the axicon with an achromatic lens (AC254-100-B) and the resulting ring is imaged onto the back focal plane of the microscope objective. For this, the ring is first demagnified and imaged onto the scan mirrors using a telescope (AC254-200-B and AC254-100-B), with the latter lens common to all four beam paths. The scan mirror is imaged onto the back focal plane of the objective using a scan lens (AC300-050-B) and tube lens (AC508-300-B).

To achieve tilted illumination from four opposing azimuth angles (see Fig. 3), the last lens in each beam path (before the beams are recombined) is displaced laterally and/or vertically for each beam, resulting in an off-center Bessel ring in each quadrant of the back focal plane of the objective. The displacement was in the range of 1-2 mm in the respective direction and was adjusted while observing the projection of the Bessel ring at the back aperture of the objective. The Bessel ring diameters at the back aperture of the objective were slightly below half the objective aperture diameter and allowed laterally displacing the beam without clipping. The four rings projected onto the back aperture of the objective are shown in Fig. 9(c).

Depending on the parameters and methods used to generate Bessel beams, different profiles in terms of width, length and axial intensity distribution result [23]. The axial intensity profile additionally depends on the placement of the collimating lenses [14, 23]. Here, we aimed for a beam length of about 80 *µ*m as a compromise between recording a sufficiently large volume while limiting the power required for two-photon excitation in each beam. The maximum power per beam used in the experiments was 90 mW measured at the entrance aperture of the resonant scanner. In each beam path the collimating lens was placed 11.5 cm behind the axicon, the first lens of the telescope was placed 30 cm behind the collimating lens which was separated by 28 cm from the lens common to all beam paths. The focal length of each beam was iteratively adjusted based on the two-photon point spread function by changing the beam collimation using the translation stage in the beam expander [14].

The four beams are recombined by first combining pairs of beams (after adjusting their polarizations orthogonally with a half-wave plate, not shown) with a polarizing beam splitter cube. All four beams are then combined with a right angle reflecting prism positioned just before the common telescope lens (that forms the demagnified image of the Bessel ring on the scan mirror). The reflecting prism is positioned laterally into the beam path using a translation stage, reflecting two beams at 90 degrees while allowing the other two beams to pass near the edge of the prism. This is possible without clipping any of the beams by laterally displacing the two passing beams into the appropriate neighboring quadrants in the objective back focal plane (as required for tilting of the Bessel beams).

An additional Gaussian beam path (not shown) was used for recording reference image z-stacks. A flip mirror was used to direct the beam through a telescope towards a polarizing beam splitter cube mounted on a magnetic mount that was manually inserted before the resonant scanner.

All experiments were performed with a custom-built two-photon microscope controlled with Scanimage (Vidrio) [24] with a resonant scanner (resonant scan box, Sutter Instruments) and components similar to the ones described in [25]. Scanimage allows FPGA-based (National Instruments, PXIe-7975R) photon counting, in our setup at a time resolution of 1.28 GHz (16 time bins between laser pulses at 80 MHz repetition rate). The resonance frequency of the resonant scan mirror was 8 kHz and images were acquired using bidirectional scanning (images are acquired during both movement directions across the sample). Overlaying the four beams resulted in a combined pulse repetition rate of 320 MHz and therefore four time bins per beam. To suppress cross talk between the fluorescence decays excited by the different beams, only three time bins were used for each beam and the intensity in these three bins was averaged into one channel (or image) using the scanning control software. The four different channels were selected with the help of the fluorescence decay time histogram displayed in the scanning control software.
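
The sorting of photon counts into beam channels described above can be sketched as follows. This is a simplified illustration, not the Scanimage implementation; the function name `demultiplex` and the assignment of four consecutive time bins per beam are assumptions for illustration:

```python
import numpy as np

def demultiplex(counts):
    """Sort photon counts from 16 time bins (1.28 GHz counting over an
    80 MHz pulse period) into four beam channels.

    counts: array of shape (..., 16) with photon counts per time bin.
    Assumes each beam owns 4 consecutive bins; to suppress cross talk
    from slow fluorescence decays, only the first 3 bins of each group
    are averaged into the channel image (as described in the text).
    """
    counts = np.asarray(counts)
    # group the 16 bins into 4 groups of 4 bins (one group per beam)
    groups = counts.reshape(counts.shape[:-1] + (4, 4))
    # average the first 3 bins of each group, discard the 4th
    return groups[..., :3].mean(axis=-1)
```

Applied per pixel, this turns the 16-bin photon count record into the four projection images.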

#### 2.2. Tomographic volume reconstruction

A typical 3D tomography approach, for example in medical imaging, acquires 2D projections of a sample by varying the azimuth angle at a fixed elevation of 90 degrees, as illustrated on the left side of Fig. 1(c). Changing the elevation to an oblique angle as shown on the right side of Fig. 1(c) distorts the reconstruction by a factor of 1/sin(*θ*). Using tilted Bessel beams to record projections results in a distortion along the *z* axis, which was (partially) corrected for using machine learning. The workflow of our tomography approach is shown in Fig. 1(b): first, four projections are recorded from a 3D sample by scanning the four different Bessel beams across the sample; second, back projection is applied, resulting in a distorted 3D reconstruction; and third, a convolutional neural network (3D U-net [26–28]) is used to recover the 3D object. These steps are described in more detail below.

### 2.2.1. Radon transform for Bessel beam scanning

For volume reconstruction we developed a 2D Radon transform based approach for Bessel beam scanning with tilted beams. In this approach, the amount of light emitted following excitation at a given point **x** ∈ ℝ^{3} in a volumetric sample is described by a function *f*: ℝ^{3} → ℝ. The Bessel beam is approximated as a line segment of length *ℓ* with constant intensity, centered at a point **x**_{c} ∈ ℝ^{3}, with a direction given by the normalized vector $\widehat{\mathbf{s}}\in {\mathbb{R}}^{3}$ and parameterized by *t*. The total amount of light emitted by the sample due to excitation by the beam is then described in terms of the Radon transform along the segment:

$$\mathcal{R}f\left({\mathbf{x}}_{c},\widehat{\mathbf{s}}\right)={\int}_{-\ell /2}^{\,\ell /2}f\left({\mathbf{x}}_{c}+t\,\widehat{\mathbf{s}}\right)dt$$

We define an image of size (2*N*+1) × (2*N*+1) as a function $I:\{\mathbb{Z}\cap [-N,N]\}\times \{\mathbb{Z}\cap [-N,N]\}\to \mathbb{R}$ representing the intensity of a pixel at a position with integer indices (*ν, µ*). To create an image, the Bessel beam is scanned with the two scanning mirrors across the field of view and for each pixel the resulting intensity is recorded. When scanning the tilted Bessel beams we observed small changes of both the center vector **x**_{c} and the direction vector $\widehat{\mathbf{s}}$ across large fields of view compared to a Gaussian reference z-stack (see Fig. 3 and Table 1 for quantification). Both vectors are therefore described as vector fields depending on the pixel coordinates (*ν, µ*) and the axial position *z*: the offset vector field, ${\mathbf{x}}_{c}={\mathbf{x}}_{c}(\nu ,\mu ,z)$, and the direction vector field, $\widehat{\mathbf{s}}=\widehat{\mathbf{s}}(\nu ,\mu ,z)$. A general expression for the intensity in each pixel is then given by:

$$I(\nu ,\mu ;z)={\int}_{-\ell /2}^{\,\ell /2}f\left({\mathbf{x}}_{c}(\nu ,\mu ,z)+t\,\widehat{\mathbf{s}}(\nu ,\mu ,z)\right)dt$$
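
A minimal numerical sketch of this forward model follows (nearest-neighbor sampling of the volume along the line segment; the function name `project` is a hypothetical helper, and the constant-intensity line is an idealization of the Bessel beam):

```python
import numpy as np

def project(f, x_c, s_hat, length, n_samples=50):
    """Approximate line integral of the sample density f (a 3D array)
    along a segment of the given length, centered at x_c and oriented
    along the unit vector s_hat (one pixel of one projection).

    Nearest-neighbor sampling; samples outside the volume contribute
    zero (the beam leaves the recorded volume).
    """
    x_c = np.asarray(x_c, dtype=float)
    s_hat = np.asarray(s_hat, dtype=float)
    total = 0.0
    for t in np.linspace(-length / 2, length / 2, n_samples):
        p = np.rint(x_c + t * s_hat).astype(int)
        if np.all(p >= 0) and np.all(p < np.array(f.shape)):
            total += f[tuple(p)]
    return total * (length / n_samples)  # sample sum -> line integral
```

Evaluating `project` at every pixel's calibrated center and direction vector yields one projection image *I*(*ν, µ*).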

### 2.2.2. Calibration of Bessel beam projections

The vector fields ${\mathbf{x}}_{\mathbf{c}}^{(\mathbf{i})}$ and ${\widehat{\mathbf{s}}}^{(\mathbf{i})}$ are required for reconstructing the volumetric sample *f* (**x**) from its projections *I*^{(}^{i}^{)}(*ν, µ*; *z*) and are measured experimentally for each projection *I*^{(}^{i}^{)} with a volumetric sample of 1 *µ*m diameter fluorescent beads (Fluoresbrite, Cat. No. 17154-10, Polysciences) suspended in 2% Agarose (Agarose LE, M 3044, Glenaxxon) (see Fig. 2(a) and Fig. 3). For this, a z-stack was recorded while scanning the four Bessel beams, resulting in volumes of 2D projections, *V*^{(}^{i}^{)}(*ν, µ, z*) ≡ {*I*^{(}^{i}^{)}(*ν, µ*; *z*) | *z* ∈ [*z _{min}, z_{max}*]}. As quantified in Table 1, variations of the vector fields in the z-direction were small and are ignored in the following.

The recorded volumes were segmented by thresholding at the mean value after Gaussian filtering with a standard deviation of 5 pixels for noise reduction (manually tuned for optimizing segmentation), see Fig. 2(a) and (b). The center position of each segmented bead *b*, $({\nu}_{c}^{(i,b)},{\mu}_{c}^{(i,b)},{z}_{c}^{(i,b)})$, was identified using the *skimage* function *regionprops*, and Principal Component Analysis (PCA) was applied to obtain the axis with the highest eigenvalue as the direction vector ${\widehat{\mathbf{s}}}^{(i,b)}$, together with the center point $({\nu}_{c}^{(i,b)},{\mu}_{c}^{(i,b)},{z}_{c}^{(i,b)})$ and the length of the beam *ℓ*^{(}^{i,b}^{)}, see Fig. 2(c) and Fig. 3.
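
The axis extraction from a single segmented bead can be sketched as follows (PCA via the eigendecomposition of the coordinate covariance; the helper name `principal_axis` is hypothetical, and the actual pipeline uses *skimage* `regionprops` for the segmentation statistics):

```python
import numpy as np

def principal_axis(coords):
    """Direction and length of an elongated segmented object.

    coords: (n, 3) array of voxel coordinates of one segmented bead
    image (each bead appears elongated along the beam axis).
    Returns the centroid, the unit eigenvector of the coordinate
    covariance with the largest eigenvalue (the beam direction, up to
    sign), and the extent of the object along that axis (beam length).
    """
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    cov = np.cov((coords - center).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    direction = eigvecs[:, -1]               # largest-eigenvalue axis
    proj = (coords - center) @ direction     # positions along the axis
    length = proj.max() - proj.min()
    return center, direction, length
```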

Making use of all beads measured in each z-stack *i*, the set of directional vectors ${S}^{(i)}=\{{\widehat{\mathbf{s}}}^{(i,b)}\}$ was used to interpolate the direction vector field linearly across pixel coordinates:

$${\widehat{\mathbf{s}}}^{(i)}(\nu ,\mu ,z)={\mathbf{A}}_{s}\,(\nu ,\mu ,z{)}^{T}+{\mathbf{B}}_{s}\tag{4}$$

where **A**_{s} is a 3 × 3 matrix and **B**_{s} is a bias vector, both of which are parameters fit to the data *S*^{(}^{i}^{)}.

The length of the beam in each z-stack *i* is considered constant, and it is computed as the average value of the lengths obtained for all beads:

$$\ell^{(i)}=\frac{1}{{N}_{b}}\sum_{b=1}^{{N}_{b}}\ell^{(i,b)}$$

where *N*_{b} is the number of segmented beads.

Due to the selected beam length and collimation of the tilted Bessel beams, the different beams do not meet at the center of the field of view (similar to [14]). This offset leads to cropping of the projections, see Fig. 4(a), top row, for full frames and Fig. 4(a), center row, for cropped frames (as indicated by the white frames). An additional cropping is introduced by the offset between the reference z-stack recorded with a Gaussian beam used for calibration and the z-stack recorded with the Bessel beams (referred to as a Bessel beam z-stack below). For tomographic reconstructions, this offset was corrected for by computing an offset vector field. For this, one Bessel beam z-stack was used as a reference. For each centroid of each bead in the segmented reference Bessel beam z-stack, the offset vector to the same bead imaged in the three other segmented Bessel beam z-stacks was computed. Interpolating the offset vectors using linear regression for all three z-stacks results in the offset vector field:

$${\mathbf{x}}_{c}^{(i)}(\nu ,\mu ,z)={\mathbf{A}}_{c}\,(\nu ,\mu ,z{)}^{T}+{\mathbf{B}}_{c}$$
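
Both the direction vector field and the offset vector field described above are obtained by linear regression over per-bead measurements. A least-squares sketch of such a fit (the function name `fit_vector_field` is hypothetical):

```python
import numpy as np

def fit_vector_field(pixels, vectors):
    """Fit a linear model  v(p) = A @ p + B  to per-bead measurements.

    pixels:  (n, d) array of bead positions, e.g. (nu, mu) or (nu, mu, z).
    vectors: (n, 3) array of measured vectors at those positions
             (direction vectors or offset vectors).
    Returns A (3 x d) and the bias B (3,), fit by least squares.
    """
    pixels = np.asarray(pixels, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # augment with a constant column so the bias is fit jointly
    design = np.hstack([pixels, np.ones((len(pixels), 1))])
    coeffs, *_ = np.linalg.lstsq(design, vectors, rcond=None)
    A = coeffs[:-1].T          # (3, d) linear part
    B = coeffs[-1]             # (3,) bias
    return A, B
```

Evaluating the fitted model on a pixel grid then yields the interpolated vector field used for reconstruction.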

### 2.2.3. Back projection

Using the calibrated vector fields ${\mathbf{x}}_{\mathbf{c}}^{(i)},{\widehat{\mathbf{s}}}^{(i)}$, and length *ℓ*^{(}^{i}^{)} for each projection *i*, the inverse Radon transform, ℜ^{(−1)}*I*^{(}^{i}^{)}, can be applied to any projection *I*^{(}^{i}^{)}(*ν, µ*), where the pixels of *I*^{(}^{i}^{)} are projected along the corresponding direction vector ${\widehat{\mathbf{s}}}^{(\mathbf{i})}$ and translated with its corresponding offset vector ${\mathbf{x}}_{\mathbf{c}}^{(\mathbf{i})}$. This results in a 3D image:

$$\left({\mathfrak{R}}^{-1}{I}^{(i)}\right)\left({\mathbf{x}}_{c}^{(i)}(\nu ,\mu )+t\,{\widehat{\mathbf{s}}}^{(i)}(\nu ,\mu )\right)={I}^{(i)}(\nu ,\mu ),\quad t\in [{t}_{min},{t}_{max}]\tag{7}$$

The parameter *t* needs to be discretized: *t* = *t*_{min}, (*t*_{min} + *z*_{res}), …, (*t*_{max} − *z*_{res}), *t*_{max}, where *z*_{res} gives the *z* resolution. We obtain an approximated reconstruction of the volume as the sum of the inverse Radon transforms (back projection):

$${f}_{BP}(\mathbf{x})=\sum_{i=1}^{4}\left({\mathfrak{R}}^{-1}{I}^{(i)}\right)(\mathbf{x})\tag{8}$$

For small objects, such as the fluorescent beads in Fig. 4, the back projections were additionally filtered by setting the 3D image to zero if not all four inverse Radon transforms overlapped in a given position (*ν*, *µ*, *t*). For this filtering, a binary image was computed from each inverse Radon transform (obtained from equation (7)) by applying a Gaussian filter with 1 pixel standard deviation and thresholding at the mean value of the inverse Radon transform. This was done for each projection, and the binary inverse Radon transforms were then summed to form a thresholded back projection, which has a maximum value of 4 (one count per projection). This image was binarized, setting values below 4 to zero, creating a back projection mask with a value of 1 where the four beams intersect. The actual back projection obtained from equation (8) was finally multiplied element-wise with the back projection mask so that the density of the resulting image was non-zero only where the four beams intersected. The recorded projections are shown in Fig. 4(a) and the resulting reconstructed volume with only the intersecting regions of the four beams, as well as a comparison with the Gaussian reference z-stack, is shown in Fig. 4(b) (no convolutional networks were applied for this reconstruction).
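
The back projection and intersection mask can be sketched numerically (nearest-neighbor voxel assignment; the function names are hypothetical, and the Gaussian filtering and mean thresholding of the mask are reduced to a simple positivity test for brevity):

```python
import numpy as np

def back_project(image, x_c, s_hat, ts, shape):
    """Inverse Radon transform of one projection: smear each pixel
    value along its calibrated line x_c(nu, mu) + t * s_hat(nu, mu)
    for the discretized parameters ts (nearest-neighbor voxels)."""
    vol = np.zeros(shape)
    n_rows, n_cols = image.shape
    for nu in range(n_rows):
        for mu in range(n_cols):
            center = np.asarray(x_c[nu, mu], dtype=float)
            direction = np.asarray(s_hat[nu, mu], dtype=float)
            for t in ts:
                p = np.rint(center + t * direction).astype(int)
                if np.all(p >= 0) and np.all(p < np.array(shape)):
                    vol[tuple(p)] += image[nu, mu]
    return vol

def masked_back_projection(vols):
    """Sum per-beam back projections and zero out voxels that are not
    covered by every beam (the back projection mask)."""
    mask = np.all([v > 0 for v in vols], axis=0)
    return sum(vols) * mask
```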

### 2.2.4. Neural networks for distortion correction

For objects that are of similar size to the Bessel beam focal length, the back projected volume will extend over the entire axial range of the sampled volume due to the shallow projection angles. Neural networks were used to correct for distortions and to recover axial resolution in this situation (see Fig. 1(b) and (c)).

A three dimensional U-net (Fig. 5(a)) was trained using simulated 3D samples consisting of ellipsoids of different sizes (ranging from 1 *µm* to 40 *µm*) with homogeneous constant density (Fig. 5(b)). Using the Radon transform approach, four projections were generated from each simulated 3D sample using the vector fields measured during calibration, and shot noise was added to simulate detection noise (Fig. 5(b), bottom row). From these four projections a (distorted) 3D volume was reconstructed using back projection and a back projection mask (as described above) to obtain reconstructions based on all four beams (Fig. 5(b), left side). This reconstructed volume served as input to a 3D U-net (dimension 128 × 128 × 100 pixels) and the original simulated 3D sample (same dimensions) served as the output. The number of data samples generated was 1000.
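
Generating one such training sample can be sketched as follows (ellipsoid phantoms of unit density plus Poisson shot noise; the function names, voxel units, and noise scale are illustrative assumptions, not the exact simulation parameters):

```python
import numpy as np

def ellipsoid_phantom(shape=(128, 128, 100), n=5, rmin=1, rmax=20, seed=0):
    """Simulated 3D training sample: n ellipsoids of homogeneous unit
    density at random positions with random semi-axes (voxel units)."""
    rng = np.random.default_rng(seed)
    zz, yy, xx = np.meshgrid(*(np.arange(s) for s in shape), indexing="ij")
    vol = np.zeros(shape)
    for _ in range(n):
        c = rng.uniform(0, 1, 3) * np.array(shape)   # random center
        r = rng.uniform(rmin, rmax, 3)               # random semi-axes
        inside = (((zz - c[0]) / r[0]) ** 2
                  + ((yy - c[1]) / r[1]) ** 2
                  + ((xx - c[2]) / r[2]) ** 2) <= 1.0
        vol[inside] = 1.0
    return vol

def add_shot_noise(projection, scale=50.0, seed=0):
    """Simulate detection shot noise by Poisson-resampling a projection
    at a given mean photon count scale."""
    rng = np.random.default_rng(seed)
    return rng.poisson(projection * scale) / scale
```

The phantom is then forward projected with the calibrated vector fields, noise is added to the projections, and the masked back projection of the noisy projections forms the network input.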

Networks were built and trained using Python and *keras* with *TensorFlow* as the backend [29]. The 3D U-net had 804,809 parameters and was composed of blocks formed in the first part by 3D convolutional layers, batch normalization and 3D max pooling. After the bottleneck we used a combination of 3D convolutional layers, batch normalization and 3D upsampling layers to deconvolve the 3D image and recover the corrected reconstruction (Fig. 5).

The model was trained on a desktop computer running Ubuntu with an Intel Xeon E5-1650 v4 @ 3.60 GHz (12 threads) processor and a TITAN Xp graphics card, which allowed training the network for 100 epochs with batches of 5 samples in about 5 hours. The loss function was the mean squared error between the original simulated image and the network output, and we used the Adam optimizer to train the neural network. Results of applying this network to a pollen grain volume sample (Mixed Pollen Grains, part number 304264, Carolina) are shown in Fig. 6, where Fig. 6(a) shows the four recorded projections and Fig. 6(b), top row, shows the reconstruction using back projection. The back projection of the pollen grains was again filtered by a back projection mask to base the reconstruction on all four projections. The Gaussian filter of the back projection mask again had a 1 pixel standard deviation and the threshold was the mean of the inverse Radon transform of each projection. Due to the large size of the pollen grains (compared with the beam length), almost no axial resolution is recovered using back projection (Fig. 6(b), top row). However, axial information can be partially recovered with the U-net (Fig. 6(b), second row; compare with Fig. 6(b), last row for ground truth).

For samples that do not change their structure during the observation time but only their intensity, a Gaussian reference z-stack could first be recorded to obtain structural information, and Bessel beam tomography could subsequently be used to monitor activity at high temporal resolution [23]. We tested this approach by generating a structure-mask from the Gaussian z-stack by filtering it in the xy-plane with a Gaussian filter with 1 pixel standard deviation, thresholding it at the mean value, binarizing, and downsampling it to 128 pixel resolution by averaging over four pixels. The resulting mask was multiplied with the volume reconstructed using combined back projection and U-net filtering (Fig. 6(b), second row), which resulted in the reconstruction shown in Fig. 6(b), third row. The accuracy of the reconstructed intensities in this case still depends on the quality of the tomographic reconstruction.
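
The structure-mask computation can be sketched as follows (the function name `structure_mask` is hypothetical; re-binarizing the mask after block-averaging is one plausible reading of the described procedure):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_mask(z_stack, out_xy=128):
    """Binary structure mask from a Gaussian reference z-stack:
    smooth each xy-plane (1 px std), threshold at the stack mean,
    and downsample laterally by block-averaging to out_xy pixels."""
    # smooth only in the xy-plane (sigma 0 along z)
    smoothed = gaussian_filter(z_stack.astype(float), sigma=(0, 1, 1))
    binary = (smoothed > smoothed.mean()).astype(float)
    nz, ny, nx = binary.shape
    fy, fx = ny // out_xy, nx // out_xy
    # block-average, then re-binarize (any covered pixel -> 1)
    blocks = binary[:, :out_xy * fy, :out_xy * fx]
    blocks = blocks.reshape(nz, out_xy, fy, out_xy, fx).mean(axis=(2, 4))
    return (blocks > 0).astype(float)
```

Multiplying this mask voxel-wise with the U-net output restricts the reconstruction to the known structure.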

## 3. Discussion

We introduced a tomography method for scanning fluorescence microscopy: volumes were imaged in a single frame scan using temporally multiplexed and tilted Bessel beams for recording independent projections at four different viewing angles with a single objective. A tomographic reconstruction approach suitable for scanning with tilted Bessel beams was developed combining Radon transforms with three dimensional convolutional networks (3D U-net). We recovered volume information using fluorescent beads (Fig. 4) and pollen grain samples (Fig. 6). The frame rate depended on the resolution of the image and was 30 Hz for images with 512 × 512 pixel resolution (and increases with fewer pixels, for example to 90 Hz at 128 pixel resolution), and the largest reconstructed volume was 179 *µm* × 219 *µm* × 60 *µm* (Fig. 6).

Imaging using extended beams, lines or light sheets offers a promising approach for increasing the imaging rate beyond point scanning methods [1–4, 23]. Different from approaches that detect fluorescence using a camera, Bessel beam tomography uses non-descanned PMT detection. This is an advantage for imaging in tissue: although scattering affects the infrared excitation light, it does not lead to cross talk of the more strongly scattered visible emission between different detection channels or pixels [3, 9].

The depth of the field of view depends on the length of the Bessel beam and can be adjusted. For *in vivo* two-photon imaging, the power that can be sent into the sample is a limiting factor. Here we used a maximum of 90 mW per beam at a combined pulse rate of 320 MHz. Fluorescence excitation could likely be optimized by using lasers with higher energy per pulse and lower repetition rate without exceeding the damage threshold [4]. Lower repetition rates would also reduce crosstalk between channels and would allow imaging with fluorophores with longer lifetimes, such as typical genetically encoded calcium indicators. Additionally, using higher numerical aperture objectives (we here used 0.8) or using a larger fraction of the available N.A. at the cost of smaller tilt angles could further improve the lateral beam resolution and two-photon excitation efficiency. Bessel beam tomography could also be performed using single photon excitation, where laser power would not be a problem, or using three-photon excitation.

Different from previous approaches with tilted Bessel beams [11, 12, 14], we simultaneously recorded independent projections using temporal multiplexing. The advantage of simultaneous instead of sequential recordings [11, 12] lies in the increased frame rate and in reduced sensitivity to motion artefacts (which would interfere with the tomographic reconstruction), as typically observed in *in vivo* imaging. Increasing the number of projections to four is advantageous for volume reconstruction at shallow angles, as illustrated in Fig. 7, and adding more projections is expected to add only a limited benefit for an isolated object. Different from another recent tomography approach [5], projections here were recorded along the optical axis instead of across the focal plane. While axial projections at shallow angles come at the cost of lower resolution, this approach is compatible with a standard 2D scanning configuration and does not require axial scanning.

For volume reconstruction we used inverse Radon transforms combined with machine learning and adapted them for the presented Bessel beam scanning configuration. We modeled the Bessel beam as a constant line segment for computational simplicity, but models of the actual point spread function could be used for computing the inverse Radon transform.

Approaches combining imaging physics and machine learning have been successfully applied in other tomography techniques for improving the reconstruction from sparse, shallow angle projections [15, 18–22]. Overall, the quality of the reconstruction depends on the neural network model and training data. Our networks were limited by GPU memory and therefore only low resolution images (128 pixel resolution instead of 512) were used. Training the network models on higher resolution images will likely improve the reconstruction. Additionally, we used a limited training data set based on simulated, randomly distributed homogeneous spheres to approximate the pollen grain sample. Training a network on this data set was sufficient for partially reconstructing axial information, but, as expected, did not recover higher resolution features of the sample, such as small spikes on the pollen grains. The quality of the reconstruction is expected to improve if the training data is more similar to the sample. Therefore, increasing the training data set to include large numbers of more realistic samples and, if necessary, increasing the model complexity should further improve the reconstruction. It will be interesting to explore to what extent denser and less regular samples can be reconstructed with Bessel beam tomography and suitable training data sets. While for an isolated object four projections suffice for reconstruction (see Fig. 7), for denser samples the reconstruction would be expected to improve with the number of projections.

Overall, Bessel beam tomography is suitable for fast imaging of sparse volumetric samples that can be accessed only with a single objective, a situation that is commonly encountered when imaging neural activity from cell bodies in the brain.

## Appendix

Table 1 shows the variation of the elevation and azimuth angles for each Bessel beam across an entire z-stack. For each beam, a total of 10^{6} vectors were generated from the linear interpolation equation (4) after calibration, and the mean and standard deviation (std) of the beam angle with respect to the *x*-axis (azimuth) and *z*-axis (elevation) were computed for each of these vector fields. In the third column the change of direction of each Bessel beam with respect to the *z* axis (over a larger volume than used in experiments) is shown. Generating 10^{4} vectors in the top layer (*z* = 200 *µm*) and 10^{4} vectors in the bottom layer (*z* = 0 *µm*) from equation (4), the angle between each pair of vectors corresponding to the same pixel locations, as well as the mean and standard deviation for all vectors, were computed. The mean difference for each beam was close to zero and therefore any z-dependence of the vector fields could be ignored in the reconstruction.
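
The per-pixel angle comparison between two such vector fields can be computed with a short sketch (the function name `angle_stats` is hypothetical):

```python
import numpy as np

def angle_stats(v_top, v_bottom):
    """Mean and std of the angle (in degrees) between corresponding
    vectors of two vector fields, e.g. directions generated from the
    calibration fit in the top and bottom z-layers.

    v_top, v_bottom: (n, 3) arrays of vectors at matching pixels.
    """
    v1 = v_top / np.linalg.norm(v_top, axis=1, keepdims=True)
    v2 = v_bottom / np.linalg.norm(v_bottom, axis=1, keepdims=True)
    # clip to guard against round-off outside arccos' domain
    cos = np.clip(np.sum(v1 * v2, axis=1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    return angles.mean(), angles.std()
```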

## Funding

Max Planck Society, Center of Advanced European Studies and Research.

## Acknowledgments

We would like to thank Alex Turpin for initial experiments on Bessel beam generation, Georg Jaindl for Scanimage support, and Ivan Vishniakou for helpful discussions and comments on the manuscript.

## Disclosures

The authors declare that there are no conflicts of interest related to this article.

## References

**1. **P. J. Keller and M. B. Ahrens, “Visualizing whole-brain activity and development at the single-cell level using light-sheet microscopy,” Neuron **85**, 462–483 (2015).

**2. **J. A. Calarco and A. D. Samuel, “Imaging whole nervous systems: insights into behavior from worms to fish,” Nat. Methods **16**, 14 (2019).

**3. **S. Weisenburger and A. Vaziri, “A guide to emerging technologies for large-scale and whole-brain optical imaging of neuronal activity,” Ann. Rev. Neurosci. **41**, 431–452 (2018).

**4. **E. M. Hillman, V. Voleti, K. Patel, W. Li, H. Yu, C. Perez-Campos, S. E. Benezra, R. M. Bruno, and P. T. Galwaduge, “High-speed 3d imaging of cellular activity in the brain using axially-extended beams and light sheets,” Curr. Opin. Neurobiol. **50**, 190–200 (2018).

**5. **A. Kazemipour, O. Novak, D. Flickinger, J. S. Marvin, J. King, P. Borden, S. Druckmann, K. Svoboda, L. L. Looger, and K. Podgorski, “Kilohertz frame-rate two-photon tomography,” bioRxiv p. 357269 (2018).

**6. **W. Amir, R. Carriles, E. E. Hoover, T. A. Planchon, C. G. Durfee, and J. A. Squier, “Simultaneous imaging of multiple focal planes using a two-photon scanning microscope,” Opt. Lett. **32**, 1731–1733 (2007).

**7. **A. Cheng, J. T. Gonçalves, P. Golshani, K. Arisaka, and C. Portera-Cailliau, “Simultaneous two-photon calcium imaging at different depths with spatiotemporal multiplexing,” Nat. Methods **8**, 139 (2011).

**8. **J. Nylk and K. Dholakia, “Light-sheet fluorescence microscopy with structured light,” in *Neurophotonics and Biomedical Spectroscopy*, (Elsevier, 2019), pp. 477–501.

**9. **F. O. Fahrbach, P. Simon, and A. Rohrbach, “Microscopy with self-reconstructing beams,” Nat. Photonics **4**, 780 (2010).

**10. **G. Thériault, Y. De Koninck, and N. McCarthy, “Extended depth of field microscopy for rapid volumetric two-photon imaging,” Opt. Express **21**, 10095–10104 (2013).

**11. **G. Thériault, M. Cottet, A. Castonguay, N. McCarthy, and Y. De Koninck, “Extended two-photon microscopy in live samples with bessel beams: steadier focus, faster volume scans, and simpler stereoscopic imaging,” Front. Cell. Neurosci. **8**, 139 (2014).

**12. **Y. Yang, B. Yao, M. Lei, D. Dan, R. Li, M. Van Horn, X. Chen, Y. Li, and T. Ye, “Two-photon laser scanning stereomicroscopy for fast volumetric imaging,” PLOS ONE **11**, e0168885 (2016).

**13. **R. Lu, W. Sun, Y. Liang, A. Kerlin, J. Bierfeld, J. D. Seelig, D. E. Wilson, B. Scholl, B. Mohar, and M. Tanimoto *et al.*, “Video-rate volumetric functional imaging of the brain at synaptic resolution,” Nat. Neurosci. **20**, 620 (2017).

**14. **A. Song, A. S. Charles, S. A. Koay, J. L. Gauthier, S. Y. Thiberge, J. W. Pillow, and D. W. Tank, “Volumetric two-photon imaging of neurons using stereoscopy (vtwins),” Nat. Methods **14**, 420 (2017).

**15. **T. G. Feeman, *The mathematics of medical imaging* (Springer, 2010).

**16. **J. Sharpe, U. Ahlgren, P. Perry, B. Hill, A. Ross, J. Hecksher-Sørensen, R. Baldock, and D. Davidson, “Optical projection tomography as a tool for 3d microscopy and gene expression studies,” Science **296**, 541–545 (2002).

**17. **M. Rieckher, U. J. Birk, H. Meyer, J. Ripoll, and N. Tavernarakis, “Microscopic optical projection tomography in vivo,” PLOS ONE **6**, e18963 (2011).

**18. **X. F. Ma, M. Fukuhara, and T. Takeda, “Neural network ct image reconstruction method for small amount of projection data,” NUCL INSTRUM METH A **449**, 366–377 (2000).

**19. **F. Thaler, C. Payer, and D. Štern, “Volumetric reconstruction from a limited number of digitally reconstructed radiographs using cnns,” in Proc. of the OAGM Workshop, (2018), pp. 13–19.

**20. **A. Goy, G. Roghoobur, S. Li, K. Arthur, A. I. Akinwande, and G. Barbastathis, “High-resolution limited-angle phase tomography of dense layered objects using deep neural networks,” arXiv preprint arXiv:1812.07380 (2018).

**21. **T. C. Nguyen, V. Bui, and G. Nehmetallah, “Computational optical tomography using 3-d deep convolutional neural networks,” Opt. Eng. **57**, 043111 (2018).

**22. **D. Pelt, K. Batenburg, and J. Sethian, “Improving tomographic reconstruction from limited data using mixed-scale dense convolutional neural networks,” J. Imaging **4**, 128 (2018).

**23. **R. Lu, M. Tanimoto, M. Koyama, and N. Ji, “50 hz volumetric functional imaging with continuously adjustable depth of focus,” Biomed. Opt. Express **9**, 1964–1976 (2018).

**24. **T. A. Pologruto, B. L. Sabatini, and K. Svoboda, “Scanimage: flexible software for operating laser scanning microscopes,” Biomed. Eng. Online **2**, 13 (2003).

**25. **J. D. Seelig, M. E. Chiappe, G. K. Lott, A. Dutta, J. E. Osborne, M. B. Reiser, and V. Jayaraman, “Two-photon calcium imaging from head-fixed drosophila during optomotor walking behavior,” Nat. Methods **7**, 535 (2010).

**26. **O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in *Med. Image. Comput. Comput. Assist. Interv.*, (Springer, 2015), pp. 234–241.

**27. **Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, “3d u-net: learning dense volumetric segmentation from sparse annotation,” in *Med. Image. Comput. Comput. Assist. Interv.*, (Springer, 2016), pp. 424–432.

**28. **M. Weigert, U. Schmidt, T. Boothe, A. Müller, A. Dibrov, A. Jain, B. Wilhelm, D. Schmidt, C. Broaddus, S. Culley, M. Rocha-Martins, F. Segovia-Miranda, C. Norden, R. Henriques, M. Zerial, M. Solimena, J. Rink, P. Tomancak, L. Royer, F. Jug, and E. W. Myers, “Content-aware image restoration: pushing the limits of fluorescence microscopy,” Nat. Methods **15**, 1090 (2018).

**29. **F. Chollet, “Keras,” https://keras.io (2015).