Inverse design of optical needles with central zero-intensity points by artificial neural networks

Wei Xin; Qiming Zhang; Min Gu

doi:10.1364/OE.410073

1. Introduction

Owing to their axial central intensity minima and their ability to provide z-axis super-resolution, optical needles with central zero-intensity points have attracted broad attention for applications in spectroscopy [1,2] and 3D STED [3,4]. In addition, optical needles with central zero-intensity points are potentially useful in all-optical magnetic recording [5], optical lithography [6] and fluorescent imaging [7].

So far, optical needles with central zero-intensity points have been demonstrated theoretically [8–10] and experimentally [1–3,11,12]. They can be generated by a 4pi system [9,13] or a phase plate [1,8,11]. Although several theories and algorithms exist [1,4,8,9,14,15], they require physical insight and sweeps over many parameters, and the amplitude and resolution of the phase plate are fixed after optimization. Moreover, the initial parameters of these algorithms are set intuitively or empirically [16].

The general method of inverse design automatically provides the best possible output once the desired targets are input, and it has been used to demonstrate integrated-nanophotonics polarization beamsplitters with controllable extinction ratios [17]. Combined with artificial neural networks (ANNs), the method offers both the generality of numerical optimization schemes and the speed of an analytical solution [18], and it has been applied to the design of circuit elements [19–21], vectorial holography [22], nanophotonic particles [18] and metasurfaces [23].

In this work, we report on the inverse design of optical needles with central zero-intensity points by DANNs. For the first time, such needles can be inversely designed with different amplitudes and resolutions. We generate several optical needles with central zero-intensity points along the longitudinal direction, and their resolution is close to the axial diffraction limit (∼1λ). Surprisingly, the network can also realize the inverse design of multi-focal distributions and optical needles. We generate some optical needles with DANNs. The depth of focus (DOF) (>81% for both cases) of the optical needles ranges from 2.5λ to 8λ, and most of them are sub-diffractive. This study provides new insights into length-selectable optical tweezers [24–27] and super-resolution microscopy [3,11,12].

2. Network architecture for the inverse design of optical needles with central zero-intensity points

The DANN employed here is a multilayer perceptron neural network [18,28], which can directly predict optical needles with central zero-intensity points from the on-axis intensity distribution, owing to the axial and radial symmetry of the dipole arrays. The advantages of direct prediction are superior performance [28], low-dimensional sample data and a smaller memory footprint [18], which allow the network to be trained on a laptop. Figure 1 displays the framework of DANNs. The input to the DANN is a target on-axis intensity distribution I, while the output of the DANN is the amplitude function E₀ in the pupil plane.

Fig. 1. (a) Training process of DANNs. The parameters, pupil plane and distributions are unnormalized. (b) Flow-process diagram of the inverse design of optical needles with central zero-intensity points by DANNs. FFT denotes the fast Fourier transform. The predicted electric field in the pupil plane is radially polarized.


The determination process of the DANN is as follows. The on-axis intensity distribution is fed into the input layer I, which is connected to the hidden layer H by the first weight matrix W₁. The input to the hidden layer H is then passed through a sigmoid function, and the output of the hidden layer is weighted by the second weight matrix W₂. Finally, the output layer E₀ takes this weighted output, passes it through linear output neurons and returns the amplitude function. Here, the input layer receives a 203-pixel on-axis intensity distribution and the single hidden layer consists of 256 neurons.
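The forward pass described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the weights are random placeholders rather than trained values, the bias vectors `b1` and `b2` and the output size `n_out = 6` (three parameters for each of two dipole pairs) are assumed for the example, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid used as the hidden-layer activation."""
    return 1.0 / (1.0 + np.exp(-x))


def dann_forward(intensity, W1, b1, W2, b2):
    """Input layer -> sigmoid hidden layer -> linear output layer."""
    hidden = sigmoid(W1 @ intensity + b1)   # hidden layer H
    return W2 @ hidden + b2                 # linear output neurons


rng = np.random.default_rng(0)
n_in, n_hidden = 203, 256   # sizes stated in the text
n_out = 6                   # assumption: (A_n, d_n, beta_n) for 2 dipole pairs

# Random placeholder weights; a trained network would supply these.
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

target_intensity = rng.random(n_in)   # stand-in for a 203-pixel on-axis profile
prediction = dann_forward(target_intensity, W1, b1, W2, b2)
print(prediction.shape)
```

With trained weights, `prediction` would hold the quantities that the network returns for the given on-axis profile.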

Here we assume that many on-axis dipoles are distributed symmetrically along the z axis; these ‘virtual’ dipoles can easily be converted to a pupil-plane function according to the propagation function, which is further introduced in Eqs. (A4), (A5) and (A6). When the dipoles are distributed precisely, an optical needle with central zero-intensity points is created. We choose this method because optical needles with central zero-intensity points can be generated easily by changing parameters in the pupil plane. Besides, this method directly shows the characteristics of an assumed optical needle with central zero-intensity points. Moreover, it can be realized experimentally with a discrete complex pupil filter [16]. Figure 1(a) illustrates the training process of DANNs. The target pupil plane is derived from the assumed dipoles, which are described by several parameters. The on-axis distributions are fed into the DANNs after a fast Fourier transform (FFT). After several loops of forward and backward propagation, the DANN can predict the parameters. The amplitude A_n, the dipole spacing d_n and the initial phase β_n in Fig. 1(a) are defined in Appendix A. Figure 1(b) illustrates pairs of dipoles (red horizontal arrows) arranged symmetrically along the z axis [16]; the derivation from the dipoles to the pupil plane is indicated by the red slanted arrows and is detailed in Appendix A. In our work, we consider the cases N2 [16] (2 pairs of dipoles), N3, N4 and N5.

3. Training of DANNs

The training is carried out by gradient descent with random fixed step size based on Mean Squared Error (MSE) [29]

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - y_i' \right)^2. \tag{1}$$

In the training process, the MSE is labelled MSE_tra: y_i denotes the target parameters in Fig. 1(a), y_i′ denotes the predicted parameters in Fig. 1(a) and n denotes the number of samples. During the evaluation of the results, the MSE is labelled MSE_eva: y_i denotes the target on-axis intensity distribution, y_i′ denotes the predicted on-axis intensity distribution and n denotes the number of samples. The MSE serves as the residual function that measures the quality of prediction during the training process.
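As a small worked example of Eq. (1), the following sketch computes the MSE between a target vector and a predicted vector. The function name `mse` and the three-element vectors are illustrative only; the same function would apply whether the vectors hold parameters (MSE_tra) or on-axis intensities (MSE_eva).

```python
import numpy as np


def mse(target, predicted):
    """Mean squared error of Eq. (1)."""
    target = np.asarray(target, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((target - predicted) ** 2))


# Squared differences are 0, 0.25 and 1, so the mean is 1.25 / 3.
example = mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(example)
```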

To construct the network, we use the MATLAB neural network toolbox, as it is a well-maintained and easy-to-use toolbox [30,31]. The DANNs are three-layer feed-forward back-propagation networks [28] trained with the Levenberg-Marquardt backpropagation algorithm [32]. The three-layer structure is widely used in chemistry [33], metallurgy [34], meteorology [35] and water conservancy projects [36] for its capacity for non-linear prediction and recognition of high-dimensional correlations, and the Levenberg-Marquardt backpropagation algorithm is a mature algorithm in geography [37], electronics [38], economics [39] and water conservancy projects [40]. The starting weights and biases are randomized by the default settings of the toolbox, although specific weights and networks can also be supplied as the starting point to obtain better performance. Overfitting is one of the problems that should be avoided during training [28], so the toolbox applies a validation check that monitors whether the error on the validation dataset starts to rise [30,31,41]. Once the validation error has failed to decrease for a set number of iterations, the training stops and the weights and biases are output to avoid overfitting. Each DANN takes 0.5–2 minutes to train with the MATLAB neural network toolbox, which is less time than some algorithms require [42–44].
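The validation check can be illustrated with a generic early-stopping loop. This is a sketch of the general technique in Python, not the toolbox's internal code: the function name `early_stopping_epoch`, the parameter name `max_fail` as used here and the error sequence are assumptions chosen for the example.

```python
def early_stopping_epoch(val_errors, max_fail=3):
    """Return (best_epoch, best_error) for a sequence of validation errors.

    Training is considered stopped once the validation error has failed to
    improve on its best value for `max_fail` consecutive epochs; the weights
    from the best epoch are the ones that would be kept.
    """
    best_err = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                break
    return best_epoch, best_err


# Validation error falls until epoch 2 and then rises three times in a row.
best_epoch, best_err = early_stopping_epoch([1.0, 0.5, 0.4, 0.45, 0.5, 0.6])
print(best_epoch, best_err)
```

Keeping the weights from the best validation epoch, rather than the last one, is what guards against overfitting in this scheme.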

4. Network training and analyzing

The numbers of dipole pairs and the corresponding parameter ranges used to train the DANNs are given in Table 1. A_n is the amplitude in arbitrary units, d_n is the spacing of the symmetrical dipoles, β_n is the initial phase and λ is the wavelength of the incident beam.
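One way to draw training parameters from such ranges is sketched below for the N2 case. The ranges are copied from Table 1; uniform sampling, the column ordering and the names `N2_RANGES` and `sample_parameters` are assumptions made for illustration, since the text does not state how the samples were drawn.

```python
import numpy as np

# N2 ranges from Table 1, one (low, high) tuple per dipole pair.
N2_RANGES = {
    "A": [(0.0, 1.1), (0.0, 1.1)],      # amplitude, a.u.
    "d": [(1.0, 1.5), (4.0, 4.5)],      # spacing, in units of the wavelength
    "beta": [(0.6, 1.1), (2.6, 3.1)],   # initial phase, in units of pi
}


def sample_parameters(ranges, n_samples, rng):
    """Draw parameters uniformly (assumption) within the given ranges.

    Returns an array of shape (n_samples, 3 * n_pairs) whose columns are
    ordered A_1..A_n, d_1..d_n, beta_1..beta_n.
    """
    columns = []
    for key in ("A", "d", "beta"):
        for low, high in ranges[key]:
            columns.append(rng.uniform(low, high, size=n_samples))
    return np.column_stack(columns)


rng = np.random.default_rng(1)
params = sample_parameters(N2_RANGES, 1000, rng)
print(params.shape)
```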

Table 1. Parameters ranges of different dipole arrays for the training of DANNs


The error between the target and predicted parameters under the N2 and N5 conditions is displayed in Fig. 2(a). With traditional algorithms, the computation time and complexity grow exponentially as the number of parameters increases [18], so the prediction of high-N needles is hindered and N3 is close to the computational limit of computers. Here, although the error increases with the number of pairs, 5 pairs of dipoles can still be predicted easily and rapidly. When analyzing the predicted parameters of the optical needles in Appendix B, we find that some parameters fall outside the training ranges, which shows that the networks are well trained and functional.

Fig. 2. (a) Error between target parameters and predicted parameters under the N2 and N5 conditions. Each color represents one pair of parameters and contains the simulated (solid) and target (hollow) parameter. Different geometric markers are also used to distinguish each pair of parameters clearly. (b) and (c) show the gradient descent of MSE_tra for N2 and N5 during the training process.


According to Fig. 2(a), the networks become harder to converge as the parameter ranges are enlarged and the number of dipole arrays increases. The corresponding gradient descent of MSE_tra during training in the two situations is given in Figs. 2(b) and (c).

5. Inverse design of optical needles and optical needles with zero-intensity points

5.1 Inverse design of optical needles

This network can recognize different on-axis distributions once similar models have been fed in, such as optical needles, optical needles with central zero-intensity points and multi-focal distributions. After analyzing existing optical needles [16,45], we observe that optical needles have a flat top and that both edges of the on-axis intensity fall sharply from the top to nearly zero. Curve fitting and feature analysis show that a first-order Gaussian function fits the different edges well, so we simplify the model intuitively and create the Gaussian-edge model.

$$I_1 = A_1, \quad z_{left} \le z \le z_{right}, \tag{2}$$

$$I_1 = A_1 \cdot \exp\left[ -\left( \frac{z - B_1}{C_1} \right)^2 \right], \quad z < z_{left} \textrm{ or } z > z_{right}, \tag{3}$$

where I₁ denotes the unnormalized on-axis intensity, A₁ denotes the unnormalized top of the optical needles, B₁ shifts the position of the Gaussian function and C₁ influences the slope of the edge. With the Gaussian-edge model, we can generate the optical needles shown in Fig. 3. Besides the two optical needles presented in Fig. 3, the other 36 optical needles are shown in Appendix B.
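A target profile following Eqs. (2) and (3) can be sketched as below. The sketch assumes that B₁ is set to z_left on the left edge and to z_right on the right edge so that the profile is continuous, and that z is measured in wavelengths on a 203-point grid; these choices, the numerical values and the function name `gaussian_edge_profile` are illustrative assumptions rather than the settings used in this work.

```python
import numpy as np


def gaussian_edge_profile(z, A1, z_left, z_right, C1):
    """Flat top of height A1 on [z_left, z_right] with Gaussian edges."""
    z = np.asarray(z, dtype=float)
    # Assumption: each edge Gaussian is centred on the nearer plateau end.
    B1 = np.where(z < z_left, z_left, z_right)
    edges = A1 * np.exp(-((z - B1) / C1) ** 2)      # Eq. (3)
    inside = (z >= z_left) & (z <= z_right)
    return np.where(inside, A1, edges)              # Eq. (2) on the plateau


z = np.linspace(-5.0, 5.0, 203)   # axial position, assumed in units of wavelength
profile = gaussian_edge_profile(z, A1=1.0, z_left=-2.0, z_right=2.0, C1=0.5)
print(profile.max(), profile[0])
```

A smaller C₁ gives steeper edges, while z_left and z_right set the length of the flat top.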

Fig. 3. The first column is input (blue) and predicted (yellow) normalized on-axis intensity. The second column displays the predicted input pupil electric field. The third column shows the corresponding optical needles and evaluating parameters (white letters). The predicted input electric field includes both amplitude and phase. FWHM denotes full width at half maximum of central radial direction. R_m denotes the maximum radius of the entrance pupil. The predicted electric field in the pupil plane is radially polarized.


To evaluate the optical needles, we introduce the purity η [16]

$$\eta = \frac{\Phi_z}{\Phi_z + \Phi_r}, \tag{4}$$

where η denotes the percentage of longitudinal intensity in the total field. Φ_z and Φ_r are the longitudinal and radial components in one plane, given by

$$\Phi_{z(r)} = \int_0^{r_0} \left| E_{z(r)}(r,z) \right|^2 r\,dr, \tag{5}$$

where r₀ denotes the first zero point of the radial component intensity. Here we choose the focal plane (z=0) to evaluate the purity.
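Equations (4) and (5) can be evaluated numerically as sketched below. The sketch uses a simple trapezoidal rule and assumes that the radial grid `r` has already been truncated at r₀; the constant test fields and the function names are hypothetical and chosen only so that the result can be checked by hand.

```python
import numpy as np


def trapezoid_rule(y, x):
    """Composite trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def purity(r, E_z, E_r):
    """Eq. (4), with Phi_z and Phi_r computed from Eq. (5) on the grid r."""
    phi_z = trapezoid_rule(np.abs(E_z) ** 2 * r, r)
    phi_r = trapezoid_rule(np.abs(E_r) ** 2 * r, r)
    return phi_z / (phi_z + phi_r)


# Toy fields: |E_z| = 1 and |E_r| = 0.5 everywhere on [0, r0] with r0 = 1,
# so Phi_z = 0.5, Phi_r = 0.125 and the purity is 0.8.
r = np.linspace(0.0, 1.0, 1001)
eta = purity(r, np.ones_like(r), 0.5 * np.ones_like(r))
print(eta)
```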

5.2 Inverse design of optical needles with central zero-intensity points

The DANNs can also predict central minimized points. We found that the networks were sensitive to hollows in the middle of the on-axis distribution. After analyzing the feature of optical needles with central zero-intensity points and the convergence distance of the Gaussian function, we create the multi-trapezoidal-shape model. The one-trapezoidal-shape model can be shown as

$$I_1 = A_1, \quad z_{left} \le z \le z_{right}, \tag{6}$$

$$I_1 = k \cdot A_1 \cdot z + B_1, \quad z < z_{left} \textrm{ or } z > z_{right}, \tag{7}$$

where I₁ denotes the unnormalized on-axis intensity, A₁ denotes the unnormalized top of the optical needles with central zero-intensity points, k denotes the slope of the edge and B₁ denotes the intercept of the edge. With this model, we successfully predict many optical needles with central zero-intensity points, which are shown in Fig. 4. The contrasts (I_max/I_min) are above 10⁵−10⁶ and the resolutions are close to 1λ, so these are qualified central minimized points.
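A target with a central zero can be assembled from two such trapezoids, as in the following sketch. It assumes that the intercept B₁ is chosen on each edge so that the edge meets the plateau, that the left and right edges have slopes of opposite sign, and that the profile is clipped at zero; these choices, the numerical values and the function name `trapezoid_profile` are illustrative assumptions, not the settings used in this work.

```python
import numpy as np


def trapezoid_profile(z, A1, z_left, z_right, k):
    """One trapezoid: plateau A1 on [z_left, z_right], linear edges (Eq. (7))."""
    z = np.asarray(z, dtype=float)
    # Intercepts chosen so that each edge equals A1 at the plateau end.
    rising = k * A1 * z + (A1 - k * A1 * z_left)       # left edge
    falling = -k * A1 * z + (A1 + k * A1 * z_right)    # right edge
    profile = np.where(z < z_left, rising,
                       np.where(z > z_right, falling, A1))
    return np.clip(profile, 0.0, None)                 # assumption: no negative intensity


z = np.linspace(-5.0, 5.0, 203)   # axial position, assumed in units of wavelength
right = trapezoid_profile(z, A1=1.0, z_left=0.5, z_right=1.5, k=2.0)
left = trapezoid_profile(z, A1=1.0, z_left=-1.5, z_right=-0.5, k=2.0)
target = left + right             # both inner edges reach zero at z = 0
print(target.max(), target[101])
```

With k = 2 and plateaus starting 0.5 from the origin, the inner edges of the two trapezoids fall to zero exactly at z = 0, which produces the central hollow.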

Fig. 4. Generation of optical needles with central zero-intensity points by N2, N3, N4 and N5 network (the first four rows). The first column is input (blue) and predicted (yellow) normalized on-axis intensity. The second column displays the predicted input pupil electric field. The third column shows the corresponding optical needles with central zero-intensity points and evaluating parameters (white letters). Resolution is the total width from the central zero-intensity point to the closest half maximum point of both sides. Contrast represents the sharpness of the central zero-intensity points. The predicted input electric field includes both amplitude and phase. R_m denotes the maximum radius of the entrance pupil. The predicted electric field in the pupil plane is radially polarized.


The ratios of maximum on-axis intensity of focal field (I_maxfocal) to maximum on-axis input field (I_maxinput) are more than 10⁵ in all the samples provided below, which means that the beam is tightly focused into the focal region.

5.3 Inverse design of optical needles with zero-intensity points and multifocal regions

Additionally, we create needles with zero-intensity points (the first row of Fig. 5) with the N3 network, which have a flat top instead of a fast slope, and one multifocal region (the second row of Fig. 5). Nevertheless, as the flat top becomes wider and the shape of the on-axis distribution becomes more complex, it is hard to predict needles with three-layer DANNs because of the predictive limit of these networks. More complex networks, such as CNNs [46] or multi-layer DANNs, may perform better.

Fig. 5. The first row is an optical needle with a zero-intensity point and the second row is a multifocal region. The predicted input electric field includes both amplitude and phase. R_m denotes the maximum radius of the entrance pupil. The predicted electric field in the pupil plane is radially polarized.


6. Conclusion

Generating optical needles with central zero-intensity points of specific resolution and amplitude requires inverse design. In this work, we proposed the inverse design of optical needles with central zero-intensity points by DANNs, which permits, for the first time, the inverse design of such needles close to a specified resolution and amplitude. The resolution is close to the axial diffraction limit (∼1λ), which shows the potential of designing sub-diffractive optical needles with central zero-intensity points by DANNs. Meanwhile, the multifunctional DANNs can also realize the inverse design of sub-diffractive optical needles ranging from 2.5λ to 8λ and of multifocal regions. This study provides an opportunity for length-selectable optical tweezers [24–27] and super-resolution microscopy [3,11,12], and could be further applied in fluorescent imaging [7], optical lithography [6] and all-optical magnetic recording [5].

Appendix A: Acceleration of training data generation on a high numerical aperture lens

The optical system of the DANNs is modelled with a high-NA lens. Here, we choose a lens with an NA of 0.95. A radially polarized beam at a wavelength of 488 nm is selected as the incident beam [47].

The electric radiation in the pupil plane of the dipole array can be derived from [7,16,48]

$$\vec{E}_0(\theta) = C \sin\theta \, AF_N \, \vec{a}_\theta, \tag{A1}$$

$$AF_N = \sum_{n=1}^{N} A_n \left[ e^{j(k d_n \cos\theta + \beta_n)/2} + e^{-j(k d_n \cos\theta + \beta_n)/2} \right], \tag{A2}$$

where ${\vec{E}_0}(\theta )$ is the electric radiation in the pupil plane, C is the intrinsic impedance of air [16], normalized to 1 in our assumption, θ is the incident angle of the beam and ${\vec{a}_\theta }$ is a unit vector pointing from the spherical surface Ω in Fig. 1 (dotted arc) to the focal region. AF_N is the array factor [16], which depends on the amplitude A_n, the incidence angle θ, the dipole spacing d_n and the initial phase β_n. N here is the number of dipoles and is an even number. Furthermore, we can simplify Eq. (A2) to

$$AF_N = \sum_{n=1}^{N} 2 A_n \cos\left( \frac{k d_n \cos\theta + \beta_n}{2} \right). \tag{A3}$$
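The equivalence of Eqs. (A2) and (A3) follows from the identity e^{jx} + e^{−jx} = 2 cos x and can be checked numerically, as in the sketch below. The parameter values are those of sample 2 in the N2 table of Appendix B; the sketch assumes that the sum runs over dipole pairs, that lengths are expressed in wavelengths so that k = 2π, and that the maximum angle is arcsin(NA) for a lens in air. The function names are hypothetical.

```python
import numpy as np


def array_factor_exp(theta, A, d, beta, k):
    """Eq. (A2): sum over pairs of conjugate exponentials."""
    phase = (k * d[:, None] * np.cos(theta)[None, :] + beta[:, None]) / 2.0
    return np.sum(A[:, None] * (np.exp(1j * phase) + np.exp(-1j * phase)), axis=0)


def array_factor_cos(theta, A, d, beta, k):
    """Eq. (A3): the same sum written with cosines."""
    phase = (k * d[:, None] * np.cos(theta)[None, :] + beta[:, None]) / 2.0
    return np.sum(2.0 * A[:, None] * np.cos(phase), axis=0)


k = 2.0 * np.pi                                   # wavenumber, lengths in wavelengths
theta = np.linspace(0.0, np.arcsin(0.95), 500)    # NA = 0.95, lens assumed in air

A = np.array([0.89390, 0.85316])                  # a.u.
d = np.array([1.43310, 4.16182])                  # in units of the wavelength
beta = np.array([0.90501, 2.83446]) * np.pi       # rad

af_exp = array_factor_exp(theta, A, d, beta, k)
af_cos = array_factor_cos(theta, A, d, beta, k)
pupil_amplitude = np.sin(theta) * af_cos          # Eq. (A1) with C = 1
print(np.max(np.abs(af_exp - af_cos)))
```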

The incident beam can be modulated by a discrete complex pupil filter [16]. In our work, we assume the situation of N = 4 (N2) or 6 (N3) or 8 (N4) or 10 (N5).

When generating an axial cross section, the vector Debye integral is usually used because of its specific optimization and numerical precision [50], but the double integral is time-consuming. With the help of the fast Fourier transform [49], the calculation can be accelerated greatly: it took 0.6 seconds to produce a pair of samples on a laptop with an RTX 2060 GPU and a 9th-generation Intel i7 CPU.

The fast Fourier transforms of the three Debye electric field components for an objective lens obeying the sine condition and a radially polarized beam distribution across the pupil plane are expressed by Eqs. (A4), (A5) and (A6). We set the precision of the FFT to 20 nm in the radial position and 20 nm in the axial position.

$$E_x(z) = \mathrm{FFT}\left\{ C \sin\theta \, AF_N \cos\theta \cos\varphi \, \exp[ -ikr\sin\theta\cos(\varphi - \psi) - ikz\cos\theta ] \sin\theta \right\}, \tag{A4}$$

$$E_y(z) = \mathrm{FFT}\left\{ C \sin\theta \, AF_N \cos\theta \sin\varphi \, \exp[ -ikr\sin\theta\cos(\varphi - \psi) - ikz\cos\theta ] \sin\theta \right\}, \tag{A5}$$

$$E_z(z) = \mathrm{FFT}\left\{ C \sin\theta \, AF_N \sin\theta \, \exp[ -ikr\sin\theta\cos(\varphi - \psi) - ikz\cos\theta ] \sin\theta \right\}. \tag{A6}$$

Moreover, we choose the on-axis light intensity distribution as our samples because the dipoles’ distribution is symmetrical axially and radially [16], so there is no need to lock the distribution with two planes [46]. With this design, the input matrix dimension is reduced greatly and the network occupies less computer memory, so both the generation and the training process can be run on a laptop. Additionally, we would like to emphasize that the datasets are unnormalized and that a search-and-evaluation step is needed to find an optical needle after training.

After the fast Fourier transform, the intensity distribution comprises three Debye electric field components. The on-axis intensity is obtained at $x$ = 0 and $y$ = 0, as given in Eq. (A7).

$$I(z) = |E_x(z)|^2 + |E_y(z)|^2 + |E_z(z)|^2. \tag{A7}$$
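For illustration, the on-axis intensity can also be estimated without an FFT by integrating the integrand of Eq. (A6) directly over θ at r = 0. This is a swapped-in technique, not the FFT method used in this work, and it rests on several assumptions: on the axis the φ integration removes E_x and E_y (their integrands are proportional to cos φ and sin φ), constant prefactors are dropped, C = 1, lengths are in wavelengths so that k = 2π, and the sum in the array factor runs over dipole pairs. The function name `on_axis_intensity` is hypothetical, and the parameters are those of sample 2 in the N2 table of Appendix B.

```python
import numpy as np


def on_axis_intensity(z, A, d, beta, na=0.95, n_theta=2000):
    """Unnormalized I(z) at r = 0 by trapezoidal quadrature over theta."""
    k = 2.0 * np.pi                                    # lengths in wavelengths
    theta = np.linspace(0.0, np.arcsin(na), n_theta)   # lens assumed in air
    phase = (k * d[:, None] * np.cos(theta)[None, :] + beta[:, None]) / 2.0
    af = np.sum(2.0 * A[:, None] * np.cos(phase), axis=0)     # Eq. (A3)
    weight = np.sin(theta) ** 3 * af                   # integrand of Eq. (A6) at r = 0
    kernel = np.exp(-1j * k * np.outer(z, np.cos(theta)))     # shape (len(z), n_theta)
    integrand = kernel * weight[None, :]
    dtheta = theta[1] - theta[0]
    E_z = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]), axis=1) * dtheta
    return np.abs(E_z) ** 2                            # Eq. (A7) with E_x = E_y = 0


z = np.linspace(-5.0, 5.0, 203)
A = np.array([0.89390, 0.85316])
d = np.array([1.43310, 4.16182])
beta = np.array([0.90501, 2.83446]) * np.pi
intensity = on_axis_intensity(z, A, d, beta)
print(intensity.shape)
```

Because the weight is real, the computed intensity is symmetric about z = 0, which gives a quick sanity check on the result.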

Appendix B: Several optical needles and their parameters

The optical needles created by N2, N3, N4 and N5 DANN are listed in Tables 2, 3, 4 and 5. The sampling precision of the fast Fourier transform here is 10 nm in radial position and 10 nm in axial position.

Table 2. List of created optical needles by DANN of N2


Table 3. List of created optical needles by DANN of N3


Table 4. List of created optical needles by DANN of N4


Table 5. List of created optical needles by DANN of N5


Narrowing the parameter ranges is one of the most time-consuming steps, so the tables shown below will pave the way for further optimization. Besides, analysis of these parameters may inspire new ideas or physical models in future work.

Funding

National Natural Science Foundation of China (61975123); Zhangjiang National Innovation Demonstration Zone (ZJ2019-ZD-005).

Acknowledgments

We would like to thank all the members of the Centre for Artificial-Intelligence Nanophotonics for their support. Additionally, Wei Xin wants to thank Dr Yangyundou Wang, Mr. Yiming Li, Mr. Hao Dong and associate professor Zhiwei Bi for discussions on neural networks, MATLAB, Blender and academic English. Wei Xin would also specifically like to give his sincere thanks to his parents for their support over the years.

Disclosures

The authors declare no conflicts of interest.

References

1. Y. Ma and T. Ha, “Fight against background noise in stimulated emission depletion nanoscopy,” Phys. Biol. 16(5), 051002 (2019).

2. A. Barbotin, S. Galiani, I. Urbančič, C. Eggeling, and M. J. Booth, “Adaptive optics allows STED-FCS measurements in the cytoplasm of living cells,” Opt. Express 27(16), 23378–23395 (2019).

3. R. Schmidt, C. A. Wurm, S. Jakobs, J. Engelhardt, A. Egner, and S. W. Hell, “Spherical nanosized focal spot unravels the interior of cells,” Nat. Methods 5(6), 539–544 (2008).

4. M. G. Velasco, M. Zhang, J. Antonello, P. Yuan, E. Allgeyer, D. May, O. M’Saad, P. Kidd, A. Barentine, V. Greco, J. Grutzendler, M. Booth, and J. Bewersdorf, “3D super-resolution deep-tissue imaging in living mice,” (2019).

5. Y. Jiang, X. Li, and M. Gu, “Generation of sub-diffraction-limited pure longitudinal magnetization by the inverse Faraday effect by tightly focusing an azimuthally polarized vortex beam,” Opt. Lett. 38(16), 2957–2960 (2013).

6. Z. Gan, Y. Cao, R. A. Evans, and M. Gu, “Three-dimensional deep sub-diffraction optical beam lithography with 9 nm feature size,” Nat. Commun. 4(1), 2061 (2013).

7. H. Li, Y. Wang, and P. Chen, “Ultra-long optical needles with controllable homogeneously 3D spin-orientation produced with an annular spherical mirror,” in Proc. SPIE (2019), Eleventh International Conference on Information Optics and Photonics (CIOP 2019), 1120908 (20 December 2019).

8. Y. Li, L. Lai, C. Rui, and L. Wang, “Optimization of depletion focal spot in STED nanoscopy using amplitude manipulation,” Opt. Commun. 372, 132–136 (2016).

9. M. Luo, D. Sun, Y. Yang, S. Liu, J. Wu, Z. Ma, S. Sun, and X. Sun, “Three-dimensional isotropic STED microscopy generated by 4π focusing of a radially polarized vortex Laguerre–Gaussian beam,” Opt. Commun. 463, 125434 (2020).

10. Y. Li, H. Zhou, X. Liu, Y. Li, and L. Wang, “Effects of aberrations on effective point spread function in STED microscopy,” Appl. Opt. 57(15), 4164–4170 (2018).

11. J. Heine, C. A. Wurm, J. Keller-Findeisen, A. Schönle, B. Harke, M. Reuss, F. R. Winter, and G. Donnert, “Three dimensional live-cell STED microscopy at increased depth using a water immersion objective,” Rev. Sci. Instrum. 89(5), 053701 (2018).

12. T. A. Klar, S. Jakobs, M. Dyba, A. Egner, and S. W. Hell, “Fluorescence microscopy with diffraction resolution barrier broken by stimulated emission,” Proc. Natl. Acad. Sci. U. S. A. 97(15), 8206–8210 (2000).

13. X. Hao, E. S. Allgeyer, M. J. Booth, and J. Bewersdorf, “Point-spread function optimization in isoSTED nanoscopy,” Opt. Lett. 40(15), 3627–3630 (2015).

14. X. Yang, H. Xie, E. Alonas, Y. Liu, X. Chen, P. J. Santangelo, Q. Ren, P. Xi, and D. Jin, “Mirror-enhanced super-resolution microscopy,” Light: Sci. Appl. 5(6), e16134 (2016).

15. A. Barbotin, I. Urbančič, S. Galiani, C. Eggeling, M. Booth, and E. Sezgin, “z-STED Imaging and Spectroscopy to Investigate Nanoscale Membrane Structure and Dynamics,” Biophys. J. 118(10), 2448–2457 (2020).

16. J. Wang, W. Chen, and Q. Zhan, “Engineering of high purity ultra-long optical needle field through reversing the electric dipole array radiation,” Opt. Express 18(21), 21965–21972 (2010).

17. B. Shen, P. Wang, R. Polson, and R. Menon, “An integrated-nanophotonics polarization beamsplitter with 2.4 × 2.4 µm² footprint,” Nat. Photonics 9(6), 378–382 (2015).

18. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić, “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. 4(6), eaar4206 (2018).

19. E. Goi, Q. Zhang, X. Chen, H. Luan, and M. Gu, “Perspective on photonic memristive neuromorphic computing,” PhotoniX 1(1), 1–26 (2020).

20. Q. Zhang, H. Yu, M. Barbiero, B. Wang, and M. Gu, “Artificial neural networks enabled by nanophotonics,” Light: Sci. Appl. 8(1), 1–14 (2019).

21. W. Ma, Z. Liu, Z. A. Kudyshev, A. Boltasseva, W. Cai, and Y. Liu, “Deep learning for the design of photonic structures,” Nat. Photonics, 1–14 (2020).

22. H. Ren, W. Shao, Y. Li, F. Salim, and M. Gu, “Three-dimensional vectorial holography based on machine learning inverse design,” Sci. Adv. 4(1), 2061 (2020).

23. Z. Liu, D. Zhu, S. P. Rodrigues, K. T. Lee, and W. Cai, “Generative Model for the Inverse Design of Metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018).

24. B. Spektor, A. Normatov, and J. Shamir, “Singular beam microscopy,” Appl. Opt. 47(4), A78–A87 (2008).

25. T. Kinoshita, “Stress singularity near the crack-tip in silicon carbide: Investigation by atomic force microscopy,” Acta Mater. 46(11), 3963–3974 (1998).

26. T. A. Nieminen, N. R. Heckenberg, and H. Rubinsztein-Dunlop, “Computational modeling of optical tweezers,” Opt. Trapp. Opt. Micromanipulation 5514, 514–523 (2004).

27. Y. Shen, X. Wang, Z. Xie, C. Min, X. Fu, Q. Liu, M. Gong, and X. Yuan, “Optical vortices 30 years on: OAM manipulation from topological charge to multiple singularities,” Light: Sci. Appl. 8(1), 90 (2019).

28. B. P. Cumming and M. Gu, “Direct determination of aberration functions in microscopy by an artificial neural network,” Opt. Express 28(10), 14511–14521 (2020).

29. J. Prakash Vijay, N. Kumar, S. A. Professor, and M. T. Scholar, “Performance Analysis Of RLS Over LMS Algorithm For MSE In Adaptive Filters,” Int. J. Technol. Enhanc. Emerg. Eng. Res. 2, 40–44 (2014).

30. I. Kirbas and A. Kerem, “Short-Term Wind Speed Prediction Based on Artificial Neural Network Models,” Meas. Control 49(6), 183–190 (2016).

31. D. P. B. T. B. Strik, A. M. Domnanovich, L. Zani, R. Braun, and P. Holubar, “Prediction of trace compounds in biogas from anaerobic digestion using the MATLAB Neural Network Toolbox,” Environ. Model. Softw. 20(6), 803–810 (2005).

32. J. Moré, “The Levenberg-Marquardt algorithm: Implementation and theory,” Numer. Anal. 630, G. A. Watson, ed. (Springer-Verlag, Berlin, 1977).

33. T. B. Blank and S. D. Brown, “Nonlinear Multivariate Mapping of Chemical Data Using Feed-Forward Neural Networks,” Anal. Chem. 65(21), 3081–3089 (1993).

34. Y. C. Lin, J. Zhang, and J. Zhong, “Application of neural networks to predict the elevated temperature flow behavior of a low alloy steel,” Comput. Mater. Sci. 43(4), 752–758 (2008).

35. B. ZareNezhad and A. Aminian, “A multi-layer feed forward neural network model for accurate prediction of flue gas sulfuric acid dew points in process industries,” Appl. Therm. Eng. 30(6-7), 692–696 (2010).

36. A. Sarkar and P. Pandey, “River Water Quality Modelling Using Artificial Neural Network Technique,” Aquat. Procedia 4, 1070–1077 (2015).

37. Ö. Çelik, A. Teke, and H. B. Yildirim, “The optimized artificial neural network model with Levenberg-Marquardt algorithm for global solar radiation estimation in Eastern Mediterranean Region of Turkey,” J. Cleaner Prod. 116, 1–12 (2016).

38. B. G. Kermani, S. S. Schiffman, and H. T. Nagle, “Performance of the Levenberg-Marquardt neural network training method in electronic nose applications,” Sens. Actuators, B 110(1), 13–22 (2005).

39. S. Mammadli, “Financial time series prediction using artificial neural network based on Levenberg-Marquardt algorithm,” Procedia Comput. Sci. 120, 602–607 (2017).

40. A. J. Adeloye and A. De Munari, “Artificial neural network based generalized storage-yield-reliability models using the Levenberg-Marquardt algorithm,” J. Hydrol. 326(1-4), 215–230 (2006).

41. M. H. Beale, M. T. Hagan, and B. Demuth, Neural Network Toolbox User’s Guide (The MathWorks, Inc., 2013).

42. J. Lin, H. Zhao, Y. Ma, J. Tan, and P. Jin, “New hybrid genetic particle swarm optimization algorithm to design multi-zone binary filter,” Opt. Express 24(10), 10748–10758 (2016).

43. Y. Zha, J. Wei, H. Wang, and F. Gan, “Creation of an ultra-long depth of focus super-resolution longitudinally polarized beam with a ternary optical element,” J. Opt. (United Kingdom) 15 (2013).

44. S. W. K. Roper, S. Ryu, B. Seong, C. Joo, and I. Y. Kim, “A topology optimization implementation for depth-of-focus extension of binary phase filters,” Struct. Multidiscip. Optim. 62(5), 2731–2748 (2020).

45. H. Wang, L. Shi, B. Lukyanchuk, C. Sheppard, and C. T. Chong, “Creation of a needle of longitudinally polarized light in vacuum using binary optics,” Nat. Photonics 2(8), 501–505 (2008).

46. H. Ma, H. Liu, Y. Qiao, X. Li, and W. Zhang, “Numerical study of adaptive optics compensation based on Convolutional Neural Networks,” Opt. Commun. 433, 283–289 (2019).

47. R. Dorn, S. Quabis, and G. Leuchs, “Sharper focus for a radially polarized light beam,” Phys. Rev. Lett. 91(23), 233901 (2003).

48. J. Luo, H. Zhang, S. Wang, L. Shi, Z. Zhu, B. Gu, X. Wang, and X. Li, “Three-dimensional magnetization needle arrays with controllable orientation,” Opt. Lett. 44(4), 727–730 (2019).

49. M. Leutenegger, R. Rao, R. A. Leitgeb, and T. Lasser, “Fast focus field calculations,” Opt. Express 14(23), 11277–11291 (2006).

50. M. Gu, Advanced Optical Imaging Theory (Springer, 2000).

Network	A_n (a.u.)	d_n (λ)	β_n (rad)
N2	A₁ = 0–1.1	d₁ = 1–1.5λ	β₁ = 0.6–1.1π
	A₂ = 0–1.1	d₂ = 4–4.5λ	β₂ = 2.6–3.1π
N3	A₁ = 0–1.5	d₁ = 0–2λ	β₁ = 0–1.5π
	A₂ = 0–1.5	d₂ = 3–5λ	β₂ = 2–3.5π
	A₃ = 0–1.5	d₃ = 6–8λ	β₃ = 6–7.5π
N4	A₁ = 0.5–1.1	d₁ = 1.5–2λ	β₁ = 0.5–1π
	A₂ = 0.5–1.1	d₂ = 4.5–5λ	β₂ = 2–2.5π
	A₃ = 0.5–1.1	d₃ = 7.5–8λ	β₃ = 3.5–4π
	A₄ = 0.5–1.1	d₄ = 10.5–11λ	β₄ = 4–6π
N5	A₁ = 0.5–1.1	d₁ = 1.5–2λ	β₁ = 0.5–1π
	A₂ = 0.5–1.1	d₂ = 4.5–5λ	β₂ = 2–2.5π
	A₃ = 0.5–1.1	d₃ = 7.5–8λ	β₃ = 3.5–4π
	A₄ = 0.5–1.1	d₄ = 10.5–11λ	β₄ = 4–6π
	A₅ = 0.5–1.1	d₅ = 13.5–14λ	β₅ = 6–8π

Sample	A_n(a.u.)	d_n(λ)	β_n(π)	FWHM(λ)	DOF(λ)	Purity (%)
1	A₁=0.72735	d₁=1.15180	β₁=0.48293	0.53279	4.79508	∼70
	A₂=0.91275	d₂=4.25720	β₂=1.29954
2	A₁=0.89390	d₁=1.43310	β₁=0.90501	0.40984	4.54918	∼80
	A₂=0.85316	d₂=4.16182	β₂=2.83446
3	A₁=0.85183	d₁=1.41586	β₁=0.92079	0.40984	4.42623	∼80
	A₂=0.81494	d₂=4.15877	β₂=2.74850
4	A₁=0.85807	d₁=1.45740	β₁=0.85412	0.40984	4.18033	∼80
	A₂=0.82654	d₂=4.36826	β₂=6.37297
5	A₁=0.59983	d₁=1.05050	β₁=0.55931	0.49180	3.93433	∼70
	A₂=0.83828	d₂=3.81031	β₂=5.79606
6	A₁=0.82468	d₁=1.22556	β₁=0.69527	0.49180	3.68852	∼70
	A₂=0.89099	d₂=4.33597	β₂=5.08618
7	A₁=0.67293	d₁=1.26809	β₁=0.87925	0.40984	3.64754	∼75
	A₂=0.84803	d₂=4.19817	β₂=1.90315
8	A₁=0.70191	d₁=1.32260	β₁=0.60562	0.40984	3.36066	∼80
	A₂=1.02869	d₂=3.41683	β₂=7.08128

Sample	A_n(a.u.)	d_n(λ)	β_n(π)	FWHM(λ)	DOF(λ)	Purity (%)
1	A₁=−3.08179	d₁=−0.14243	β₁=−47.47429	0.49180	5.08197	∼70
	A₂=4.56600	d₂=4.41177	β₂=13.11296
	A₃=0.15835	d₃=6.39899	β₃=20.10301
2	A₁=−2.77284	d₁=0.44618	β₁=−48.04473	0.45082	4.50820	∼70
	A₂=4.59920	d₂=4.23786	β₂=13.69403
	A₃=0.05755	d₃=5.39046	β₃=16.93462
3	A₁=−3.17819	d₁=−0.27487	β₁=−50.03603	0.57377	4.46721	∼70
	A₂=4.31840	d₂=4.28316	β₂=14.43944
	A₃=0.21306	d₃=5.37162	β₃=16.87544
4	A₁=−2.70375	d₁=0.67171	β₁=−48.36182	0.40984	4.05738	∼75
	A₂=4.63210	d₂=4.06702	β₂=13.99131
	A₃=0.07224	d₃=5.14794	β₃=16.17274
5	A₁=−3.72747	d₁=−0.79387	β₁=−47.71651	0.40984	4.01639	∼75
	A₂=3.47470	d₂=4.57066	β₂=13.43052
	A₃=0.36493	d₃=7.41675	β₃=23.30040
6	A₁=−1.66447	d₁=0.61817	β₁=−34.47377	0.40984	3.89344	∼75
	A₂=3.34702	d₂=3.90646	β₂=8.02398
	A₃=0.01143	d₃=6.58181	β₃=20.67737
7	A₁=−2.59055	d₁=1.01768	β₁=−48.91182	0.40984	3.89344	∼75
	A₂=4.64836	d₂=3.82163	β₂=14.52058
	A₃=0.17451	d₃=5.08480	β₃=5.08480
8	A₁=−3.55915	d₁=−0.72542	β₁=−47.59325	0.45082	3.77049	∼75
	A₂=3.98569	d₂=12.90130	β₂=12.90130
	A₃=0.28783	d₃=7.04613	β₃=22.13608
9	A₁=−2.85206	d₁=1.27823	β₁=−44.98244	0.40984	3.72951	∼75
	A₂=2.66118	d₂=4.42676	β₂=13.86510
	A₃=0.63459	d₃=8.32690	β₃=26.15974
10	A₁=3.00477	d₁=1.01915	β₁=31.43151	0.40984	3.07377	∼75
	A₂=−1.36169	d₂=4.31890	β₂=−10.04686
	A₃=2.18636	d₃=8.05185	β₃=25.29565

Sample	A_n(a.u.)	d_n(λ)	β_n(π)	FWHM(λ)	DOF(λ)	Purity (%)
1	A₁=0.67816	d₁=3.44096	β₁=0.87128	0.40984	6.80328	∼75
	A₂=0.44995	d₂=0.60414	β₂=2.59416
	A₃=0.70030	d₃=5.99922	β₃=3.06320
	A₄=0.51878	d₄=12.35735	β₄=8.72029
2	A₁=1.25652	d₁=−0.19710	β₁=−0.42544	0.53279	4.91803	∼70
	A₂=1.13575	d₂=4.94609	β₂=2.23732
	A₃=2.03809	d₃=9.13821	β₃=3.17735
	A₄=2.37951	d₄=9.17231	β₄=0.98033
3	A₁=0.88848	d₁=3.40319	β₁=0.52386	0.45082	4.67213	∼75
	A₂=0.90170	d₂=1.35399	β₂=2.17931
	A₃=0.71098	d₃=4.57633	β₃=3.40030
	A₄=0.32732	d₄=12.60878	β₄=5.28447
4	A₁=0.88406	d₁=3.53836	β₁=0.41222	0.45082	4.22131	∼75
	A₂=0.96967	d₂=1.46427	β₂=2.13693
	A₃=0.52677	d₃=4.52025	β₃=3.28738
	A₄=0.37787	d₄=12.30756	β₄=5.79651
5	A₁=0.83805	d₁=1.21062	β₁=0.44836	0.48082	3.81148	∼75
	A₂=1.89812	d₂=3.80461	β₂=1.88421
	A₃=1.89204	d₃=7.90003	β₃=3.84589
	A₄=1.74514	d₄=9.55324	β₄=4.29264
6	A₁=0.84571	d₁=−1.27483	β₁=2.99032	0.40984	3.72951	∼80
	A₂=0.95319	d₂=4.08474	β₂=2.50852
	A₃=0.92709	d₃=8.69983	β₃=4.51043
	A₄=0.81614	d₄=9.26618	β₄=2.51483
7	A₁=0.91558	d₁=3.31458	β₁=0.93084	0.40984	3.72951	∼80
	A₂=0.54522	d₂=0.52579	β₂=2.70924
	A₃=0.71186	d₃=5.82576	β₃=3.13886
	A₄=0.40448	d₄=12.78865	β₄=7.55215
8	A₁=1.20428	d₁=0.65340	β₁=−0.19608	0.40984	3.68852	∼80
	A₂=0.93283	d₂=2.97823	β₂=1.69102
	A₃=−0.40906	d₃=6.11274	β₃=3.40589
	A₄=0.73353	d₄=9.38549	β₄=7.04153
9	A₁=0.89067	d₁=3.61726	β₁=0.30670	0.45082	3.52459	∼75
	A₂=1.05583	d₂=1.58414	β₂=2.09306
	A₃=0.29297	d₃=4.59049	β₃=3.11600
	A₄=0.51084	d₄=11.96290	β₄=6.16178
10	A₁=1.05169	d₁=3.70769	β₁=0.60215	0.40984	3.31967	∼75
	A₂=1.20354	d₂=1.70782	β₂=2.12771
	A₃=0.27933	d₃=4.38964	β₃=4.43373
	A₄=0.53219	d₄=12.09623	β₄=5.15816

Sample	A_n(a.u.)	d_n(λ)	β_n(π)	FWHM(λ)	DOF(λ)	Purity (%)
1	A₁=1.31745	d₁=1.49646	β₁=0.83178	0.40984	6.55738	∼80
	A₂=−0.51645	d₂=6.67250	β₂=2.01039
	A₃=1.51326	d₃=5.30725	β₃=5.25855
	A₄=−0.05145	d₄=8.26318	β₄=3.12249
	A₅=1.56189	d₅=16.69875	β₅=7.79184
2	A₁=0.99587	d₁=1.66989	β₁=0.57337	0.40984	5.81967	∼80
	A₂=−0.42643	d₂=7.84902	β₂=1.63361
	A₃=1.23794	d₃=5.20802	β₃=5.46092
	A₄=0.43432	d₄=8.13546	β₄=3.05678
	A₅=1.26386	d₅=16.09213	β₅=4.22084
3	A₁=−1.87147	d₁=0.02975	β₁=0.33071	0.45082	4.38525	∼75
	A₂=2.72212	d₂=3.92629	β₂=5.82653
	A₃=−1.05921	d₃=9.22262	β₃=3.16860
	A₄=0.08102	d₄=12.58013	β₄=6.81833
	A₅=1.18394	d₅=16.11548	β₅=21.35576
4	A₁=−2.15325	d₁=0.11171	β₁=0.47404	0.40984	4.13934	∼75
	A₂=2.97790	d₂=3.74900	β₂=6.40230
	A₃=−1.50633	d₃=9.16850	β₃=2.88006
	A₄=−0.34139	d₄=12.86384	β₄=6.98320
	A₅=0.73064	d₅=15.86902	β₅=21.65868
5	A₁=−1.10997	d₁=0.64329	β₁=1.24609	0.49180	3.56557	∼70
	A₂=1.87133	d₂=3.81211	β₂=3.66486
	A₃=−0.21207	d₃=7.77048	β₃=4.13045
	A₄=0.67589	d₄=10.04291	β₄=8.06377
	A₅=1.24111	d₅=18.28293	β₅=6.20995
6	A₁=−2.70129	d₁=1.29751	β₁=0.93395	0.40984	3.27869	∼75
	A₂=2.34583	d₂=4.08396	β₂=4.20441
	A₃=0.59615	d₃=8.50871	β₃=5.28765
	A₄=1.22963	d₄=11.23125	β₄=6.83541
	A₅=0.39203	d₅=19.02650	β₅=9.64341
7	A₁=−2.97749	d₁=0.88130	β₁=1.50746	0.40984	3.11475	∼75
	A₂=2.31776	d₂=3.71438	β₂=4.61956
	A₃=0.74186	d₃=8.10170	β₃=4.36040
	A₄=0.16965	d₄=11.58626	β₄=6.86009
	A₅=1.07967	d₅=18.47277	β₅=10.13473
8	A₁=−3.06769	d₁=0.43970	β₁=−0.14436	0.40984	2.90984	∼80
	A₂=4.17259	d₂=2.93989	β₂=3.53230
	A₃=−0.91383	d₃=9.06370	β₃=2.91408
	A₄=−0.01693	d₄=13.60303	β₄=6.56365
	A₅=0.02840	d₅=16.81437	β₅=7.63700
9	A₁=−2.70325	d₁=1.49386	β₁=2.33011	0.40984	2.58197	∼80
	A₂=2.38886	d₂=2.28172	β₂=4.84071
	A₃=−0.07177	d₃=8.65215	β₃=5.38212
	A₄=0.89017	d₄=11.81219	β₄=10.49580
	A₅=0.22184	d₅=18.91733	β₅=6.97086
10	A₁=−2.88396	d₁=0.35633	β₁=0.92762	0.40984	3.36066	∼80
	A₂=3.18066	d₂=2.56401	β₂=6.07435
	A₃=−0.24801	d₃=8.10078	β₃=2.95902
	A₄=−0.42657	d₄=12.87265	β₄=9.39266
	A₅=1.55839	d₅=16.71247	β₅=14.71590



Optics Express