Laser interferometry, as applied in cutting-edge length and displacement metrology, requires detailed analysis of systematic effects due to diffraction, which may affect the measurement uncertainty. When the measurements aim at subnanometer accuracy levels, it is possible that the description of interferometer operation by paraxial and scalar approximations is not sufficient. Therefore, in this paper, we place emphasis on models based on nonparaxial vector beams. We address this challenge by proposing a method that uses the Huygens integral to propagate the electromagnetic fields and ray tracing to achieve numerical computability. Toy models are used to test the method’s accuracy. Finally, we recalculate the diffraction correction for an interferometer, which was recently investigated by paraxial methods.
© 2015 Optical Society of America
CorrectionsBirk Andreas, Giovanni Mana, and Carlo Palmisano, "Vectorial ray-based diffraction integral: erratum," J. Opt. Soc. Am. A 33, 559-560 (2016)
Soon after the invention of the laser, its applicability to unexcelled measurements of lengths and displacements by interferometry was recognized . Plane or spherical waves were assumed to relate the phase of the interference fringes to the measurand, but, as early as 1976, Dorenwendt and Bönsch pointed out that this is not correct and that diffraction brings about systematic errors . Since then, metrologists have routinely applied corrections based on scalar and paraxial approximations of the interferometer operations [3 –13]. In recent years, cutting-edge interferometric measurements carried out to determine the Avogadro constant are looking at relative accuracies over propagation distances of the order of 1 m and propagation differences of many centimeters. Consequently, metrologists are drawing attention to nonparaxial vector models, with a goal to verifying the validity of the scalar and paraxial approximations or to improving the correction calculation. Since these measurements use continuous-wave lasers, which are stabilized and have an extremely narrow bandwidth, monochromatic light propagation can be assumed.
Optical interferometers may combine many optical components in a potentially complex geometry. Wolf and Richards carried out early work in 1959 on diffraction in optical systems, but that work provided only a description of the image or focal region [14,15]. Among the tools based on the paraxial approximation, a common one is the Collins integral, which allows the electromagnetic field at the output plane to be calculated from the field in the input plane and the ray matrix of the optical system . Ray matrices have been also applied to propagate Gaussian beams, as well as higher-order Laguerre–Gaussian or Hermite–Gaussian beams, which are solutions of the paraxial scalar wave equation [17,18]. Finally, Fourier optics is used together with the thin lens approximation .
In regards to nonparaxial models, in 1980, Byckling and Simola  proposed a sequential application of spherical harmonic Green’s functions and the evaluation of the relevant diffraction integral by the stationary phase method, but their method is limited to the scalar approximation and to spherical interfaces.
Geometric optics and ray tracing allow nonparaxial propagation to be considered, but usually they do not include diffraction . Some aspects of diffraction can be caught by associating phase information, in terms of the optical path length, to each ray. Douglas et al. simulated the operation of an interferometer by ad hoc assumptions that allowed computability to be simplified, but this waived the exact description of diffraction . Improvements were made by Riechert et al., but at the cost of sampling issues and relatively large computation times .
There exist several commercial software packages, where the segments of an optical system are tackled by different propagation techniques, e.g., ZEMAX, GLAD, and LightTrans VirtualLab. The latter is described in detail by Wyrowski and Kuhn . While free-space propagation is handled by Fourier optics methods, segments with interfaces are modeled either by the thin lens approximation or the geometrical optics field tracing. At the interfaces, the input field is locally decomposed, assuming local plane waves propagating in a direction perpendicular to the wavefront. These plane waves define the launch directions of ray tubes, which are traced through the subsequent subdomain, taking the amplitude changes into account by Fresnel equations and actual geometrical convergence or divergence. However, because they do not interact at the output plane, diffraction is not modeled. For this reason, LightTrans VirtualLab does not continuously model diffraction throughout the optical system; therefore, it is not suited for our purposes.
A different approach, which is common in optical design and is the basis of at least two software packages, was proposed by Greynolds in 1985 . This approach is based on the input-field decomposition into Gaussian beams , which, in turn, are represented by base rays along with a set of characteristic parabasal rays. After tracing the base and parabasal rays, the beams are recombined to form the output field [27,28]. However, Gaussian beams do not form a complete orthogonal set of basis functions; the decomposition is not unique and produces artifacts in the output field. Moreover, this method is reliable only if the parabasal rays are paraxial everywhere, and it is not clear how, for instance, coma can be modeled by basis beams that, at most, describe astigmatism.
The method called stable aggregate of flexible elements (SAFE) relies on a similar but improved concept, which allows the estimation of its own accuracy. Alas, so far it has not yet been extended to 3D problems [29,30]. Recently, a quantum-mechanical method has been proposed, by using Feynman path integrals and stationary-phase approximation to link geometric and scalar wave optics [31,32].
When the measurements aim at subnanometer accuracy levels, it is possible that the description of interferometer operation by paraxial and scalar approximations is not sufficient. The software package exploits a method called beam synthesis propagation that might meet the case. However, to the authors’ knowledge, no detailed description of this method is available to carry out an uncertainty analysis when applied to compensate for phase delays due to diffraction in state-of-the-art dimensional metrology. Extensive investigations of the propagation of electromagnetic waves and diffraction can be found in the literature of the radio wave and microwave communities , where significant effort was put into the efficient calculation of large-scale antenna fields as well as of radar cross sections and where computationally efficient ray tracing techniques were developed [34 –43].
In order to provide an efficient, vectorial, and nonparaxial model of the operation of laser interferometers, as applied in length and displacement metrology, and to calculate the relevant diffraction corrections, we developed a ray-based method to integrate diffraction integrals (VRBDI, vectorial ray-based diffraction integral). An early scalar version of this method is described in . Although it was independently developed, the main concept is the same as proposed by Ling et al., who, in 1989, published a comprehensive ray method, later known as SBR-PO for shooting and bouncing rays and physical optics [39,43].
Ling et al. approximated the ray-based field on the detector grid locally by plane waves , although in 1988, they described the far-field contribution of a ray tube by an approximate solution of an integral taken on the ray-tube wavefront . In this paper, we calculate the field on the detector grid by using a ray aiming approach together with a local representation of the wavefront, which is based on matrix optics [16 –18,45] and differential ray tracing [34,46]. In , the input field was assumed to be a single plane wave. Therefore, a single set of parallel rays entered the investigated optical system. Here, we decompose the input field into numerous components, which are either spherical or plane waves and are represented by sets of divergent or parallel ray tubes, respectively. The output field is then obtained by an integral superposition of the traced components. Ling et al. applied the phase matching technique to describe the local curvature and, hence, the divergence assigned to a single ray . In , it is noted that an alternative description by differential ray tubes, which, in turn, consist of base and parabasal rays, does not deliver a correct description of the Gouy phase as, potentially, too many ray path crossings that are difficult to detect could not be considered. In Appendix B of this paper, it is shown how the correct Gouy phase can indeed be obtained from differential ray tubes without the tracking of ray path crossings. In addition to the perfectly reflecting surfaces considered in , we extended the material equations to include transparent dielectrics. Finally, much effort was put into a stepwise, i.e., surface to surface, integral method, for which numerical tests of energy conservation were carried out in order to check physical consistency.
The VRBDI method is described in Section 3. In Section 2, the exact treatment of vector diffraction theory in linear, homogeneous, and isotropic dielectrics is reviewed to set the basis for the VRBDI and a touchstone of its performance. Section 4 compares this more rigorous method to the VRBDI methods by applying them to a simple toy model. Finally, to illustrate the applicability of the VRBDI, it is used in Section 5 to simulate an interferometer setup. Furthermore, these results are compared to an earlier paper where paraxial methods were utilized to describe the same setup.
2. VECTORIAL DIFFRACTION THEORY
A. Derivation of Vectorial Diffraction Integrals
For the following treatment, the same boundary conditions as, e.g., in Chen et al.  are assumed. The derivation of the diffraction integrals follows the one found in a book by Smith . The harmonic time dependence is consequently omitted.
A monochromatic continuous electromagnetic wave in a linear, homogeneous, isotropic, and transparent dielectric is fully characterized by knowing the components of either the electric field or the magnetic field tangential to a plane. For the sake of simplicity, let this plane be the , plane of a Cartesian coordinate system and let as well as be initially known. Then the angular plane-wave spectrum can be obtained by Fourier transform :49,50]: 49] 19,48] 3) exists, which likewise allows the full characterization of the electromagnetic field, in the case that only and are initially known.
The following derivation is done solely for the electric field, as it is fully analogous to the one for the magnetic field. Setting Eq. (2) back into Eq. (4) and using the versors , , and of the coordinate system, one obtains5) can equivalently be written as  48,51]: 7) into Eq. (6) yields 49,52,53]. In the given references, it is derived in different ways for the open aperture in an infinite metallic (perfectly conducting) screen. In this special case, it is an exact solution [49,52,53]. However, since we assume the field on initially to be known, we have , , and . Therefore, in our case Eq. (9) is not an integral equation. Thus, it can be further simplified. But first, by means of simple coordinate transformations, the more general result
With the short notation , the versor , and, after pulling the curl operator into the integral, one obtains12) and (13) satisfies the Helmholtz equation, , where . It is worth noting that the kernels permit only field components orthogonal to .
Because we are interested in the simulation of physical light beams, we also request that the field on is a section of a field satisfying the Maxwell equations. Practically, this condition can only be met asymptotically, because a finite and discrete calculation window where the initial field is calculated by Eqs. (1 )–(4) from two given components involves a cut at the border of the calculation window and finite sampling resolution. However, with sufficient resolution and a large enough computation window, which encloses a square integrable field, the error can be made negligible.
A necessary but not sufficient check of physical correctness can be done by applying the law of energy conservation. For a time-averaged continuous wave, this implies conservation of power. The respective numerical tests will be treated after the next subsection. For these tests, we also require that the initial field is a restriction of a solution of the Maxwell equations to in order to calculate the initial power.
B. Propagation through an Interface
Although the boundary conditions for the initial field applied to Eqs. (12) and (13) require the definition on a plane, arbitrarily curved interfaces are assumed here. The impact of this approximation is checked later by numerical tests.
An alternative method, based on the Stratton–Chu diffraction integrals , was introduced by Guha and Gillen , but it lacked the possibility to test power conservation. We compared numerically the Stratton–Chu diffraction integrals with Eqs. (12) and (13) by checking power conservation. We found that Eqs. (12) and (13) ensure power conservation several orders of magnitude more accurately than the Stratton–Chu diffraction integrals. A thorough report of this issue will be given in a separate paper; for now, we do not see advantages in using the more complex Stratton–Chu diffraction integrals in our kind of propagation problems.
A sketch of the situation is shown in Fig. 1. For each versor , the respective reflected versor and refracted one are calculated by assuming for an infinitesimally small patch around the intersection point local planarity for the wavefront and interface. Furthermore, in transparent isotropic media, the Poynting vector, which describes the transport of energy by electromagnetic radiation [49,50],12) and (13) into Eq. (14), one finds that the local Poynting vector 49,50]: 16) is the reflection law, Eq. (17) is Snell’s law [49,50], and , 0, 1 for , , , is a sign operator. The surface normal can also be a local function on . For example, if is spherical, can be found by subtraction of the actual position from the sphere center and subsequent normalization: .
It is useful to define the reference frames , , and , with19), local planarity and direction conformity are exploited again. The Fresnel equations are exact only in the case of an infinite plane wave at an infinite plane boundary. Therefore, we expect some accuracy loss, which we will quantify by tests of power conservation.
It is convenient to express the field calculations at the interface by differentials and perform the integration afterward. With the quantities defined above, one can write24) and (25) can be deduced from Eqs. (21) and (22) by patient use of Eq. (3) and one or two vector identities. Now, the integrals can be expressed as , where .
The next step is the propagation from to , where Eqs. (12) and (13) can be used by replacing , , and with , , and , respectively. If an integration is performed on a curved interface, the surface element must be changed. For the plane, one has . To define a surface, the coordinate can be given by the 2D function . Then, the surface element can be calculated according to . For example, a hemisphere is uniquely described by
C. Numerical Tests of Power Conservation
The transport of energy by electromagnetic radiation is described by Eq. (14). Taking the scalar product with a surface normal yields the irradiance,26) from those grids. Therefore, as well as Eq. (27) are finite quantities, and the integrals are replaced by sums over all respective sampling points .1 )–(4). In the first tests, the refractive indices and are kept identical. Therefore, the intermediate surface is not a real interface, and a comparison to the direct propagation between the first and the last surface is possible. The respective quantities on obtained by the direct propagation receive the index . The first test comprises an arbitrarily oriented plane as an intermediate surface. Differently than shown in Fig. 2, two successive rotations about the and axes are applied to an , mesh to generate (Table 2, Test 1). The power conservation can be described by the relative errors:
The field values on from the direct propagation are presented in Fig. 3. For sake of brevity, in the main article only the electric field components are shown. The corresponding magnetic components can be found in Appendix C. It can be seen that the component of the electric field, , vanishes everywhere.
The field that is propagated over does not vanish exactly, as can be deduced from the relative deviations plotted in Fig. 4. However, the residual values are small and likely numerical artifacts. Because the directly propagated is exactly zero, the relative deviation for , as it is defined in Fig. 4, is formally up to 1. However, since, in absolute terms , it is of no physical relevance. The irradiance from the directly propagated beam and its relative deviation to the indirectly propagated one is presented in Fig. 5. One can see that the size of the calculation window is indeed amply chosen. The peak-to-valley (PV) deviation of the irradiances is found to be . One can state that this first test reveals the magnitude of the residual numerical errors because no systematic errors of the theory can be identified.
For Test 2, the tilted plane is replaced by a spherical surface while again . As mentioned earlier, in this case the found integrals in Eqs. (12) and (13) can only be approximations whose viability is tested now. The relative deviation of the field components is given in Fig. 6. Again, the component of the electric field is nonzero. Its absolute value as well as the relative deviation of the irradiance from the directly propagated one is shown in Fig. 7.
Now, with curved, the deviations can no longer be classified as numerical artifacts. These regular patterns of larger magnitude must be caused by systematic effects. By inspection of Eq. (12) and comparison to Test 1, it turns out that two reasons can be recognized as necessary and sufficient to explain at least the nonzero component shown in Fig. 7 (left):
- • The cross product term in Eq. (12) produces a component if the normal vector contains a component.
- • On the sphere surface, all normals are oriented differently.
The first reason is also true for the arbitrarily tilted plane. However, the second, which enables the (almost) utter cancellation, is not. For the sphere case, cancellation is not perfect due to the different normal orientations; thus, the residual regular pattern is allowed (Fig. 7, left). The regular deviation patterns for the other components are likely caused in an analogue way, i.e., by parasitic contributions, which did not cancel out because of the different normal directions and the vectorial nature of the integral kernels. However, the impact on the irradiance is still quite small: (Fig. 7, right). Since the residual irradiance error is quite symmetrically modulated around zero, upon integration the positive and negative parts almost cancel each other out. Therefore, the respective relative error of power conservation is even smaller:
Thus, in rotationally symmetric setups, relying on power conservation alone can potentially lull us into a false sense of security. This symmetry can be broken when is tilted, as is the case for the remaining tests.
Table 3 lists the power conservation errors for the Tests 3 (plane interface) and 4 (curved interface). The relevant test parameters are given in Table 2. Now, the refractive indices and are different and becomes a real interface. The largest error appears in the calculation of reflected and refracted power and is more pronounced with a curved interface. Strangely, the error for the propagation step to or from the curved interface is smaller than for the tilted plane case by 1 order of magnitude. This may be explained by the symmetry effect mentioned above. On the other hand, these values are already quite small and in the range of the numerical limit. Therefore, it is questionable if they are interpretable this way.
To check the influence of the sample intervals on the error, Test 4 is repeated for different sampling resolutions. Table 4 shows the results together with the needed calculation times on a personal computer with 64 bit MATLAB, 24 GB RAM and two Intel Xeon X5680 @3.33 GHz, i.e., altogether 12 cores, each working in parallel at 100% workload by use of MATLAB’s Parallel Processing Toolbox. Also, all computation times given later refer to the above hardware and software conditions. While and always are in the range of the numerical limit, decreases with increasing resolution to , which is likely a lucky value beyond the reachable numerical limit for this type of calculation in this setup, because the values for and pixels are larger than for pixels. Therefore, this numerical error appears to be in the range of , and it is reached at a certain resolution threshold, somewhere between and pixels.
The dependence on the sphere radius is also checked. In order to keep the calculation time low, the sampling resolution was chosen to be . The results are listed in Table 5.
Now, is always , while and are again governed by numerical noise.
The purpose of the last numerical test in this section is twofold. First, it checks whether the field in a focal plane can be calculated. Second, it delivers error values for a stronger refractive index contrast. The refractive index values are changed to and . Then, the paraxial focus is obtained at for . The results are shown in Table 6 for various sampling resolutions.
The values for and are again in the range of the numerical limit. Therefore, it would be daring to obtain significant trends from three data points. Comparison of the column at pixels to the column of Table 5 reveals for a strong proportionality to the refractive index contrast. Due to this scaling effect, the decreasing of with increasing resolution is now more resolvable. However, again there appears to be a limit in the range of . As mentioned earlier, the Fresnel equations are only exact for infinite plane waves at infinite plane boundaries. Thus, the violation of this condition may reveal itself here.
It could be shown that the above stepwise method describes the transport of radiation power through an interface between two transparent, isotropic, homogeneous, and nonmagnetic media with reasonable accuracy for quite large wavelengths, small curvature radii, and strong refractive index contrasts. However, it should be noted here that this method is not immune to undersampling, which can always prevent proper results. Due to the sampling requirements of the term in Eqs. (12) and (13), which have already been extensively discussed in the literature, e.g., see , the simulation of a macroscopic optical system with many (close-by) interfaces can easily inflate the computation time to astronomic scale. This consideration has led to a different approach that is not limited by the number and distance of interfaces in an optical system. This method is presented in the next section. Fortunately, it can be compared to the stepwise method in simple toy models, such as the ones shown above. This is done after the next section.
3. VECTORIAL RAY-BASED DIFFRACTION INTEGRAL
In this section, the vectorial ray-based diffraction integral (VRBDI) is presented. In order to increase readability, it is structured into several subsections. First, a brief introduction into the used geometrical ray tracing is given. Then, a method to get close to specific sampling points behind an optical system is briefly presented while the details can be found in Appendix A.
Afterward, the extension of ray tracing to gather all relevant information for the VRBDI is introduced while the details of the method, i.e., how matrix optics [16 –18,45] and differential ray tracing [34,46] can be used to calculate the field at the sampling grid points, have been moved to Appendix B. The VRBDI itself concludes this section.
A. Ray Tracing
Geometrical ray tracing is a common technique used in optical design software [21,50]. A ray is defined by position and versor . Then, the question as to how to scale the versor in order to intersect the next surface, i.e., , leads to an equation system that needs to be solved. One has to distinguish between sequential ray tracing, where the next surface is given by the next entry in a list, and nonsequential ray tracing, where the next surface has to be determined for each ray individually. The latter one prevents effective parallel implementation, i.e., the tracing of many rays at once rather than one after another. Therefore, we chose sequential ray tracing in order to implement a parallel computation.
A plane is fully characterized by pivot vector and normal . Then, the condition directly yields
When the intersection point is found, the reflected or refracted versor is calculated by Eq. (16) or (17), respectively. Then, the next ray is defined by the intersection point and the respective versor. If a ray intersection lies outside of allowed positions, i.e., outside an aperture, it is marked as terminated. The first and last surfaces are usually planes, the input plane and the detector plane, respectively.
B. Ray Aiming
A ray starting from the input plane of an optical system is defined by position and versor . There are two possible cases, depending on the chosen decomposition of the input field: which parameter is fixed and which one is free for aiming. For a plane wave decomposition, is fixed and the starting point on a plane, which is normal to and shares its origin with the input plane, is the free parameter. In contrast, for spherical wave decomposition, is a fixed sampling grid point and is free. Both cases are sketched in Fig. 8. Initially, a grid on the aperture plane defines the ray spacing.
By tracing test rays through the optical system, various starting conditions can be related to positions on the detector plane. A fit by Zernike polynomials  is then utilized to obtain a functional relation between the starting condition and the detector position. Then, it is possible to aim for specific points on the detector sampling grid. A detailed explanation of the used ray-aiming procedure and the description of a test tool for the reachability of detector positions and quality of the ray aiming is given in Appendix A. Generally, with this kind of ray aiming, the targeted positions on the detector cannot be reached exactly. There is always a smaller or larger displacement between the targeted grid point and the ray intersection. This displacement will also be important for the next subsection.
There are also situations where certain regions in the area of the detector cannot be reached by rays, while in other regions the relation between starting condition and position on the detector plane is not unique, i.e., where the different test rays intersect each other. This is the case at caustics, focal, and image regions.
C. Tracing the Electromagnetic Field
From each aimed ray, i.e., a ray that gets close to an assigned detector pixel and is found by ray aiming, a differential ray tube is constructed at an input position [34,46]. To each ray tube, an electric field contribution is assigned by21) or (22) and without using Eq. (20). After surfaces, is obtained. By use of Eq. (3), one also obtains the magnetic field contribution: 16,58] at the sampling grid points on the detector plane. A detailed description of the evaluation procedure, which utilizes matrix optics [16 –18,45] and differential ray tracing [34,46], is given in Appendix B. Finally, the field on a detector sampling grid point can be written as
The VRBDI itself is then given by24], is a numerical implementation of a diffraction integral. Therefore, all input field components interact with each other on the output plane according to the Huygens–Fresnel principle  or its equivalent description by plane waves . In Fig. 9, a flow chart of the full algorithm is shown. The irradiance can then be calculated by use of Eqs. (14) and (28).
4. COMPARISON BY A TOY MODEL
In this section, the VRBDI is compared to the stepwise propagation method of the second section. The second toy model from Fig. 2 is chosen, and the parameters are changed according to Table 7. Again, the input field is given by Eq. (30) with , , and using Eqs. (1 )–(4) for the other components. It is worth noting that the optimal conditions for the plane wave decomposition are reciprocal to the optimal conditions for the spherical wave decomposition. For instance, in Table 7 the input field for the plane wave decomposition is chosen on a wider spatial mesh (the number of pixels is maintained while the overall dimensions of the mesh are increased), which yields the needed higher spatial frequency resolution. In contrast, for the spherical decomposition, a sufficient spatial resolution is most important. Therefore, a denser spatial mesh is better.
The stepwise method is again checked for power conservation:
From the plane wave decomposition, 1069 components, i.e., plane waves, and from the spherical decomposition, 59,381 points were chosen for the VRBDI. In both cases, a relative threshold, for the spherical wave decomposition and for the plane wave decomposition, in relation to the strongest component, defines a cut-off level. The following calculation times are obtained for the three methods:
Obviously, choosing lower cut-off levels increases calculation time but, of course, also accuracy. Therefore, a trade-off between accuracy and computation time has to be found. However, most important is the proper choice of the decomposition type. In the presented case, VRBDI-PW is roughly 57 times faster than VRBDI-SW, which corresponds to the number of the respective input components: . The results are quite comparable, as can be seen in Fig. 10. However, some distinctions are observable: in the VRBDI-PW deviations from the stepwise method, one can see some numerical noise in the low-field regions.
The irradiance calculated by VRBDI-SW and its relative deviation from the stepwise method as well as the relative deviation of VRBDI-PW from VRBDI-SW are shown in Fig. 11. The latter is quite small, , and seems to be caused by sampling issues. Therefore, the uncertainty for both is represented by the relative deviation of VRBDI-SW from the stepwise method, which is . From the shape of the deviation, one can deduce that the beam spot generated by the stepwise method is slightly broader. By consideration of power conservation,12. Now, the numerical noise can be clearly seen.
5. INTERFEROMETER SIMULATION
In order to illustrate the versatility of the VRBDI method, it is applied to the calculation of the diffraction correction for an interferometer, as it was used to measure the diameter of silicon spheres [11 –13]. The parameters of the simulated setup are identical to the ones in , where a Gaussian beam-tracing algorithm, known as vectorial general astigmatic Gaussian beam (VGAGB) tracing, was utilized.
As a rule of thumb, when deciding which decomposition type for the input field is preferable, i.e., the one that needs the fewest components, it is suggested to consider two extreme situations:
- • A large collimated beam close to the first aperture can efficiently be composed from a few plane waves, VRBDI-PW.
- • A small divergent source far away from the first aperture can efficiently be composed from a few point sources, VRBDI-SW.
Following this rule, VRBDI-SW is chosen because the source, a Gaussian beam with waist radius , can be sufficiently sampled by pixels at its waist location, which is 683.6 mm away from the first interferometer component. Furthermore, a relative threshold of yields 793 components for each VRBDI run.
The resolution and size of the detectors were chosen to be pixels over . The interferometer is operated by wavelength shifting. As in , the full procedure is simulated here as well but only for the ideal configuration of the interferometer (no Monte Carlo simulations). For each measurement, i.e., for an empty and loaded cavity, indicated as E and L, respectively, seven interferograms , one for each wavelength, are captured and fed into the phase retrieval formula of . For each wavelength, three beam paths are considered: corresponding to zero, one, and two round trips in the empty or loaded gap. Therefore, VRBDI runs have to be performed. With Eqs. (14) and (28), the interferograms are obtained from the respective sum of the (complex-valued) fields from each path. In Fig. 13, the phase biases are plotted. We refer to bias as the difference between measured value and correct value of a quantity. Therefore, a correction to a measured value is the respective negative bias. The difference to the VGAGB calculation, , is shown in Fig. 14. Table 8 compares the resulting measurement biases for empty gap, loaded gap, and resulting sphere diameter from VRBDI-SW and VGAGB. These values are calculated directly from . Since an odd sampling was chosen, the central value already exists and needs not to be interpolated. The agreement is quite remarkable. By inspection of Fig. 14, one can see that this is only true in a relatively small central area, i.e., the paraxial regime, where the VGAGB method can be considered to be accurate. It should be noted here that the meaning of paraxial and nonparaxial depends on the desired accuracy. On that note, nonparaxial means that the paraxial approximation can no longer be applied because the desired accuracy is no longer met. For a metrologist working in the field of ultraprecision interferometry, the paraxial approximation can potentially be much more restrictive than commonly assumed. However, it is shown here that it is well justified for the desired accuracy at the extremal position of the interferogram phases in the described interferometer, i.e., the location where the interferograms obtained from this setup are usually evaluated. Dividing the calculation time of 16.6 h by 42 yields a mean calculation time per run of ca. 24 min; further division by 793 results in ca. 1.8 s per input field component.
The stepwise method, as it is introduced in the second section, represents to our knowledge the most rigorous and still practically computable propagation method applicable to the optical interface problem, where it can be successfully used to test other propagation methods such as the VRBDI.
To provide a more physically motivated and intuitive interpretation, we refer to the Huygens–Feynman–Fresnel principle [31,32]. Provided that the Huygens–Fresnel principle and the Maxwell equations [48 –50] are equivalent to the quantum-mechanical description of a single photon, the stepwise method solves the path integral problem for each segment of the optical system individually. It constitutes the full brute-force approach: for every point on the target surface, the coherent sum of all wavelets emerging from all possible input positions is calculated numerically, although spatial discretization is unavoidable.
Following this picture, the VRBDI would be an approximation whose main assumption is that the light interaction at the interfaces of an optical system can be approximately described deterministically. In this sense, the traced rays represent the different possibilities that remain when the influence of an aperture on the photon’s momentum uncertainty is neglected :40) is equivalent to the left-hand side, the cautious reader may consider this analogy only as a rude estimation. In any case, for a valid description of light propagation by the VRBDI, the apertures in an optical system should be much larger than the wavelength.
More generally, any confinement of photon trajectories should increase the momentum uncertainty. This could explain why focal regions cannot be described by deterministic and straight photon trajectories. The question as to whether it is possible to properly describe the situation by curved trajectories goes beyond the scope of this paper .
Beside these fundamental considerations, there are also technical issues that prevent the proper usage of the VRBDI in focal regions. However, by using Eqs. (12) and (13) or, alternatively, Eqs. (1 )–(4), one can easily close this reachability gap. Alternatively, it seems natural to assume that proper usage of respective angle-to-angle and point-to-angle eikonals  could also solve this issue. However, this has not been attempted by the authors so far.
For proper handling of small apertures, an intermediate field has to be sampled on the apertured surface. Approximately, by assuming Kirchhoff boundary conditions , the field in the aperture can simply be obtained by multiplication to the respective aperture function. Then, further propagation can be started from there.
In order to simulate the operation of laser interferometers, a vectorial ray-based diffraction integral method has been developed. The tangential electric components of a continuous monochromatic wave field on a plane, which is immersed in a transparent nonmagnetic homogeneous medium, are necessary and sufficient to obtain the other electromagnetic components by use of the angular plane wave spectrum. From the latter, vectorial diffraction integrals are deduced, which satisfy Maxwell and Helmholtz equations. By exploitation of the kinematic and dynamic boundary conditions of the Maxwell equations, a stepwise surface-to-surface propagation method is established, which is still a good approximation when the interface between two linear, transparent, nonmagnetic, and homogeneous media is curved.
For application of the VRBDI method, an input field for a general optical system is decomposed either into spherical waves, i.e., point sources, or plane waves. By means of ray aiming, an input position of a point source or, alternatively, an input angle of a plane wave can be related to a certain sampling position on a detector plane at the end of this optical system. Differential ray tubes, each consisting of an aimed base ray and four close, i.e., paraxial, parabasal rays, are launched and traced in parallel through the optical system to their respective sampling point on the detector plane. The optical path length and a ray matrix are obtained for each ray tube. By use of the respective paraxial eikonals and further adaptation of the diffraction integrals, the electromagnetic field at all sampling points is calculated and summed to all the other traced contributions from the decomposed input field. For the location of the detector plane, caustics as well as focal and image regions have to be avoided. However, propagation to these regions can be carried out by common free-space propagation methods.
Direct comparison of the VRBDI with the stepwise method in a simple curved interface toy model using power conservation as a criterion reveals that the stepwise method is most exact, but the VRBDI is a good approximation when the apertures are much larger than the wavelength. In the chosen toy model, the remaining error of the simulated irradiance is at the parts per million level, which should be almost unresolvable by experiment. By proper choice of the input field decomposition, the VRBDI is already faster than the stepwise method. Improper choice can lead to longer computation times, though, albeit this drawback applies only to simple systems, which can only just be modeled by the stepwise method. Macroscopic optical systems with many (closely spaced) surfaces cannot be modeled with the stepwise method in reasonable computation times.
The simulation of an interferometer, which was previously described by a paraxial Gaussian beam tracing method, showed very good agreement in the paraxial regime, whereas deviations between the VRBDI and the paraxial method at more distant locations from the optical axis already indicate the expectable failure of the paraxial method in the nonparaxial regime.
APPENDIX A: DETAILED DESCRIPTION OF RAY AIMING AND TEST TOOL
For the sake of brevity, only the ray aiming for the spherical wave decomposition case is explained here. Then, is a fixed sampling grid point and is free. More exactly, the continuous independent variables are and when is chosen to be dependent. When the intersection point on the detector plane is expressed in the local detector frame defined by the orthonormal set and pivot vector ,A2) can be approximated by proper interpolation functions. For this purpose, we chose Zernike polynomials . The definition space of the Zernike polynomials requires A4), the number of Zernike polynomials , i.e., the number of columns in , should match the number of detected ray intersections , i.e., the number of rows in . However, a least-squares solution for is readily obtained by  A3). Then, with and , new rays are traced that should “hit” the sampling grid points.
In order to evaluate the quality of the ray aiming and the reachability of the detector sampling grid points, a test tool has been developed. Figure 15 illustrates the test method. To the aperture plane a sampling grid is assigned, holding either zero for blocking or one for passing a ray. From each source point, whose intensity corresponds to the numbering in Fig. 15 (left), a ray is directed onto each of the nonblocking aperture grid points and traced to the detector. Then, a simple incoherent additive sampling is done on the detector grid.
For the central source point, i.e., point 9 in Fig. 15, ray aiming is carried out and the positioning error is evaluated according to
In Fig. 16, the result for a simple imaging system [distance from input () to lens middle plane], consisting of a biconvex spherical lens (focal length , diameter , thickness ) and a circular aperture (diameter ) directly in front of the lens, is shown at behind the rear lens surface. One can see that the nine source points (Fig. 15, left) yield separated spots. Thus, there is almost no overlap. The values for and given in Fig. 16 are quite different. For a good overlap, these values should become almost equal. The ray aiming error for the central source point is quite small. Figure 17 shows the same for different image distances and a larger aperture . Directly behind the lens () and far away () overlap and ray aiming are good, although the ray aiming for the larger aperture is not as good as for the small aperture. Close to the image plane at , there is insufficient to no overlap, and the ray aiming error becomes very large.
In order to illustrate the application to a different optical system and to show that the optimal overlap and ray aiming are not always close to or far away from the last lens surface, a second lens (, , ) is placed confocally to the image position of the first lens. The results for various distances are shown in Fig. 18. At , the ray aiming and overlap are better than at . For , both are poor.
At caustics, image, or focal regions, certain locations in the area of the detector cannot be reached by rays, while in other locations rays accumulate in vicinity to certain sampling coordinates, i.e., the overlap is poor. Therefore, the relation between starting condition and position on the detector plane is not unique, i.e., Eq. (A2) is no longer bijective. Use of Eq. (A6) yields distances much larger than or between and and gets poor.
One can conclude that the test tool enables one to find positions where the features, which are necessary for the calculation of a ray-based diffraction integral, i.e., overlap and hitting of sampling points, are established. Generally, it should now be clear that caustics as well as image and focal regions have to be avoided.
APPENDIX B: DETAILED DESCRIPTION OF FIELD TRACING
From each aimed ray, i.e., a ray that gets close to an assigned detector pixel and is found by ray aiming, a ray tube is constructed at an input position. Such a ray tube consists of the base ray, which is identical to the aimed ray, and four parabasal rays. The ray tubes are generated by either translating the base ray along orthogonal directions or by tilting the base ray around two orthogonal axes. The finite differences, i.e., the translation of magnitude or the tilt angle , should be chosen to be as small as numerically reasonable . The orthogonal directions can be obtained from the arbitrary orthonormal set , which is defined by
When the rays of a ray tube are traced through an optical system, they stay sufficiently paraxial to the base ray if or are chosen sufficiently small, so that the positions and directions of the output rays can be described by matrix optics [45,46]:18]. The coordinates and live in a plane orthogonal to the base ray , i.e., the base plane, while their slopes and are taken along . Obviously, and can be readily used to initially define the base plane. Thus, the orthonormal set defines the base ray frame, i.e., the coordinate system that is fixed to the respective base ray. However, when the base ray undergoes refractions or reflections its direction changes. Therefore, at every interface of the optical system, when the same rotation has to be applied to or . For example, , while ensures the right-hand chirality of the orthonormal set. In order to find , Rodrigues’ rotation formula  is used, which describes how a vector is rotated around an arbitrary rotation axis that is given by a versor . For the present problem, one can obtain by B5), which has to be prevented by explicitly setting in this case. There are two possible rotation directions, which have to be considered by probing with and taking the solution that minimizes , where is obtained by Rodrigues’ rotation formula :
While the initial ray vectors are simply given by 19 illustrates the problem of finding these intersection points. In analogy to Eq. (31), one can obtain the sought intersection points from any known ray positions and versors by 46]: 36).
However, the field in the base plane is not the sought quantity. Additionally, the opportunity to choose only one of four possible -combinations for the ray vectors implies an ambiguity. Figure 20 illustrates these problems and offers solutions. First, the nearby sample-grid point is expressed in the base ray frame:
Depending on the signs of and , the initial and final ray vectors are chosen as indicated by Fig. 20. Then, by use of Eq. (B15), the corresponding submatrices and or and are calculated. Further propagation into base ray direction can simply be described by
Therefore, onlyB20) holds also if , which corresponds to a backward propagation. For example, this is the case when lies on the other side of in Fig. 20 (left).
To properly include the well-known Gouy phase anomaly, which causes a shift at each astigmatic focus ( at a symmetric double focus) [39,63] and which was also confirmed experimentally , it is still necessary to modify or , respectively. It can be shown that
Eventually, a field contribution on a detector sampling grid point can be written as37) and (38) are 58].
APPENDIX C: MAGNETIC FIELD COMPONENTS
EMRP Joint Research Project (SIB08); European Union.
This work is a study for the EMRP Joint Research Project SIB08: Traceability of sub-nm length measurements (subnano). The EMRP is jointly funded by the European Union. The authors gratefully acknowledge funding by the EMRP.
1. J. N. Dukes and G. B. Gordon, “A two-hundred-foot yardstick with graduations every microinch,” Hewlett-Packard J. 21, 2–8 (1970).
2. K. Dorenwendt and G. Bönsch, “Über den Einfluß der Beugung auf die interferentielle Längenmessung,” Metrologia 12, 57–60 (1976). [CrossRef]
3. J.-P. Monchalin, M. J. Kelly, J. E. Thomas, N. A. Kurnit, A. Szöke, F. Zernike, P. H. Lee, and A. Javan, “Accurate laser wavelength measurement with a precision two-beam scanning Michelson interferometer,” Appl. Opt. 20, 736–757 (1981). [CrossRef]
4. G. Bönsch, “Wavelength ratio of stabilized laser radiation at 3.39 μm and 0.633 μm,” Appl. Opt. 22, 3414–3420 (1983). [CrossRef]
5. G. Mana, “Diffraction effects in optical interferometers illuminated by laser sources,” Metrologia 26, 87–93 (1989). [CrossRef]
6. A. Bergamin, G. Cavagnero, and G. Mana, “Observation of Fresnel diffraction in a two-beam laser interferometer,” Phys. Rev. A 49, 2167–2173 (1994). [CrossRef]
7. A. Bergamin, G. Cavagnero, L. Cordiali, and G. Mana, “Beam-astigmatism in laser interferometry,” IEEE Trans. Instr. Meas. 46, 196–200 (1997). [CrossRef]
8. A. Bergamin, G. Cavagnero, L. Cordiali, and G. Mana, “A Fourier optics model of two-beam scanning laser interferometers,” Eur. Phys. J. D 5, 433–440 (1999). [CrossRef]
9. R. Masui, “Effect of diffraction in a Saunders-type optical interferometer,” Metrologia 34, 125–131 (1997). [CrossRef]
10. G. Cavagnero, G. Mana, and E. Massa, “Aberration effects in two-beam laser interferometers,” J. Opt. Soc. Am. A 23, 1951–1959 (2006). [CrossRef]
11. N. Kuramoto, K. Fujii, and K. Yamazawa, “Volume measurement of 28Si spheres using an interferometer with a flat etalon to determine the Avogadro constant,” Metrologia 48, S83–S95 (2011). [CrossRef]
12. B. Andreas, L. Ferroglio, K. Fujii, N. Kuramoto, and G. Mana, “Phase corrections in the optical interferometer for Si sphere volume measurements at NMIJ,” Metrologia 48, S104–S111 (2011). [CrossRef]
13. B. Andreas, K. Fujii, N. Kuramoto, and G. Mana, “The uncertainty of the phase-correction in sphere-diameter measurements,” Metrologia 49, 479–486 (2012). [CrossRef]
14. E. Wolf, “Electromagnetic diffraction in optical systems I. An integral representation of the image field,” Proc. R. Soc. A 253, 349–357 (1959). [CrossRef]
15. B. Richards and E. Wolf, “Electromagnetic diffraction in optical systems II. Structure of the image field in an aplanatic system,” Proc. R. Soc. A 253, 358–379 (1959). [CrossRef]
16. S. A. Collins, “Lens-system diffraction integral written in terms of matrix optics,” J. Opt. Lett. 60, 1168–1177 (1970).
17. H. Kogelnik and T. Li, “Laser beams and resonators,” Appl. Opt. 5, 1550–1567 (1966). [CrossRef]
18. A. E. Siegman, Lasers (University Science Books, 1986).
19. J. W. Goodman, Introduction to Fourier Optics (McGraw-Hill, 1968).
20. E. Byckling and J. Simola, “Wave optical calculation of diffraction effects in optical systems,” Opt. Acta 27, 337–344 (2010). [CrossRef]
21. A. S. Glassner, An Introduction to Ray Tracing (Morgan Kaufmann, 1989).
22. N. G. Douglas, A. R. Jones, and F. J. van Hoesel, “Ray-based simulation of an optical interferometer,” J. Opt. Soc. Am. A 12, 124–131 (1995). [CrossRef]
23. F. Riechert, F. Dürr, U. Rohlfing, and U. Lemmer, “Ray-based simulation of the propagation of light with different degrees of coherence through complex optical systems,” Appl. Opt. 48, 1527–1534 (2009). [CrossRef]
24. F. Wyrowski and M. Kuhn, “Introduction to field tracing,” J. Mod. Opt. 58, 449–466 (2011). [CrossRef]
25. A. W. Greynolds, “Propagation of generally astigmatic Gaussian beams along skew ray paths,” Proc. SPIE 560, 33–50 (1985).
26. M. J. Bastiaans, “The expansion of an optical signal into a discrete set of Gaussian beams,” Optik 57, 95–102 (1980).
27. R. Herloski, S. Marshall, and R. Antos, “Gaussian beam ray-equivalent modeling and optical design,” Appl. Opt. 22, 1168–1174 (1983). [CrossRef]
28. J. Arnaud, “Representation of Gaussian beams by complex rays,” Appl. Opt. 24, 538–543 (1985). [CrossRef]
29. G. W. Forbes and M. A. Alonso, “The Holy Grail of ray-based optical modeling,” Proc. SPIE 4832, 186–197 (2002).
30. M. A. Alonso, “Ray-based diffraction calculations using stable aggregate of flexible elements,” J. Opt. Soc. Am. A 30, 1223–1235 (2013). [CrossRef]
31. A. V. Gitin, “Huygens-Feynman-Fresnel principle as the basis of applied optics,” Appl. Opt. 52, 7419–7434 (2013). [CrossRef]
32. R. P. Feynman, “Space-time approach to non-relativistic quantum mechanics,” Rev. Mod. Phys. 20, 367–387 (1948). [CrossRef]
33. R. Mittra, ed., Computer Techniques for Electromagnetics (Pergamon, 1973).
34. G. A. Deschamps, “Ray techniques in electromagnetics,” Proc. IEEE 60, 1022–1035 (1972). [CrossRef]
35. S. W. Lee, P. Cramer Jr., K. Woo, and Y. Rahmat-Samii, “Diffraction by an arbitrary subreflector: GTD solution,” IEEE Trans. Antennas Propagat. AP-27, 305–316 (1979).
36. R. Mittra and A. Rushdi, “An efficient approach for computing the geometrical optics field reflected from a numerically specified surface,” IEEE Trans. Antennas Propagat. 27, 871–877 (1979).
37. S. W. Lee, M. S. Sheshadri, V. Jamnejad, and R. Mittra, “Refraction at a curved dielectric interface: geometrical optics solution,” IEEE Trans. Microwave Theory Trans. MTT-30, 12–19 (1982).
38. S. W. Lee and P. T. Lam, “Correction to diffraction by an arbitrary subreflector: GTD solution,” IEEE Trans. Antennas Propagat. AP-34, 272 (1986).
39. H. Ling, R. C. Chou, and S. W. Lee, “Shooting and bouncing rays: calculating the RCS of an arbitrarily shaped cavity,” IEEE Trans. Antennas Propagat. 37, 194–205 (1989).
40. S. W. Lee, H. Ling, and R. C. Chou, “Ray-tube integration in shooting and bouncing ray method,” Microwave Opt. Technol. Lett. 1, 286–289 (1988). [CrossRef]
41. J. Baldauf, S. W. Lee, L. Lin, S. K. Jeng, S. M. Scarborough, and C. L. Yu, “High frequency scattering from trihedral corner reflectors and other benchmark targets: SBR versus experiment,” IEEE Trans. Antennas Propagat. 39, 1345–1351 (1991).
42. Y. Tao, H. Lin, and H. Bao, “GPU-based shooting and bouncing ray method for fast RCS prediction,” IEEE Trans. Antennas Propagat. 58, 494–502 (2010).
43. C. Y. Kee and C. F. Wang, “Efficient GPU implementation of the high-frequency SBR-PO method,” IEEE Antennas Wireless Propagat. Lett. 12, 941–944 (2013). [CrossRef]
44. B. Andreas, G. Mana, E. Massa, and C. Palmisano, “Modeling laser interferometers for the measurement of the Avogadro constant,” Proc. SPIE 8789, 87890W (2013).
45. A. Gerrard and J. M. Burch, Introduction to Matrix Methods in Optics (Wiley, 1975).
46. M. Harrigan, “General beam propagation through non-orthogonal optical systems,” Proc. SPIE 4832, 379–389 (2002).
47. C. G. Chen, P. T. Konkola, J. Ferrera, R. K. Heilmann, and M. L. Schattenburg, “Analyses of vector Gaussian beam propagation and the validity of paraxial and spherical approximations,” J. Opt. Soc. Am. A 19, 404–412 (2002). [CrossRef]
48. G. S. Smith, An Introduction to Classical Electromagnetic Radiation (Cambridge University, 1997).
49. J. D. Jackson, Classical Electrodynamics (Wiley, 1999).
50. M. Born and E. Wolf, Principles of Optics (Pergamon, 1964).
51. L. Mandel and E. Wolf, Optical Coherence and Quantum Optics (Cambridge University, 2008).
52. W. R. Smythe, “The double current sheet in diffraction,” Phys. Rev. 72, 1066–1070 (1947). [CrossRef]
53. A. Drezet, J. C. Woehl, and S. Huant, “Diffraction of light by a planar aperture in a metallic screen,” J. Math. Phys. 47, 072901 (2006). [CrossRef]
54. J. A. Stratton and L. J. Chu, “Diffraction theory of electromagnetic waves,” Phys. Rev. 56, 99–107 (1939). [CrossRef]
55. S. Guha and G. D. Gillen, “Vector diffraction theory of refraction of light by a spherical surface,” J. Opt. Soc. Am. B 24, 1–8 (2007). [CrossRef]
56. S. R. Ghorpade and B. V. Limaye, A Course in Multivariable Calculus and Analysis (Springer, 2010).
57. M. Hillenbrand, D. P. Kelly, and S. Sinzinger, “Numerical solution of nonparaxial scalar diffraction integrals for focused fields,” J. Opt. Soc. Am. A 31, 1832–1841 (2014). [CrossRef]
58. A. Walther, The Ray and Wave Theory of Lenses (Cambridge University, 1995).
59. R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics (California Institute of Technology, 2010), http://www.feynmanlectures.caltech.edu.
60. K. Y. Bliokh, A. Y. Bekshaev, A. G. Kofman, and F. Nori, “Photon trajectories, anomalous velocities and weak measurements: a classical interpretation,” New J. Phys. 15, 1–17 (2013).
61. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd ed. (Cambridge University, 2007).
63. T. D. Visser and E. Wolf, “The origin of the Gouy phase anomaly and its generalization to astigmatic wavefields,” Opt. Commun. 283, 3371–3375 (2010). [CrossRef]
64. J. P. Rolland, K. P. Thompson, K. S. Lee, and J. Tamkin Jr., T. Schmid and E. Wolf, “Observation of the Gouy phase anomaly in astigmatic beams,” Appl. Opt. 51, 2902–2908 (2012). [CrossRef]