## Abstract

Simulation of scattering from optical surfaces is usually based on Monte Carlo methods in which the bidirectional scattering distribution function (BSDF) of the optical surfaces are sampled randomly by many rays, resulting in long calculation times. In order to accelerate the simulation, a quasi-analytical phase space model is proposed. In this model, few rays are traced from the object and image space to the target surface to determine the illumination and acceptance areas in phase space, where these areas can be conveniently coupled simultaneously in the spatial and angular domain. Since no random sampling is involved in the phase space model, no statistical noise perturbs the result and the surface scattering simulation can be greatly accelerated. Additionally, due to the use of real raytracing, the phase space model removes the limitation of paraxial approximation, which usually limits the accuracy of deterministic stray light analysis models. Meanwhile, by the discretization of the optical surfaces into subareas, this new approach is able to model freeform surfaces with arbitrary geometries, and space-variant BRDFs can be applied for different subareas of the optical surface.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Since the invention of the first computer program to simulate light scattering in optical systems [1], the developers of such programs have always been fighting with the extensive numerical computations. Nowadays stray light analysis is implemented in most commercial optical design software, in which ray-based methods are primarily used and require the tracing of trillions of rays to yield useful information about the stray light level of a moderately complicated optical system. This large number of required rays results from the nature of light scattering, which is a stochastic ray-splitting process with a large amount of possible ray paths compared to conventional deterministic raytracing. While large computational power is consumed to trace rays along all these ray paths, most of the ray paths do not reach the detector and therefore do not contribute to useful information about the stray light level, resulting in large computational redundancy. To solve this problem, a technique called important sampling has been widely applied in Monte Carlo methods to guide the scattered rays to the target surfaces rather than spreading them over the whole hemisphere [2]. But in order to apply important sampling, the user needs to prespecify the important ray trajectories along which the rays should be guided, which is a non-trivial task especially for complicated systems. Another method to reduce the computational cost is backward raytracing, which is advantageous for wide-angle systems with much larger chief ray angles in the object space than in the image space [2].

Apart from the above methods, efforts have also been made to develop deterministic models for stray light analysis [3,4,5]. For example, Peterson proposed an analytical model that connects the stray light contribution from a single surface with its refractive index and marginal ray height [4], which is extremely simple but significantly limited by the paraxial approximation and the assumption of zero chief ray height. Greynolds reported a method based on bidirectional raytracing [5], in which the light scattering is evaluated in the overlapping area between the beamlets from the object and image, but unfortunately the toolbox is closed source and the details are not available.

In our previous work, we developed a phase space model for the analysis of the autofluorescence effect in microscope lenses [6]. In this method, we discretize a lens into several thin z-slices and trace a small number of paraxial rays from the object and image planes to the target slice to obtain the key parameters that determine the stray light contribution from this slice. The raytracing data gives the irradiance, the illuminated area of the slice, as well as the acceptance area and angle as seen from the detector, based on which we are able to obtain the autofluorescence contribution from this slice by calculating the phase space overlap of the illumination and acceptance beam. This model is deterministic and provides fast and accurate calculation of the autofluorescence contribution from each element of the optical system [7].

In this paper, we propose an extension of the phase space model to analyse surface scattering in optical systems with arbitrary geometries. In the extended method, instead of discretising the volume of lenses into slices, we divide the scattering surface into small subareas, where the phase space overlap of the illumination and acceptance regions can be evaluated based on real raytracing data. By the implementation of bidirectional real raytracing, this extended phase space model can very accurately simulate first order stray light or single scattering effects, while the calculation time can be reduced by orders of magnitude due to the very small number of rays that need to be traced compared to Monte Carlo methods.

## 2. Models and methods

#### 2.1 Physical model

The main idea of the phase space model is to trace only the rays that determine the key parameters for stray light generation. Figure 1 shows a small subarea on a reflective surface as well as the parameters that determine its stray light contribution. In Fig. 1, the red ray indicates the illumination rays from the object, which determines the amount of energy received by the subarea *dS* of the optical surface, while the ratio of the scattered energy to the received energy is determined by the BRDF of the surface together with the solid angle *dΩ* within which the scattered rays can reach the detector.

Based on the BRDF of the surface and the solid acceptance angle of acceptance *dΩ*, the ratio of the scattered energy that reaches the detector over the incident energy in a surface subarea can be calculated by integrating the BRDF over the solid acceptance angle *dΩ*:

*ϕ*

_{i}is the flux of the incident light and

*θ*are the deflection and azimuthal angles of the incident and scattered rays.

_{i}, φ_{i}, θ_{s}, φ_{s}According to the definition of the BRDF, when the incident angle is fixed, the BRDF becomes a 2D function of *θ _{s}* and

*φ*. The angles of the rays from the four corners of the detector then form a quadrangle in the

_{s}*θ*space, in which the integration of Eq. (1) can be calculated by integrating the 2D scattering distribution function in this quadrangle. Furthermore, for optical systems with a small field of view (FOV), or when considering a single pixel on the detector, the solid angle

_{s}- φ_{s}*dΩ*is usually so small that the value of BRDF is considered to be constant within the acceptance angle. In this case, the integration of Eq. (1) can be further simplified as:

*E*is the irradiance of the incident light in the surface subarea,

_{sub}*S*is the area of the subarea of the optical surface,

_{sub}*S*is the area of the pixel in the detector intermediate image and

_{pixel}*η*cos

_{c }= BRDF(θ_{i},φ_{i},θ_{s},φ_{s}) ·*θ*is the coupling coefficient that corresponds to the fraction of the incident flux that is scattered to the pixels.

_{i}· dΩ#### 2.2 Sampling method

According to the above discussion, it is more advantageous to divide the optical surface into equal-area subareas (isoenergetic sampling). The type of grid used for surface discretization depends on the geometry of the surface. For example, a Cartesian grid is always isoenergetic, but it is not able to accurately describe curved boundaries, and therefore is usually used to describe non-circularly symmetric surfaces. On the other hand, a special isoenergetic polar grid is used to describe the circularly symmetric surfaces with the smallest number of sampling points. Figure 2 shows an optical surface with a Gaussian irradiance distribution, as indicated by the colour code. As can be seen from the red mesh, an isoenergetic polar grid with only 7 rings (49 subareas) is enough to describe this distribution, and the small number of subareas used for surface discretization greatly reduces the number of rays that need to be traced in the phase space simulation.

Similarly to the sampling of the optical surfaces, the sampling of the entrance pupil should also be isoenergetic. In this case, the Cartesian grid is again not the best option because it induces artefacts due to the Moiré effect, especially when Cartesian discretization of the surface is applied. Therefore, it is better to sample the entrance pupil on a Fibonacci grid, which avoids the artefacts while guaranteeing uniform sampling. The details of the Fibonacci sampling grid are discussed in Appendix A.

For both Cartesian and isoenergetic sampling of the surface, the sampling density is dependent on the uniformity of the distribution of irradiance and solid acceptance angle. If both quantities are uniformly distributed on the surface, only a few subareas are enough to characterize the scattering property of the whole surface. The uniformity of the irradiance distribution depends on the apodization, vignetting and aperture angle of the ray bundle inside the optical system. Additionally, the projection effect of large field angles also gives rise to non-uniformity of the irradiance distribution. In general, for aperture-dominant systems with a small FOV and a large aperture (e.g., typical telescope systems), the irradiance distribution is rather uniform throughout the system and dense sampling of the target surface is usually not required. A quantitative evaluation of the impact of sample density on the accuracy is given in Sec. 4.

The discretization of optical surfaces also allows us to apply space-variant BSDFs to different surface locations, which is useful when the inhomogeneity of residual surface errors should be considered. As shown in Fig. 2, the stray light contribution from a surface subarea is calculated based on the irradiance and BSDF at this surface location.

#### 2.3 Calculation methods of the acceptance angle

The distribution of the solid acceptance angle *dΩ* on a surface is more complicated than the irradiance distribution. In this section, three methods to calculate the distribution of the solid acceptance angle are proposed and their principles are explained, while a detailed comparison of the methods will be given in Sec. 4 after the discussion of the two examples.

### 2.3.1 Intermediate image method

The first method is based on the intermediate image of the detector seen by the optical surface. From the perspective of imaging, the acceptance angle of a subarea on an optical surface is the solid angle subtended by the intermediate image of the detector as seen from this subarea. The intermediate image can be found by tracing the paraxial marginal and chief rays from the detector to the optical surface. Once the position and size of the intermediate image are found, the solid angle subtended by the intermediate image for each subarea can be analytically calculated [8]. Moreover, since the solid angles subtended by the pixels are usually very small due to their small sizes, the acceptance angle of each pixel can be conveniently calculated by dividing the projected area of the pixel intermediate image by the square of the distance between the surface subarea and the pixel intermediate image.

However, in real optical systems, the intermediate image of the detector that a surface subarea sees usually deviates from the paraxial intermediate image. The reason is that real imaging optical systems are usually imperfect due to the presence of aberrations, which result in a distorted, blurred, or curved intermediate image of the detector. What further complicates the problem is that different subareas on an optical surface correspond to different pupil coordinates, and most aberrations except for distortion vary with the pupil coordinates, meaning that each subarea on the optical surface sees a unique aberrated intermediate image determined by the corresponding pupil coordinates of the subarea. Therefore, in order to determine the exact position of a pixel in the aberrated detector intermediate image, we need to consider the pupil coordinates defined by the distance of the subarea to the chief ray, as well as the field coordinate defined by the position of the pixel in the detector. Figure 3 illustrates the relationship between the pupil and field coordinates with the corresponding subarea and pixel locations. Once the pupil and field coordinates are determined, the pixel position shift can be calculated from the transverse aberration. There are multiple ways to determine the amount of aberrations that the intermediate image suffers from, but the most efficient method is to calculate the Seidel sums of the portion of optical system between the target surface and the image plane. Based on the Seidel sums, the transverse aberration of a pixel can be calculated as follows:

*W*is the Seidel representation of the wavefront error,

*y*is the absolute pupil coordinate,

_{p}*R*is the radius of the reference sphere and

*n*is the refractive index in the intermediate image space.

There are several advantages in using the Seidel sums, the first being that they can be easily calculated from the surface parameters or by tracing a few primary rays, which makes this approach extremely efficient. The second advantage is that the field dependence of the aberrations is explicitly given, therefore we only need to calculate the Seidel coefficients for one field and the aberrations for other field points can be directly obtained. The disadvantage of the Seidel representation is that it does not consider higher order aberrations, which leads to errors when those aberrations are present. In principle, for well corrected circularly symmetric systems, it is usually sufficient to consider only the primary aberrations since the coefficients of high order aberrations are typically orders of magnitudes lower. But in a sophisticated system involving freeform surfaces, it might be necessary to consider the impact of higher order aberrations on the detector intermediate image. In this case, we can use the Zernike coefficients to represent the aberrations in the intermediate image space, but the calculation of the Zernike coefficients requires denser pupil sampling and the tracing of more rays, therefore being less efficient. Additionally, the scalability of the Zernike coefficients with field should be handled with care because different orders of the Zernike coefficients can have different field-dependence.

The impact of aberrations on the detector intermediate image largely depends on the type of aberration. As mentioned above, the simplest case is distortion because it is independent of the pupil coordinates, meaning that if only distortion is present, every subarea of the optical surface sees the same distorted intermediate image of the detector. In this case, the positions of the off-axis pixels are shifted radially, inducing a radial variance of the pixel size that must be considered when the solid angles subtended by the pixels are calculated. Figure 5(f) shows the distorted detector grid in the presence of distortion.

Besides distortion, other aberrations vary with the pupil coordinates. Figure 4 shows the intermediate image seen by the surface subareas in the presence of spherical aberration, coma, and astigmatism, and Fig. 5 shows the distorted detector grid on the intermediate image plane. The blue and green arrows in Fig. 4(a) indicate the intermediate images of the detector seen by an on-axis point *O* and off-axis points *E* and *F* in the presence of spherical aberration, while Fig. 5(a) shows the distorted detector grid seen by point *F*. From Fig. 4(a) and Fig. 5(a) we can see that spherical aberration shifts the pixel centers by a constant distance due to the fact that primary spherical aberration is field-independent. In contrast to spherical aberration, the impact of coma and astigmatism is field-dependent and varies for the subareas in the tangential and sagittal directions. As shown in Fig. 4(b), the three arrows *O’P*, *O’S*, and *O’T* indicate the intermediate image seen by on-axis point *O*, sagittal points *C* and *D*, and tangential points *E* and *F* in the presence of coma, and the distorted detector grids seen by the tangential point *E* and sagittal point *C* are shown in Figs. 5(b)–5(c), from which we can see that coma distorts and enlarges the detector grid. While both coma and spherical aberration do not bend the planar intermediate images, astigmatism and Petzval curvature do. As shown in Fig. 4(c), the red and green arrows indicate the curved intermediate image seen by the tangential points *E* and *F*, and sagittal points *C* and *D* in the presence of astigmatism, and we can see that the radii of curvature and sizes of the curved intermediate images are different for the tangential and sagittal points, so the intermediate images seen by the subareas have toroidal shapes. Figures 5(d)–5(e) show the distorted detector grid in the presence of astigmatism and field curvature.

The impact of aberrations on the intermediate image has usually been ignored in previous approaches based on inverse raytracing [3,7], but in reality the impact can be too large to be neglected because the shift of pixel positions due to aberrations induces errors in the calculation of the solid angles subtended by pixels as well as the values of the BRDF in the scattering directions. In general, if the shift of pixel position is larger than the width of a pixel, the impact of aberrations should be considered.

### 2.3.2 Inverse raytracing method

The above-mentioned method to calculate the acceptance angle based on the detector intermediate image can be applied to circularly symmetric systems without large high order aberration contribution from its elements. However, for systems without circular symmetry, or if high order aberrations are of concern, the intermediate image method may be difficult to implement, or the efficiency can be greatly decreased.

A more general method to calculate the acceptance angle is based on inverse real raytracing. In this method, rays are traced from several points on the detector into to the optical system. Similarly to what is shown in Fig. 1, the cone angle subtended by the inversely traced rays in a surface subarea gives the solid acceptance angle. Compared to the intermediate image method, the inverse raytracing method is more robust, but sacrifices efficiency since rays need to be traced from all corners of the pixels in order to calculate the irradiance distribution on the detector. Therefore, the inverse raytracing method can only be used to calculate the total scattered flux rather than the irradiance distribution on the detector.

### 2.3.3 Hybrid method

In order to alleviate the limitation of the inverse raytracing method in the calculation of the irradiance distribution, we propose a hybrid method which combines the inverse raytracing method with the intermediate image method.

The relationship between the hybrid method with the inverse ray tracing and intermediate image methods is depicted in Fig. 6. From the flowchart in Fig. 6 we can see that the first step of the hybrid method is the same as the inverse ray tracing method, in which we inversely trace several ray bundles from the centers and corners of the detector to the optical surfaces. Based on the inverse ray tracing result, we calculate a linear transformation matrix of the detector grid based on the intermediate image locations of the center and corners of the detector seen by the subareas of the optical surface. Subsequently, the linear transformation matrix is applied to distort the detector grid, based on which the subarea-pixel coupling algorithm of the intermediate image method is applied to calculate the contribution of each surface subarea to the stray light intensity in each pixel.

In Fig. 6, the blue, red, and green colors indicate the simulation steps implemented in the intermediate image, inverse raytracing, and hybrid methods. From the flowcharts we can see that the main difference between the three methods is the calculation of the detector intermediate image and the coupling between the subareas on the optical surface and the detector. Furthermore, we can also see that the intermediate image method is different from the inverse raytracing method in every step except for the calculation of the irradiance on the optical surface, while the hybrid method inherits inverse real raytracing from the inverse raytracing method and the coupling between the subareas and pixels from the intermediate image method.

The hybrid method is able to effectively calculate the stray light distribution on the detector for systems with arbitrary geometry, but due to the application of linear transformations to obtain the distorted detector grid, only the linear dependence on the field coordinate such as tilt, linear coma, or field-independent spherical aberration can be accurately modelled. The impact of aberrations with higher order dependence on the field coordinate are not accurately described, this being the main limitation of the accuracy and robustness of the hybrid method. Further details about the hybrid method will be discussed in Sec. 4.

### 2.3.4 Field windows

Besides aberrations, special care must also be taken with surfaces where the ray bundles are strongly separated, because subarea of such surfaces cannot see the entire intermediate image due to the truncation of light by the apertures downstream from the optical surface. An example of this special case is shown in Fig. 7, which shows a schematic sketch of a retrofocus lens with inversely traced ray trajectories. Here we can see that the inversely traced off-axis ray bundle is vignetted by two elements and is strongly separated from the on-axis ray bundle on the front element L1. Consequently, the surface subareas on L1 are only able to see a part of the detector through the apertures downstream from L1. The intermediate images of these apertures act as field windows through which the surface subareas on L1 see the detector. As shown in Fig. 7, the field windows for the surface subareas on L1 are the intermediate images of L2, L3, and the stop, which coincide with themselves if we neglect the small focal power of L2. For the two points *O* and *B* on L1, the parts of the detector that the two points can see are determined by the overlap of the projections of all field windows on the detector intermediate image plane through *O* and *B*, as indicated by the orange and pink arrows in Fig. 7. Therefore, for such surfaces displaying ray bundle separation, the field windows must be considered in the calculation of the solid acceptance angle subtended by the detector.

#### 2.4 Surface scatter model

Optical surface errors are usually modelled in the frequency domain by the power spectral density (PSD). An exact analytical form of the PSD is usually not available, but effective approximated forms have been proposed by stray light analysts. These approximated representations include the most computationally convenient Gaussian distribution [9] and the more accurate K-correlation or ABC model [10]. In this paper, we apply the K-correlation model to describe the surface errors.

The surface PSD is related to the BRDF of the optical surfaces by the surface scatter models, among which the Rayleigh-Rice and Harvey-Shack models are most commonly used. Both models give accurate results when the smooth surface approximation is fulfilled and when the correlation length of the surface error is relatively large (small-angle scattering dominant). However, for surfaces with short correlation length, the Rayleigh-Rice theory fails to predict the accurate BRDF for large incident and scattering angles. Therefore, here we apply the generalized Harvey-Shack (GHS) theory to predict the BRDF of optical surfaces based on the PSD [9]. Because the values of the BRDFs are directly calculated based on the relative locations between the surface subarea and pixel, no iterative random sampling of the BRDF is needed. This is another advantage of the phase space model over the Monte Carlo method. For the Monte Carlo method, the integral of the BRDF predicted by the GHS theory cannot be analytically written, and so direct random sampling of the cumulative probability distribution function is not possible. Therefore, in the Monte Carlo method, BRDFs predicted by the GHS can only be implemented by iterative random sampling of the BRDFs, which requires many iterations to generate a scattered ray. For small-angle scattering, the iterative algorithm can be even more time consuming than the raytracing itself.

## 3. Simulation results

To evaluate the properties and check the accuracy of the phase space model, test calculations have been performed for a circularly symmetric Ritchey–Chrétien telescope (RCT) and a Kirkpatrick-Baez telescope (KBT) with grazing incidence [11]. For the RCT, we apply the intermediate image method to calculate the point source transmittance (PST), defined as the scattered irradiance distribution on the image plane. For the KBT with no circular symmetry, we apply the inverse real raytracing method to calculate the total scattered flux, while the PST is calculated by the hybrid method.

#### 3.1 Ritchey–Chrétien telescope

Figure 8 shows the layout of the Cassegrain type Ritchey–Chrétien telescope (RCT) composed of two hyperbolic mirrors and a field lens. In the simulation, the impact of all primary aberrations on the detector intermediate image is considered and the aberrations are represented by the Seidel sums. For M2, the dominant aberrations in the detector intermediate image are the distortion and field curvature induced by the field lens, while for M1 the additional coma contributed by M2 also plays an important role.

Before the rays are traced, we divide M1 and M2 into 289 subareas on an isoenergetic grid with 17 rings, as shown in Fig. 9, from which we can see that the irradiance is almost uniformly distributed on the two mirrors except for the central obscuration.

The next step of the simulation is to trace rays from the image plane to the target surfaces to determine the intermediate image of the detector seen by the subareas on M1 and M2. In this process, a paraxial marginal ray is traced to determine the location of the intermediate image, after which the paraxial image heights of the pixel centers can be obtained by tracing paraxial chief rays from every pixel center to the intermediate image plane or more conveniently by scaling the detector grid with the magnification.

After obtaining the paraxial intermediate detector image, the shifts of the paraxial pixel points are calculated as described in Sec. 2.3.1. The relative positions between the surface subareas and detector pixels determine the field and pupil coordinates, which in turn define the transverse aberrations, from which the pixel shifts are calculated. Additionally, the actual areas of pixels in the intermediate image are calculated based on the distorted detector grid.

The final step of the simulation is to calculate the coupling coefficients between the surface subareas and pixels by Eq. (3). In order to do so, the scattering angles, the BRDF values, and solid acceptance angles need to be calculated for every subarea-pixel combination in the corresponding local coordinate systems. The calculation seems to be rather complicated because there are many subarea-pixel combinations, but the algorithm can be vectorized to accelerate the calculations. As an example, Fig. 10 shows the coupling coefficients of two tangential subareas on M1 and M2 with the pixels in the corresponding detector intermediate image. From Fig. 10 we can see that the peak of the coupling coefficient coincides with the specular reflection, and we can observe the distorted edges of the intermediate detector image due to aberrations of the field lens and M2.

After the coupling coefficients for all subareas and pixels are calculated, the PST is obtained by summing up the contributions from all subareas. Here we consider a wavelength of 632 nm and assume M1 and M2 to be smooth surfaces (*σ _{s}* = 0.0015 µm), which results in a small total integrated scattering (TIS) of 0.09%. Additionally, for the K-correlation model, the slope factor of the PSD is set as

*s*= 2, and the correlation length of the surface error is 30 µm, meaning that small-angle scattering is dominant. Based on these surface parameters, the PSTs of M1 and M2 are calculated and shown in Fig. 11, from which we can see that the PST of M2 has a sharper peak than that of M1. This is because the detector intermediate image seen by M2 is nearly four times as large as that seen by M1 and thus the acceptance angles of the pixels are much larger for M2. For the same reason, the total scattered flux from M2 to the detector is twice as large as that from M1, indicating that the M2 is more critical for stray light. This corresponds to the conclusion that elements with smaller marginal ray heights are more critical for stray light [6,4].

To validate the simulation results, we have also calculated the PSTs with the Monte Carlo method modelled by non-sequential raytracing in OpticStudio. In order to implement the BSDF modelled by the K-correlation model and the GHS, we have written an external DLL file that is called by OpticStudio to generate scattered rays.

Figure 12 shows the 1D cross-sections of the PSTs of M1 and M2 calculated by the phase space model and Monte Carlo raytracing. Here we can observe that the signal to noise ratio (SNR) of the result calculated by the Monte Carlo method is much lower than that of the phase space model due to inadequate number of rays that have been traced. Apart from that, a good agreement between the results of the two methods can be observed.

While a good agreement between the phase space and the Monte Carlo methods is observed, the greatest difference lies in the runtime. For the results shown in Fig. 12, the Monte Carlo method traces 2.8 × 10^{10} rays and takes 3.4 hours, while the phase space model only traces 1×10^{5} rays and takes 5.5s to calculate the PSTs of both mirrors with an even higher SNR based on a computer with two Intel Xeon E5-2690 v4 CPUs (16 cores @ 2.60 GHz). Therefore, the phase space model achieved an acceleration ratio of more than 2000 compared to the Monte Carlo method. Furthermore, if the optical system contains more surfaces, or if finer discretization of the detector is required, the advantage of the phase space model in speed and accuracy is more pronounced. Meanwhile, it should also be noted that in this simulation we have discretized the optical surfaces by an isoenergetic mesh with 17 rings, which is much more than enough to obtain an accurate result. In reality, an isoenergetic grid with 5 rings will yield visually identical results as shown in Fig. 12 and takes only 2s to complete the simulation. A more detailed discussion on the accuracy, discretization density and runtime of the intermediate image method is found in Sec. 4.

#### 3.2 Kirkpatrick-Baez telescope

The Kirkpatrick-Baez telescope (KBT) design is widely used in X-Ray optics which requires grazing incidence to obtain high reflectivity. As illustrated in Fig. 13, a KBT is composed of two toroidal mirrors placed perpendicular with each other. Therefore, circular symmetry is completely broken and both surfaces introduce significant amounts of astigmatism, making the intermediate image difficult to apply. Therefore, here we use the inverse raytracing method and the hybrid method to calculate the total scattered flux and the PSTs of M1 and M2.

The surface scattering model used in this example is the same as that used in Sec. 3.1. Again, we apply the GHS theory to predict the BRDF due to its accuracy for large incident and scattering angles. Here an incident beam with a short wavelength of 10 nm is considered, and the correlation length of surface error is 100 nm. Therefore, wide-angle scattering from high spatial frequency surface roughness is dominant.

Due to the loss of circular symmetry, a Cartesian grid with 80×80 sampling points is used for surface discretization. The entrance pupil is sampled by a Fibonacci grid with 10^{5} points, from which the incident rays are traced to the two mirrors. As shown in Fig. 14, the forward raytracing results give the irradiance distribution and the incident angles in each surface subarea.

Subsequently, four sets of rays are traced from the corners of the detector back into the optical system, and the solid angle between the rays from the four vertices gives the solid acceptance angle *dΩ* for each subarea of the two mirrors, as shown in Figs. 14(c)–14(d). From these two figures we can see that the value of *dΩ* is not uniformly distributed on each mirror, which mainly results from the large astigmatism introduced by the cylindrical surfaces and the projection effect due to grazing incidence. It is also evident that for both mirrors *dΩ* is rather small due to the small FOV, but here the values of BRDF cannot be assumed to be constant within *dΩ* because the BRDF is very sensitive to the scattering angle under grazing incidence. Therefore, the exact integration according to Eq. (1) must be calculated by numerical methods.

Finally, by summing up the scattered flux from all the subareas of each mirror, we obtain the contribution of each mirror to the stray light on the detector, as shown in Fig. 15. The large contribution from M2 mainly results from its larger solid acceptance angle. To validate these results, the Monte Carlo method is again used for comparison, as indicated by the orange bars in Fig. 15, from which we observe a good agreement between the two methods. The overall calculation time of the phase space model is 48 s which includes the tracing of 5×10^{5} rays and data processing, while the Monte Carlo method traces 8×10^{8} rays and takes 720 s. Here the acceleration ratio is not as large as that in Sec. 3.1 due to two reasons, the first being that processing the raytracing data is time-consuming when fine surface sampling is used, and the second being that only a small number of rays need to be traced in the Monte Carlo method since only the total scattered flux is calculated. However, here we have used a rather fine sampling grid to generate the figures with higher resolution for clearer demonstration, while in practice a coarser sampling of 10×10 is enough to generate similar results, as shown by the green bars in Fig. 15. By comparing the green and blue bars in Fig. 15, we find that the coarse sampling grid yields virtually identical results to the dense sampling grid, while shortening the runtime to 3 s.

Additionally, as shown in Fig. 16, the PSTs of the two mirrors are calculated by the hybrid method. As a validation of the hybrid method, Fig. 17 shows the PSTs calculated by the Monte Carlo method. Comparing Fig. 16 and Fig. 17, we find that the hybrid method yields similar results to the Monte Carlo method, which can also be seen from the cross-sections shown in Figs. 17(c)–17(d). The two dark holes in Figs. 17(a)–17(b) result from the element used to block the specular rays in the non-sequential raytracing. They are not seen in Fig. 16 because the scattered and specular rays can be directly decoupled in the phase space model. Additionally, from the relative RMS deviation of the results shown in Fig. 17, we observe a larger deviation for M1, for which there are mainly two reasons. The first is that the detector acceptance angle on M1 is very small, resulting in low SNR of the result calculated by the Monte Carlo method, and the second reason is the large astigmatism contribution from M2. Since astigmatism is not linearly dependent on the field coordinate, the distortion of the detector grid due to astigmatism cannot be accurately modelled by the linear transformation matrices used in the hybrid method. Therefore, if large aberrations which are not linearly dependent on the field coordinate are present in the detector intermediate image, modelling the detector grid distortion by linear transformations induces errors, decreasing the robustness of the hybrid method.

Furthermore, from the PSTs shown in Fig. 16, we observe that the axes of symmetry of the two PSTs, which correspond to the planes of incidence, are perpendicular to each other as a consequence of the perpendicular orientations of the two mirrors. Furthermore, we observe that the peaks of the PSTs do not coincide with the specular direction, which results from the fact that the peak of the BRDF does not coincide with the specular reflection for large incident angles as described by the GHS theory [9].

As expected, the hybrid method shows great advantage in terms of runtime. The calculation of the PSTs shown in Fig. 16 takes only 7 seconds for both mirrors when using a coarse 5×5 Cartesian grid for surface discretization. For this coarse surface discretization, 50000 rays have been traced to calculate the irradiance and acceptance angle distribution on the optical surfaces. Similar to the inverse ray tracing method shown in Fig. 15, increasing the discretization density of the optical surfaces only yield minor increase of the accuracy, while the number of traced rays are linearly dependent on the number of subareas and so is the calculation time.

In contrary, the Monte Carlo method traces 4×10^{10} rays, taking 11 hours to produce a result with a relatively poor SNR (see Fig. 17). Therefore, an acceleration factor of 5600 is achieved for this case. Additionally, if the SNR of the PST of M1 is increased to be the same as that of M2, the total calculation time is increased to 22 hours. The speed advantage of the hybrid method is particularly pronounced for this case because the BRDF is concentrated in a small angular range under grazing incidence, making the random sampling of the BRDF in the Monte Carlo method extremely slow.

## 4. Critical review

In the last two sections we have used examples to discuss the principles of the three methods of calculating the stray light level on the image plane based on the phase space model. In this section, we present a critical review and detailed comparison of the three methods. Based on the critical review, we discuss the applicability and limitations of each method as well as selection criteria in different simulation scenarios.

Table 1 shows a detailed comparison of the applicability and limitations of the three methods, from which we can see that the intermediate image method should be used as long as the system is circularly symmetric, while for systems without circular symmetry, the inverse raytracing method should be used if no detector discretization is required. The hybrid method is the only option if the scattered irradiance distribution on the detector is to be calculated for systems without circular symmetry. Therefore, the hybrid method is currently the most general method that is suitable for nearly all optical systems, while its major weakness being the lack of robustness due to the neglection of the non-linear field dependent aberrations.

In general, the phase space model is completely geometric, and diffraction effects are not considered. The scattered light distribution on the detector is obtained by incoherently superposing the scattered light from different surface subareas. Therefore, the phase space model is expected to yield accurate results as long as the correlation area of the residual surface error is smaller than the area of the surface subareas, so that the scattered light from different subareas can be considered to be incoherent. On the other hand, the Monte Carlo method assumes an infinitely small correlation area of the residual surface error, which means that it can only model scattering due to high frequency surface roughness, while the phase space model is able to model scattering from mid-spatial-frequency (MSF) errors by selecting a proper size for the surface subareas.

As discussed in Sec. 3, the phase space model shows great advantage over the Monte Carlo method in terms of accuracy and runtime. Due to the parallelization of the subarea-pixel coupling algorithm, the intermediate image method is the most efficient, while the inverse raytracing and hybrid methods are less efficient because processing the reverse raytracing data is time-consuming. In order to evaluate the relationship between the accuracy of the phase space model with the sampling density and runtime, we use an example of a single lens imaging system with a large FOV shown in Fig. 18(a) and calculate the PST of the front surface by the intermediate image method, the hybrid method, and the Monte Carlo method.

The logarithmically scaled RMS relative error of the result calculated by the intermediate image method with different numbers of rings used for surface discretization is depicted in Fig. 18(b). From this figure we can see that the relative error rapidly converges to zero as the number of rings increases, and the relative error is larger in the presence of apodizations, meaning that more rings are needed if the irradiance distribution on the optical surface is not uniform. Figure 18(c) shows the dependence of the RMS relative error on the runtime. Large runtime differences between the methods leads to significant separation of their curves, and here the phase space model holds the clear advantage. In Fig. 18(c), from left to right, the runtime is increased due to the increase of the sampling density of the optical surface or the increase of the number of traced rays. From Fig. 18(c), we see that the intermediate image method is the most efficient and its error converges rapidly to zero, while the hybrid method is slower than the intermediate image method but still much faster than the Monte Carlo method. It should also be noted that the error of the hybrid method converges to a finite value instead of zero, which results from the fact that the large distortion, astigmatism, and Petzval curvature introduced by the rear surface of the lens due to the large FOV cannot be accurately modelled by the linear transformations used in the hybrid method. As indicated by the green dashed line in Fig. 18(c), for a RMS relative error of 0.05, the intermediate image method achieves an acceleration ratio of 10^{5} compared to the Monte Carlo method. Here the acceleration factor is strongly dependent on the required RMS relative error and is expected to be even higher if smaller RMS relative error is required.

## 5. Summary

In this paper, we have reported a phase space model for fast analysis of surface scattering in optical systems as well as three methods to implement this model in simulations. The results shown in the two examples demonstrate that the phase space model is able to efficiently simulate surface scattering in optical systems with arbitrary geometries with accurate results, while achieving an acceleration ratio of 50 to 10^{5} compared to the Monte Carlo method.

Among the three methods, the hybrid method is the most universal and can be applied to calculate the PSTs of systems with arbitrary geometries. However, the robustness of the hybrid method is limited by its use of linear transformation matrices to distort the detector grid. In future work, it should be improved by modelling higher order aberrations in the calculation of the detector grid distortion.

## Appendix A: Isoenergetic sampling of incident rays by the Fibonacci grid

The Fibonacci grid guarantees quasi-isoenergetic sampling of the illumination ray bundle. Depending on whether the object space is finite or afocal, angular or spatial sampling by the Fibonacci grid should be applied.

If the object space is finite, the rays should be uniformly distributed on the spherical cap defined by the incident ray cone. In this case, the deflection angle *θ* and azimuthal angle *φ* of the rays sampled on the Fibonacci grid are calculated as [12]:

*i*is the index of the ray,

*N*is the total number of sampled rays and

*NA*is the object space numerical aperture of the optical system.

In case of afocal object space, the rays should be uniformly distributed on the entrance pupil, and the spatial coordinates of the rays are calculated in the polar coordinate system as follows:

*i*is the index of the ray,

*N*is the total number of sampled rays and

*D*is the entrance pupil diameter. Figure 19 shows a Fibonacci grid with 1000 sampling points calculated with Eq. (6).

Furthermore, it should be noted that the Fibonacci sampling is quasi-isoenergetic, meaning that the sampling is isoenergetic only when the Fibonacci grid is denser than the grid on which the optical surface is discretised. In our simulations we have fixed the density of the Fibonacci grid such that there are at least 50 rays in each subarea of the optical surface. Once this condition is fulfilled, the sampling of the entrance pupil is considered to be isoenergetic and apodizations can be modelled by assigning weightings to the incident rays according to their pupil coordinates.

## Disclosures

The authors declare no conflicts of interest.

## References

**1. **T. S. Chou, “GUERAP: General Unwanted Energy Rejection Analysis Program User's Manual[R],” Honeywell Inc St. Petersburg FL Aerospace Division (1972).

**2. **E. Fest, * Stray Light Analysis and Control*, (SPIE Press, Bellingham, Washington, 2013).

**3. **M. G. Dittman, E. Donley, and F. Grochocki, “Deterministic sequential stray light analysis,” Proc. SPIE **7794**, 77940T (2010). [CrossRef]

**4. **G. L. Peterson, “Analytic expression for in-field scattered light distribution,” Proc. SPIE **5178**, 184 (2004). [CrossRef]

**5. **A. W. Greynolds, “Stray light computations: Has nothing changed since the 1970s?” Proc. SPIE **6675**, 66750B (2007). [CrossRef]

**6. **X. Lu, O. Rodenko, Y. Zhang, and H. Gross, “Efficient simulation of autofluorescence effects in microscope lenses,” Appl. Opt. **58**(13), 3589–3596 (2019). [CrossRef]

**7. **X. Lu, Y. Zhang, and H. Gross, “General analysis and optimization strategy to suppress autofluorescence in microscope lenses,” Appl. Opt. **58**(27), 7404–7415 (2019). [CrossRef]

**8. **H. Gotoh and H. Yagi, “Solid angle subtended by a rectangular slit,” Nucl. Instrum. Methods **96**(3), 485–486 (1971). [CrossRef]

**9. **A. Krywonos, “Predicting surface scatter using a linear systems formulation of non-paraxial scalar diffraction,” Ph.D. Dissertation, UCF (2006)

**10. **M. G. Dittman, “K-correlation power spectral density and surface scatter model,” Proc. SPIE **6291**, 62910R (2006). [CrossRef]

**11. **P. Kirkpatrick and A. V. Baez, “Formation of Optical Images by X-Rays,” J. Opt. Soc. Am. **38**(9), 766 (1948). [CrossRef]

**12. **CR Drost, “The golden spiral method,” https://stackoverflow.com/questions/9600801. May 2017. Last accessed 2020-07-15