
Substrip-based registration and automatic montaging of adaptive optics retinal images


Abstract

Precise registration and montaging are critical for high-resolution adaptive optics retinal image analysis but are challenged by rapid eye movement. We present a substrip-based method to improve image registration and facilitate the automatic montaging of adaptive optics scanning laser ophthalmoscopy (AOSLO) images. The program first batches the consecutive images into groups based on a translation threshold and selects an image with minimal distortion within each group as the reference. Within each group, the software divides each image into multiple strips and estimates each strip's translation by calculating the normalized cross-correlation with the reference frame using two substrips at the ends of the whole strip, producing a registered image. The software then aligns the registered images of all groups, also using substrip-based registration, thereby generating a montage with cell-for-cell precision in the overlapping areas of adjacent frames. The algorithm was evaluated with AOSLO images acquired in human subjects with normal macular health and in patients with age-related macular degeneration (AMD). Images with a motion amplitude of up to 448 pixels in the fast scanner direction over a frame of 512 × 512 pixels can be precisely registered. Automatic montages spanning up to 22.6 degrees on the retina were achieved with cell-to-cell precision and a low misplacement rate of 0.07% (11/16,501 frames) in normal eyes and 0.51% (149/29,051 frames) in eyes with AMD. Substrip-based registration significantly improved AOSLO registration accuracy.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Adaptive optics (AO) enhanced ophthalmoscopy, with its capability to visualize the retina at a cellular level in the living eye, has become an important imaging modality for studying retinal diseases and facilitating vision science research [1,2]. However, AO imaging is challenged by continual eye motion ranging from large saccades to small microsaccades and drifts [3], even during fixation [4,5]. Eye movements cause misalignment between successive images in a series and distortion within a frame. The influence of eye movement-induced errors is particularly significant in imaging systems based on raster scanning, e.g., adaptive optics scanning laser ophthalmoscopy (AOSLO) [6,7]. Thus, image registration is necessary and critical for AO retinal image analysis [6,7]. In AOSLO image registration, the strip-based image cross-correlation method, first introduced by Vogel and colleagues and further improved by other groups in later developments, has been a major strategy for correcting intra- and interframe translation [6–8]. This method divides an image into a series of strips in the slow scanning direction, so that each strip contains minimal motion artifacts. The translation between every strip and the corresponding reference strip is estimated by image correlation. An important byproduct of this approach is the precise measurement of eye movement with high temporal frequency [6,7], which has enabled real-time retinal tracking and image stabilization [9–14] and fine assessment of fixational stability [6,15,16].

The strip-based image registration, however, is challenged by large retinal motion; e.g., when the eye movement exceeds 30 pixels, the registration accuracy degrades [10]. Many frames in a series must be discarded due to inaccurate estimation of the large movement. Consequently, the final registered image comprises only a small number of frames with minor motion [17]. Recently, Zhang and colleagues reported a strategy to enhance the ability to register images with large motions. This method first estimates the image translation at the whole-frame level and creates a synthetic reference frame with an expanded field of view (FOV). Then, all images are registered to this reference using the strip-based method, producing a registered image with a larger FOV than that obtained using a single reference frame [17].

AO image montaging is another crucial task for accurately assessing retinal structure, e.g., investigating photoreceptor spatial distribution on the retina [18]. The montage of contiguous frames requires cell-to-cell precision in the overlapping region. Manual montaging has often been a practical approach, but it demands considerable time [18]. Li and co-workers presented an AOSLO automatic montage using local scale-invariant features [19]. Chen and colleagues developed an algorithm using scale-invariant feature transforms (SIFT) and random sample consensus (RANSAC) [18] and demonstrated automatic montaging of confocal, split-detection, and dark-field images [18]. Recently, they reported a constellation-based alignment strategy to minimize the montaging inaccuracy caused by the intensity variation in AO images [20]. Since these methods work with registered single images, the montage is susceptible to intraframe distortions within individual images [21,22].

In this study, we developed a substrip-based algorithm to improve AOSLO image registration and automatic montage.

2. Methods

2.1 Strip and substrip based image registration: the concept

In general, to register a series of consecutive images, the translation between an image (I) and the reference frame (R) can be estimated by the normalized cross-correlation (NCC) [23], which can be calculated by fast Fourier transform (FFT) [10], to find the position with the highest correlation [24–26]:

$$C(i,j) = \sum_{x,y} R(x,y)\, I(x - i,\ y - j)$$
where C(i, j) is the cross-correlation value at the position (i, j), R(x, y) and I(x, y) are the reference and the image intensity at the coordinates (x, y), respectively.
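
For illustration, the translation between two same-sized frames can be recovered from the peak of an FFT-based circular cross-correlation. The following MATLAB sketch is our own minimal illustration (not the authors' code); the variable names ref and img are placeholders.

```matlab
% Estimate the translation of img relative to ref from the peak of the
% FFT-based circular cross-correlation (cf. Eq. (1)). Illustrative sketch.
function [dx, dy] = frameShiftFFT(ref, img)
    [H, W] = size(ref);
    xc = real(ifft2(conj(fft2(double(ref))) .* fft2(double(img))));
    [~, idx] = max(xc(:));                    % peak of the correlation surface
    [py, px] = ind2sub([H, W], idx);
    dy = py - 1;  dx = px - 1;                % zero-based circular shifts
    if dy > H/2, dy = dy - H; end             % wrap to signed displacements
    if dx > W/2, dx = dx - W; end
end
```

For example, for a noise-free frame, img = circshift(ref, [5, -3]) returns dy = 5 and dx = -3.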

The cross-correlation result is determined by multiple factors, including the translation between the two frames, the intra-frame distortions, and the imaging noise. In strip-based AO image registration, an image is divided into a series of strips along the fast scanning direction [6]. Compared to the reference frame, an image strip to be registered (I) can be considered a linear superimposition of three images or components (Fig. 1): the first contains the portion overlapping with the reference frame and is padded with zeros in the non-overlapping area (Iin); the second comprises the non-overlapping portion (i.e., outside the reference frame) and is padded with zeros in the overlapping area (Iout); and the third consists of the noise at each pixel of the image to be registered (Inoise). Generally, the image noise can be removed or reduced by applying a filter (e.g., a Gaussian kernel with a standard deviation of 1 pixel in this manuscript) to the raw images before computing the NCC. Thus, the cross-correlation can be expressed as,

$$C(i,j) = \sum_{x,y} R(x,y)\,[\,I_{in}(x - i,\ y - j) + I_{out}(x - i,\ y - j)\,]$$

Including Iout in the calculation can result in “frame out errors” [10,17]. To minimize the impact of Iout, the retinal motion may be estimated using just the portion of the strip within the overlap region. We refer to this approach as substrip-based registration (Fig. 1), expressed as:

$$C(i,j) = \sum_{x,y} R(x,y)\, I_{in}(x - i,\ y - j)$$

Fig. 1. Substrip-based registration. The substrip used for calculating the cross-correlation is a portion of the whole strip and lies within the overlap region of the strip and the reference frame.
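
As a concrete illustration of Eq. (3), the sketch below correlates a 64 × 16 pixel substrip, taken from the part of a strip assumed to lie inside the overlap region, with the reference frame. This is our own minimal sketch, not the authors' implementation; it assumes the MATLAB Image Processing Toolbox (imgaussfilt, normxcorr2), and refFrame, curFrame, rowStart, and colStart are placeholder variables.

```matlab
% NCC of a substrip against the reference frame (Eq. (3)). The substrip is
% the portion of the whole strip assumed to lie within the overlap region.
ref   = imgaussfilt(double(refFrame), 1);     % suppress noise (1-pixel SD Gaussian)
img   = imgaussfilt(double(curFrame), 1);
strip = img(rowStart:rowStart+15, :);         % one strip: 16 rows x 512 columns
sub   = strip(:, colStart:colStart+63);       % 64-pixel-wide substrip inside overlap
ncc   = normxcorr2(sub, ref);                 % NCC surface over the reference frame
[peakVal, idx] = max(ncc(:));                 % peak value and location (cf. Eq. (4))
[yPeak, xPeak] = ind2sub(size(ncc), idx);
```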

2.2 Assess the impact of substrip width on the NCC and retinal motion measurement

We collected 25 AOSLO videos acquired in 11 eyes of 8 human subjects randomly selected from a group of elderly participants (aged 56-71 years) enrolled in a study of age-related macular degeneration (AMD) [27,28]. The samples include 2 eyes of 2 subjects with normal macular health (step 1 on the Age-Related Eye Disease Study (AREDS) 9-step severity scale), 5 eyes with early AMD (AREDS steps 2-4), and 4 eyes with intermediate AMD (AREDS steps 5-8). The AOSLO image acquisition was detailed elsewhere [28] and is summarized here.

The AOSLO is a research instrument developed in our lab [29–31]. It uses a low-coherence superluminescent diode as the light source and acquires retinal images over a field of view of 1.3° inside the human eye at a frame rate of 15 frames/second. A frame consists of 512 × 512 pixels. Before imaging, the subject's pupil was dilated with one to two drops of 1.0% tropicamide and 2.5% phenylephrine. During imaging, the AOSLO targeted the macular cone photoreceptors. The participant's head was aligned and stabilized using a head mount with a chinrest. The subject's gaze was directed by a fixational target (a green light dot) moving on a calibrated grid. At each grid point, the light dot stopped for ∼3-5 seconds, during which 45 to 75 frames were acquired. Retinal images were recorded continuously across an area of 15° × 15° to 20° × 20°.

To evaluate the impact of substrip width on the NCC and retinal motion estimation, we minimized the influence of significant eye movement and retinal distortion on the calculation of the NCC by choosing video episodes acquired without eye blinking or significant distortions. Each episode comprises ∼30 frames. We divided every frame (480 × 512 pixels) into 30 strips (512 × 16 pixels) and calculated the NCC coefficients and the motion using a series of substrips with widths from 1 to 512 pixels. This process disclosed the impact of substrip width on the NCC and retinal motion estimation and provided a heuristically optimal substrip width for image registration (see the Results section).
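
A sketch of this evaluation is shown below, under our own naming assumptions (strip and ref denote one noise-filtered strip and its reference frame, as in the previous sketch); for brevity it sweeps only a power-of-two subset of the 1 to 512 pixel range tested in the study.

```matlab
% Sweep the substrip width and record the NCC peak and the implied shift,
% mirroring the evaluation of Section 2.2 (illustrative sketch only).
widths = 2.^(1:8);                        % 2, 4, ..., 256 pixels
peaks  = zeros(size(widths));
shifts = zeros(numel(widths), 2);
for k = 1:numel(widths)
    w   = widths(k);
    sub = strip(:, 1:w);                  % substrip taken from one end of the strip
    ncc = normxcorr2(sub, ref);
    [peaks(k), idx] = max(ncc(:));
    [yp, xp] = ind2sub(size(ncc), idx);
    shifts(k, :) = [xp - w, yp - size(sub, 1)];   % Eq. (4): peak minus substrip size
end
% Consistent shifts over a range of widths indicate a reliable estimate (Fig. 5).
```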

2.3 Substrip-based registration and montage: the algorithm

The algorithm (Fig. 2) was written with MATLAB (R2023a; The MathWorks Inc., Natick, MA). The specific process and technical strategy are described here.

Fig. 2. AO image registration and montage based on the substrip method.

2.3.1 Pre-processing

First, the algorithm crops the top 32 lines of each frame to remove the distorted portion caused by the vertical galvo scanning at the beginning of the frame, yielding images of 512 × 480 pixels (1.3° × 1.2°). Then, it corrects for the sinusoidal distortion of the retinal image caused by the resonant scanning.
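
The sketch below illustrates these two pre-processing steps. The cropping follows the text; the desinusoiding shown here is a generic resampling approach under an assumed sinusoidal pixel-position profile, not necessarily the authors' calibration, and rawFrame is a placeholder variable.

```matlab
% Pre-processing sketch: crop the first 32 distorted lines, then resample each
% row onto a uniform grid to undo the resonant-scanner (sinusoidal) distortion.
raw = double(rawFrame);                       % 512 x 512 raw frame
img = raw(33:end, :);                         % drop the top 32 lines -> 480 x 512
[H, W] = size(img);
% Assumed acquired pixel positions along the fast axis (half sinusoid period):
xAcq = (sin(linspace(-pi/2, pi/2, W)) + 1) / 2 * (W - 1) + 1;
xUni = 1:W;                                   % uniform target grid
corrected = zeros(H, W);
for r = 1:H
    corrected(r, :) = interp1(xAcq, img(r, :), xUni, 'linear');
end
```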

2.3.2 Automatic grouping and selection of a reference frame for each group

The program divides the video into groups based on frame-to-frame translations. This is done by calculating the correlation between adjacent frames in the video using fast Fourier phase correlation [32]. Consecutive images with a correlation coefficient greater than a certain threshold (0.02, based on Ref. [32] and our previous manual image registration [28]) are assigned to the same group. When the correlation coefficient of two adjacent frames falls below this threshold, it indicates a notable change in the images, possibly due to a fast saccade or blinking. These images are not excluded from registration; instead, they serve as delimiters that start a new group (Fig. 3).

Fig. 3. Automatic grouping and selecting reference frames using fast Fourier transform phase correlation. The cross-correlation coefficients are computed between consecutive frames of the AOSLO video. Frames with coefficients ≥ 0.02 are grouped together, and the frame with the highest coefficient is chosen as the reference frame within the group.

Within each group, a frame that gives the highest sum of correlation coefficients with both the preceding and succeeding frames is selected as the reference for this group (Fig. 3).
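
A minimal sketch of the grouping and reference selection is given below (our own illustration; frames is assumed to be an H × W × N image stack, and the within-group reference is scored by its phase-correlation peaks with its immediate neighbors).

```matlab
% Group frames by FFT phase correlation and pick a reference per group.
N = size(frames, 3);
peakCorr = zeros(1, N-1);
for k = 1:N-1
    F1 = fft2(double(frames(:, :, k)));
    F2 = fft2(double(frames(:, :, k+1)));
    cps = F1 .* conj(F2);
    p   = real(ifft2(cps ./ max(abs(cps), eps)));  % phase-correlation surface
    peakCorr(k) = max(p(:));                       % coefficient used for grouping
end
isBreak = peakCorr < 0.02;                         % a low peak starts a new group
groupId = [1, 1 + cumsum(isBreak)];                % group label for every frame
% Within each group, the frame with the highest summed correlation with its
% preceding and succeeding frames becomes the group reference.
refIdx = zeros(1, groupId(end));
for g = 1:groupId(end)
    idx   = find(groupId == g);
    score = zeros(size(idx));
    for m = 1:numel(idx)
        if m > 1,          score(m) = score(m) + peakCorr(idx(m)-1); end
        if m < numel(idx), score(m) = score(m) + peakCorr(idx(m));   end
    end
    [~, best] = max(score);
    refIdx(g) = idx(best);
end
```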

2.3.3 Substrip-based registration in each group

At this step, each frame is divided into a series of strips (in this study, a frame was divided into 30 strips, each containing 16 rows). Since it is unknown whether a strip intersects with the reference frame and, if so, which end of the strip lies within the overlapping region, the software selects two substrips (i.e., 64 × 16 pixels) at the left and right ends of the strip and calculates the NCC of each substrip with the reference frame. If the strip overlaps with the reference frame and the overlap width is at least the substrip width, this operation identifies the substrip (i.e., one of the two ends) that is entirely within the overlap region and yields a higher NCC coefficient than the other (which may be either entirely or partially outside the overlap area). If the higher NCC coefficient exceeds a predetermined threshold (0.7, adopted from our AOSLO image registration [28]), the entire strip is shifted by the motion estimated with this substrip. The strip is discarded if both NCC coefficients are below the threshold.

Denoting the coordinates of the peak NCC coefficient as (Xpeak, Ypeak) and the width and height of the substrip as Wsub and Hsub, the calculated movement (ΔXsub, ΔYsub) of the substrip relative to the reference frame is

$$(\Delta X_{sub},\ \Delta Y_{sub}) = (X_{peak},\ Y_{peak}) - (W_{sub},\ H_{sub})$$

The motion (ΔX, ΔY) of the entire strip estimated from the substrip can be expressed as,

$$(\Delta X,\ \Delta Y) = (1,\ 1) - [\,S_x(p),\ S_y(n)\,] + (\Delta X_{sub},\ \Delta Y_{sub})$$
where (1,1) represents the coordinates of the top-left corner of the substrip, and [Sx(p), Sy(n)] represents its corresponding coordinates in the whole image to be registered. Here, p indicates the position of the substrip within the whole strip (left or right end), and n denotes the nth strip in the image. For example, as described above, if the NCC coefficient of the left substrip is greater, the starting horizontal coordinate of the strip is Sx = 1; if the right substrip has the larger NCC, Sx = 512 − 64 + 1 = 449.
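
The following sketch puts Eqs. (4) and (5) together for one strip, trying the substrips at both ends and keeping the better one. It is our own illustration under assumed variable names (strip and ref as in the earlier sketches, n the strip index); the actual implementation may differ.

```matlab
% Estimate a whole-strip translation from a 64 x 16 substrip at either end of
% the strip (Section 2.3.3). Illustrative sketch; requires normxcorr2.
Wsub = 64;  Hsub = 16;  nccThresh = 0.7;
candCols = [1, size(strip, 2) - Wsub + 1];     % Sx for the left and right ends
bestVal  = -Inf;
for c = 1:2
    sub = strip(:, candCols(c):candCols(c)+Wsub-1);
    ncc = normxcorr2(sub, ref);
    [v, idx] = max(ncc(:));
    if v > bestVal
        bestVal = v;
        [yPeak, xPeak] = ind2sub(size(ncc), idx);
        Sx = candCols(c);                      % Sx(p) in Eq. (5)
    end
end
if bestVal >= nccThresh
    dSub  = [xPeak, yPeak] - [Wsub, Hsub];     % Eq. (4): substrip displacement
    Sy    = (n - 1) * Hsub + 1;                % Sy(n): first row of the n-th strip
    shift = [1, 1] - [Sx, Sy] + dSub;          % Eq. (5): whole-strip motion
else
    shift = [];                                % both ends below threshold: discard
end
```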

In practical computation, dark areas in the image (such as shadows of blood vessels or specific lesions) that lack sufficient structural information may lead to false peaks in the NCC calculation and result in registration errors. To address this problem, the algorithm detects these dark areas based on the image's pixel values. If a substrip is in such an area, the algorithm moves the substrip from the end of the strip towards the middle, avoiding these regions to ensure accurate registration.

The algorithm includes a self-check and correction routine termed outlier strip removal and imputation (ORI). Within the 30 strips of a single frame, any strip whose motion exceeds ± 2 standard deviations (SD) of the eye movements is considered an outlier (due to local distortion or insufficient structural features, Figs. 4(a) & 4(b)) and is removed (Figs. 4(c) & 4(d)). Subsequently, a missing strip (e.g., the Mth strip) is imputed by reducing the search range of the NCC peak (Figs. 4(e)–4(g)). Specifically, in the horizontal direction, the algorithm calculates the average horizontal coordinate of the peaks in the NCC images of all strips whose motion is within ± 2 SD, denoted <X>, and sets the reduced horizontal search range as [<X> − Wsub, <X> + Wsub], where Wsub is the substrip width. In the vertical direction, since the strips shift downwards progressively, the positions of the peaks also move downward accordingly. Thus, the search range for the peak can be set between the vertical coordinates of the two adjacent peaks, [Y(peak, M−1), Y(peak, M+1)]. Within this range, if the NCC peak exceeds the threshold (the same as in the previous step), the NCC peak's coordinates are used to estimate the movement of the corresponding strip with Eq. (5). If the NCC peak in this range is below the threshold, the corresponding strip remains vacant. If the number of qualifying strips is less than half of the total strips, the frame is discarded.
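
A compact sketch of the ORI bookkeeping, under our own data layout assumptions (stripShift is a 30 × 2 array of per-strip motions, valid is a 30 × 1 logical flagging strips that passed the NCC threshold, and peakX/peakY hold the NCC peak coordinates of each strip):

```matlab
% Outlier strip removal and imputation (ORI) sketch, Section 2.3.3.
mu  = mean(stripShift(valid, :), 1);
sd  = std(stripShift(valid, :), 0, 1);
out = valid & any(abs(stripShift - mu) > 2 * sd, 2);   % beyond +/- 2 SD -> outlier
valid(out) = false;                                     % remove outlier strips
% To impute a missing M-th strip, restrict the NCC peak search:
meanX  = mean(peakX(valid));                            % <X>, mean peak column
xRange = round([meanX - Wsub, meanX + Wsub]);           % horizontal search window
yRange = [peakY(M-1), peakY(M+1)];                      % bracketed by the neighbors
% A peak found inside this window is accepted only if it exceeds the same 0.7
% threshold; otherwise the strip stays vacant. Frames with fewer than half of
% their strips registered are discarded.
```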

Fig. 4. Outlier removal and image imputation. (a) The initial computation of retinal motion. The movements beyond ± 2 SD are outliers. (b) The map of the registered frame/strip number. The yellow strip overlaps with other strips due to incorrect motion estimation. The color bar shows the number of registered strips. (c) Retinal movements without outliers. (d) The map of the registered frame/strip number shows the missing strip after outlier strip removal. (e) Imputation of the missing motion. (f) The map of the registered frame/strip number after imputation. (g) Image imputation is achieved by reducing the search range based on the peak coordinates of normal substrips within the image. The color bar represents the correlation coefficient.

The intraframe distortion of the reference frame is mitigated following the methods reported by Vogel et al. [6], Bedggood et al. [21] and Azimipour et al. [33]. After the intraframe distortion is corrected, the relative movements of all strips in the image to the reference frame are adjusted accordingly with the average displacement of the corresponding reference strips.

After the substrip-based registration, ORI, and intraframe distortion mitigation, the program obtains the relative motion of each strip to the reference frame, and each point in the AOSLO image can be mapped to the reference frame's coordinate system. Figures 4(b), 4(d), and 4(f) show the number of frames registered at each pixel of the reference frame. By summing the intensities at each point and dividing by the number of registered frames at that point, we obtain the registered AO image of each group.
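
The averaging step can be implemented with a pair of accumulation buffers, as in the sketch below (our own illustration; stripList, canvasH, and canvasW are assumed bookkeeping structures, with each entry holding a strip's pixels, its source rows/columns, and its estimated shift).

```matlab
% Build the registered image of a group by accumulating shifted strips and
% normalizing by the per-pixel contribution count (illustrative sketch).
sumImg = zeros(canvasH, canvasW);      % canvas in the reference coordinate system
cntImg = zeros(canvasH, canvasW);
for s = 1:numel(stripList)             % all accepted strips of all frames in the group
    rows = stripList(s).rows + stripList(s).dy;     % destination rows on the canvas
    cols = stripList(s).cols + stripList(s).dx;     % destination columns
    sumImg(rows, cols) = sumImg(rows, cols) + stripList(s).pixels;
    cntImg(rows, cols) = cntImg(rows, cols) + 1;
end
registered = sumImg ./ max(cntImg, 1); % average intensity; avoid divide-by-zero
```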

2.3.4 Substrip-based montaging

Once the program has completed the image registration in all groups, it automatically montages the registered images by group number with cell-for-cell precision in the overlapping areas. The process is initiated by registering the images of groups 1 and 2 using the substrip-based registration to generate “Registered Montage 1” (Fig. 2), which serves as the reference to align the registered image of group 3. This process continues until the last group's registered image is montaged. A major challenge of this process is that the misplacement of a single image can corrupt the montage. Therefore, the algorithm involves multiple routines to estimate and validate retinal motion to ensure accurate montaging.

Step 1: Estimate the translation between the images to be montaged at the whole-image level. Suppose a sub-montage of the registered images in groups 1 to (N−1) has been generated, which will serve as the reference to align the registered image of group N. The software uses the entire frames to compute the image intensity and phase correlations between these two images. If the difference between the shifts estimated by intensity correlation and phase correlation is less than 3 pixels (about the size of a foveal cone), the software considers both results valid. It then uses the mean shift as the whole-image-level motion and proceeds to Step 5 for strip-level montaging. The expanded sub-montage serves as the new reference to align the registered image of the next group through this whole-image-level motion estimation and substrip-registration-based montaging until the registered image of the last group is montaged.
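
The consistency check of Step 1 can be sketched as follows (our own illustration; refMontage and groupImg are placeholder variables assumed to be padded to the same size).

```matlab
% Step 1 sketch: accept the whole-image shift only if intensity correlation
% and phase correlation agree to within 3 pixels.
[H, W] = size(refMontage);
A  = fft2(double(refMontage));  B = fft2(double(groupImg));
cc = conj(A) .* B;
surfaces = {real(ifft2(cc)), ...                      % intensity correlation
            real(ifft2(cc ./ max(abs(cc), eps)))};    % phase correlation
shifts = zeros(2, 2);
for k = 1:2
    [~, idx] = max(surfaces{k}(:));
    [py, px] = ind2sub([H, W], idx);
    dx = px - 1; if dx > W/2, dx = dx - W; end        % wrap to signed shifts
    dy = py - 1; if dy > H/2, dy = dy - H; end
    shifts(k, :) = [dx, dy];
end
if max(abs(shifts(1, :) - shifts(2, :))) < 3
    wholeShift = mean(shifts, 1);     % use the mean shift and go to Step 5
else
    wholeShift = [];                  % disagreement: fall back to Step 2
end
```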

However, if the difference between the shifts estimated by image intensity and phase correlations exceeds 3 pixels, the algorithm goes to Step 2 to determine the whole-image-level shift using a more robust calculation.

Step 2: Estimate the translation between the images to be montaged using sub-regions. The software divides the image to be montaged into 9 sub-regions (each comprising 170 × 160 pixels in this paper). Then, it selects an inner area of 64 × 64 pixels with the brightest average intensity within each sub-region to calculate the NCC with the reference image (the sub-montage generated in the preceding steps). If the image overlaps with the reference and the overlap is at least the size of a sub-region, this process will find at least one sub-region lying within the reference frame, yielding a high NCC peak and an accurate estimate of the image displacement.
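
A sketch of this sub-region search is given below (our own illustration, assuming normxcorr2; groupImg and refMontage are placeholders). It returns the shift associated with the highest NCC peak among the 9 sub-regions, which Step 3 then treats as the “probable overall shift.”

```matlab
% Step 2 sketch: within each of 9 sub-regions, take the brightest 64 x 64
% patch and correlate it with the reference sub-montage.
img = double(groupImg);
[H, W] = size(img);
rows = round(linspace(1, H + 1, 4));            % 3 x 3 grid of sub-regions
cols = round(linspace(1, W + 1, 4));
best = struct('val', -Inf, 'shift', [0 0]);
for i = 1:3
    for j = 1:3
        sub = img(rows(i):rows(i+1)-1, cols(j):cols(j+1)-1);
        m   = conv2(sub, ones(64)/64^2, 'valid');    % mean of every 64 x 64 patch
        [~, idx] = max(m(:));
        [r, c]   = ind2sub(size(m), idx);            % brightest inner area
        patch    = sub(r:r+63, c:c+63);
        ncc      = normxcorr2(patch, double(refMontage));
        [pv, pidx] = max(ncc(:));
        if pv > best.val
            [py, px] = ind2sub(size(ncc), pidx);
            % whole-image shift = patch offset in reference - patch offset in image
            best.val   = pv;
            best.shift = [px - 64, py - 64] - [cols(j) + c - 2, rows(i) + r - 2];
        end
    end
end
probableShift = best.shift;                     % "probable overall shift" (Step 3)
```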

Step 3: Validate the whole-image translation. The program takes the shift associated with the highest NCC among the 9 sub-regions from Step 2 as the “probable overall shift” of the whole image and uses it to determine the overlapping region of the image to be montaged and the reference (i.e., the sub-montage generated in the preceding steps). Next, the software calculates the NCC of the overlapping regions and compares the computed translation with the “probable overall shift.” If the difference is smaller than 3 pixels, the program considers that the current frame can be correctly connected to the preceding sub-montage and advances to Step 5. If not, it goes to Step 4.

Step 4: Change the reference frame. A significant difference between the calculated frame translation and the “probable overall shift” indicates that the current frame cannot be added to the montage. Suppose the group number of the current image is N in the montaging series. The software then selects the registered sub-montage (N−3) as a new reference to recalculate the NCC with the images in groups (N−2), (N−1), and N. If this also fails, the montage is concluded at group (N−1), and a new montage series starts from group N.

Step 5: Montage at the strip level using substrip-based registration. As described in Section 2.3.3, the algorithm divides the images into strips, each containing 16 lines, and calculates the translations of these strips using the substrips within the overlapping area. The software computes the standard deviation (SD) of the movements of all strips. The montage is accepted if the whole-image shift calculated in the previous steps falls within ± 2 SD of the strip movements; otherwise, the algorithm reverts to Step 4. Finally, the program shifts each strip of the registered image of the Nth group according to the strip motion, producing a (sub)montage at the strip level. This (sub)montage then serves as the new reference for aligning the registered image of the (N + 1)th group. This process continues until all frames are montaged.
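
The acceptance test of Step 5 reduces to a simple consistency check between the whole-image shift and the distribution of strip-level shifts, sketched below under our own naming assumptions (stripShift holds the per-strip motions of the group image within the overlap, wholeShift the shift from Steps 1-3).

```matlab
% Step 5 sketch: validate the whole-image shift against the strip-level shifts
% before committing the group image to the montage.
mu = mean(stripShift, 1);                       % mean per-strip motion [dx dy]
sd = std(stripShift, 0, 1);                     % SD of the strip motions
if all(abs(wholeShift - mu) <= 2 * sd)
    montageOK = true;    % shift every strip by its own motion and blend it in;
                         % the expanded montage becomes the next reference
else
    montageOK = false;   % revert to Step 4 and try an earlier sub-montage
end
```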

3. Results

3.1 Impact of substrip width on NCC coefficient and retinal motion estimation

The examination of 25 AOSLO videos acquired in 11 eyes of 8 human subjects disclosed a characteristic variation of the NCC coefficient and retinal motion with substrip width (Fig. 5): (I) When the substrip width is small, e.g., < 10 pixels, the NCC coefficient is high but the computed eye movements are inconsistent. (II) With increased substrip width, the NCC coefficients and the estimated retinal movements become consistent. (III) When the substrip width slightly exceeds the overlapping area of the two frames, the NCC coefficient decreases while the computed motion remains stable. (IV) With further increased substrip width, the NCC coefficient decreases and the computed motion becomes erroneous.

Fig. 5. The relationships between the NCC coefficient, the calculated retinal movement, and the substrip width. Woverlap is the width of the overlapping region of the frame and the reference. Wmin and Wmax represent the shortest and the longest substrip width that can be used to calculate the NCC and measure the retinal motion reliably.

In the 25 AOSLO videos of 11 eyes, 76% of the Wmin values are below 64 pixels, 85% below 128 pixels, and 99% below 256 pixels (Fig. 6(a)). Wmax varied from the width of the overlap region (Woverlap) to the width of the whole strip (Wwhole) (Fig. 6(b)). These data provided a reference for choosing an optimal substrip width for registration. In this study, we adopted a substrip width of 64 pixels for image processing.

Fig. 6. Substrip width for successful image registration. (a) The successfully registered frames and the minimum (Wmin) substrip widths. (b) The maximum substrip width Wmax for successful registration and the overlap of the strips and the reference frame. Each marker represents the average maximum width of all the substrips in an image.

3.2 Substrip-based vs. whole strip-based image registration [6, 7, 10, 12, 23]

When the eye movement is small, both methods produced good registration (Fig. 7). However, when the motion is large, the substrip-based approach yielded better results (Fig. 8(a)), whereas the whole strip method produced out-of-frame errors (Fig. 8(b)).

Fig. 7. AOSLO image registration when eye motion was small. (a) Substrip registration with a substrip size of 64 × 16 pixels. (b) Whole strip registration with 512 × 64 pixels. The scale bar is 50 µm. (c) The number of registered images using substrip registration. (d) The number of registered images using whole strip registration. The color bar represents the number of registered strips.

Fig. 8. AOSLO image registration when eye motion was large. (a) Substrip registration with a substrip size of 64 × 16 pixels. (b) Whole strip registration with 512 × 16 pixels. The scale bar is 50 µm. (c) The number of registered images using substrip registration. (d) The number of registered images using whole strip registration. The color bar represents the number of strips used for registration.

The substrip-based approach could register more frames than the whole strip method. With increased image translation, the success rate of the whole strip method decreased (black dashed line, Fig. 9). Meanwhile, the substrip & ORI method maintained a success rate > 85% over the translation range of 0 to 448 pixels (Fig. 9).

Fig. 9. Image registration success rates using different methods. The registration rates (the percentage of successfully registered frames relative to the total number of frames) of the four methods as a function of the translation. Whole: whole strip method. Whole strip & ORI: whole strip with outlier strip removal and imputation (ORI). Sub: substrip method. Sub & ORI: substrip with ORI. Substrip width: 64 pixels.

Figure 10 shows the eye movements measured in an AOSLO video using the substrip and whole strip methods. Since the out-of-frame errors affected the NCC calculation using the whole strip method, many strips were discarded due to low NCC values (Fig. 10(a)). In contrast, the substrip approach measured large movements (Fig. 10(b)). The ORI further improved the measurement accuracy (Fig. 10(c)), allowing more strips to be registered.

Fig. 10. Eye movements. Eye motion measured by (a) Whole strip method. (b) Substrip-based method without ORI. (c) Substrip with ORI. Total 86 frames. Black arrows point to the same reference frame for all methods. NCC coefficient threshold was 0.7. Strip height: 16 rows. (d) The magnified view of box (d) in panel (a). (e) The magnified view of box (e) in panel (c).

The two methods yielded identical measurements of the retinal motion when whole strip registration was successful (Fig. 11).

Fig. 11. The comparison of the retinal movement measured using the whole strip and substrip & ORI methods. (a) X-direction movement. (b) Y-direction movement.

Figure 12 presents the corresponding registered images. The substrip registration produced a larger image than the whole strip method (Figs. 12(a) & (b)). Furthermore, the substrip registration yielded an improved image signal-to-noise ratio by registering more strips/frames, thereby rendering a clearer photoreceptor structure (Figs. 12(c) & (d)).

Fig. 12. AOSLO image registration. (a) Image registered using the whole strip method. (b) Image registered using the substrip algorithm. Both images were obtained from the same video consisting of 86 frames using the same reference frame. The whole-strip method registered 11 frames, while the substrip method registered 76 frames. Rod photoreceptors are better revealed in the substrip registered image. (c) and (d), Magnified views of the yellow boxes in (a) and (b), respectively. Color arrowheads point to two noticeable differences. (e) and (f), The normalized image intensity profiles across the lines in panels (c) and (d). Scale bars: 100 µm.

3.3 Substrip-based montage

Figure 13(a) shows a montage of the AOSLO foveal images acquired in a normal human eye. The montage comprised 446 frames registered from a total of 558 consecutive frames. The video was automatically divided into 22 groups. Each group contained 5 to 112 frames. The registered image renders a smooth mosaic of well-resolved foveal cone photoreceptors. Figure 13(b) is the motion trajectory of the eye following the moving target, a clockwise movement that started from the top left corner of the scanning raster, returned to the starting point, and finally fixated on the scanning field center. The montage extends 3.1° × 2.6°.

Fig. 13. Automatic registration and montage of continuously recorded AOSLO foveal images from a normal healthy eye (Visualization 1). (a) The montage contains 446 frames out of 588 frames. Substrip size: 64 pixels × 16 pixels. NCC coefficient threshold: 0.7. Scale bar is 100 µm. (b) Eye motion trajectory during the acquisition. The eye fixation/movement started at the coordinate origin point (0,0). The dimensions are in degrees.

Figure 14 plots the eye movements during the image acquisition of Fig. 13, revealing large motion (several tens of arcminutes) during the eye tracking of the moving target (Fig. 14(a)) and small movements (a few arcminutes or less) resulting from the combination of drift and micro-saccades within a short duration (Fig. 14(b)).

Fig. 14. Eye motion measured from AOSLO images while the eye followed a moving fixational target. (a) Eye motion over 39.2 seconds. (b) Eye motion over 1.5 seconds. The discontinuities in the motion traces were caused by eye blinking or significantly large eye movements, such as saccades.

Figure 15 presents an AOSLO foveal montage acquired in a patient with AMD. The AOSLO video consisted of 1000 consecutive frames and was automatically divided into 63 groups. Each group contained 2 to 18 frames. The montage was composed of 620 frames and covers 2.7° × 2.3°.

Fig. 15. Automatic registration and montage of continuously recorded AOSLO foveal images acquired in the eye of a patient with AMD (Visualization 2). Substrip size: 64 pixels × 16 pixels. NCC coefficient threshold: 0.7. The scale bar is 100 µm. (b) Eye motion trajectory during the acquisition. The dimensions are in degrees. The eye fixation/movement started at the coordinate origin point (0,0).

Figure 16 shows the automatically montaged AOSLO images taken from the eyes of the normal subject and the patient with AMD shown in Figs. 14 & 15, together with the corresponding eye motion trajectories.

Fig. 16. Automatic registration and montage of AOSLO images. (a) AOSLO images acquired in a normal human eye (Visualization 3). The montage spans 22.6° × 4.7° including 2298 frames out of 3284 frames. (b) Eye motion trajectory during the acquisition of the images shown in (a). (c) AOSLO images taken in the eye of a patient with AMD extend 21.3° × 2.6° from 660 frames out of 1151 frames (Visualization 4). (d) Eye motion trajectory during the acquisition of the images shown in (c). The eye fixation/movement started at the coordinate origin point (0,0). Scale bars: 300 µm.

The software was evaluated using the AOSLO images acquired in 22 eyes of 22 human subjects, including participants with normal chorioretinal health (n = 11) and patients with intermediate AMD (n = 11). All of these images had been manually montaged with cone-for-cone precision in the overlapping areas in previous studies [28]. We compared these manual montages with the images registered and montaged by our software. In 16,501 frames of AOSLO images of the normal eyes, the software misplaced 11 (0.07%) frames in the montage, whereas in a total of 29,051 frames of AOSLO images of the patients with AMD, the software misplaced 149 (0.51%) frames. Hyporeflective areas lacking fine structural features due to the presence of large blood vessels or AMD lesions were the primary cause of misplacement.

4. Discussion

Through a systematic examination of the impact of non-overlapping or non-correlating structures on AO image registration by an NCC procedure (Figs. 5 & 6), we developed a substrip-based algorithm for AOSLO image registration and montage. Compared with conventional whole-strip-based registration, the substrip approach improves accuracy and robustness in processing AO images acquired in eyes with large motion. Further, it can automatically montage AOSLO images with cell-to-cell precision in normal and AMD-affected retinas.

4.1 Substrip vs. whole strip registration

Strip-based registration has been an established strategy for registering AOSLO images and measuring the eye's fine movement [7,10,23,34], but its performance is limited by the retinal motion amplitude. This study demonstrates that the NCC is affected by structures outside the overlapping areas of consecutive frames, which are determined by the retinal motion. The NCC coefficient calculated using the whole strip is small when the overlap is small, and the estimated eye motion can be erroneous (Fig. 5). Although a high NCC or low motion threshold can be adopted to ensure accurate registration in practice, e.g., excluding images with a translation > 30 pixels from the image registration [10], the registered image can then only be formed from a small number of frames, limiting the signal-to-noise ratio. This is particularly problematic for important tasks like in vivo imaging of the retinal pigment epithelium cells using autofluorescence, which requires hundreds of frames, and for subjects with retinal disease, who often have poor fixation stability, causing significant intraframe distortion and interframe translation.

Images discarded because of a low NCC calculated from whole strips are not necessarily of poor quality; rather, their large displacement from the reference frame prevents accurate correlation calculation. Using the substrip-based registration, we calculate the NCC using the overlap area of the image and the reference frame, thereby avoiding the errors induced by image content outside the overlap and enabling the successful registration of images with large-amplitude motion. Using a substrip with a width of 64 pixels, we could register images with a displacement of up to 448 pixels (Figs. 7 & 8). The new algorithm effectively increased the number of registerable images, with a success rate exceeding 85% in most cases (Figs. 9 & 10). The robust registration, which utilizes most of the images, produced a final image with an extended field of view and an enhanced signal-to-noise ratio, thereby better revealing retinal structures such as the rod photoreceptors (Fig. 12).

The key to substrip-based registration is that the NCC is assessed with common structures only in the overlapping region, thereby enhancing the registration [35,36]. However, an excessively short substrip may contain insufficient information for accurate correlation calculations (Fig. 5). Thus, the substrip should have a length that ensures accurate correlation calculations. Our study, using a well-selected group of human eyes with normal macular health and with AMD, provided an estimate of the optimal substrip width. A substrip width of 64 pixels works for most cases, and a larger substrip size can be adopted in cases where the 64-pixel substrip cannot achieve accurate registration.

4.2 Automatic montaging

Automatic montaging of the continuously recorded AOSLO images with cone-for-cone precision is a natural extension of the substrip-based registration (Figs. 13, 15 & 16). The software aligns adjacent frames using the structure in the overlapping area. However, the alignment is achieved through a substrip registration of the overlapping area of the image to be registered with the images already registered or montaged in preceding steps. Thus, this process differs from previously published methods using local scale-invariant features [19], scale-invariant feature transforms (SIFT) and random sample consensus (RANSAC) [18], and constellation-based alignment [20]. Those methods rely on multiple structural features in the adjacent frames and perform the registration over the whole frame. Our approach works at the substrip level by correctly identifying the overlapping areas in adjacent frames, with correctness ensured by multiple-parameter examination and validation (Section 2.3.4).

In most cases among the 22 subjects, including normal control participants and patients with early to intermediate AMD, with a total of 45,552 frames of AOSLO images, our program could montage consecutive frames with an overlapping area width of 64 pixels. This performance is comparable to, or more robust than, that of the pioneering work by Chen and co-workers, whose algorithm requires > 75-pixel columns of overlap in normal subjects and > 250-pixel columns in patients with retinitis pigmentosa (RP) [18]. However, we must acknowledge that this comparison may be inadequate since the disease conditions of the two groups of patients (AMD and RP) may differ in their impact on the retina and retinal images.

4.3 Precise measurement of eye motion

Like whole-strip-based image registration, which enables precise measurement of eye movements [15,16], the substrip-based registration and automatic montaging allow for precise measurement of retinal movements with improved accuracy and increased dynamic range (Figs. 13–16). AOSLO imaging of the eye following a moving target may be a potential tool for high-precision analysis of eye tracking during object pursuit.

4.4 Limitations and future work

Intraframe distortion and inter-frame rotation: Eye motion-induced retinal image distortions in AOSLO have challenged image registration and montaging [17,21,22,33]. In this study, we implemented the methods reported by Bedggood et al. and Azimipour et al. [21,33]. However, these methods require a large number of frames (at least hundreds) at a single position to eliminate intra-frame distortions [33]. Our AOSLO videos were continuously recorded while the subject followed a moving fixational target, so fewer than 75 frames were acquired at any specific location. Thus, the intra-frame distortion was not completely corrected, causing blur in the montage (Fig. 13). We have not implemented methods to correct image rotation [37]; thus, the image blur in the montage may also be attributed to this technical inadequacy. Future work will develop new strategies to mitigate distortion and image rotation effects.

Frame misplacement in the montage: The automated AOSLO montage has a higher misplacement rate in the images acquired in the patient eyes than in images of the normal subjects (0.51% vs. 0.07%), requiring further improvement. Lesion-caused hyporeflective areas and large backward saccades are recognized causes. Future studies need to identify other causative factors and improve the software.

Computing time: Using a GPU (GeForce RTX 3080 Ti, Nvidia Co., Santa Clara, CA) and MATLAB's GPU processing functions, the software can process images at 2 frames/second. A video of ∼1000 frames continuously recorded over a 20° field of view took ∼8 minutes to complete a montage, including automatic grouping, reference frame selection, intra-group registration, and image montage (Fig. 16(b)). The non-GPU-executable MATLAB functions and the data transfer between the CPU and GPU are the primary time-consuming processes. We will further optimize the algorithm by offloading most of the computations to the GPU and fully leveraging the GPU's parallel processing capabilities to achieve online image registration and montage.
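
As an illustration of this kind of offloading, the FFT-based correlation used throughout the pipeline can be moved onto the GPU with MATLAB's gpuArray mechanism (sketch below; requires the Parallel Computing Toolbox and a CUDA-capable device, and ref and img are placeholders).

```matlab
% Run the FFT-based cross-correlation on the GPU (illustrative sketch).
refG = gpuArray(double(ref));
imgG = gpuArray(double(img));
xc   = real(ifft2(conj(fft2(refG)) .* fft2(imgG)));   % executes on the GPU
[~, idx] = max(xc(:));
[py, px] = ind2sub(size(xc), gather(idx));            % bring the peak index back to the CPU
```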

5. Conclusions

The substrip-based registration approach provides an accurate measurement of retinal movements. It significantly improves AOSLO registration accuracy, allowing images with large motion to be registered and enabling automatic montaging. Automatic montaging reduces the manual labor required to generate high-quality, large field-of-view AO retinal images and promises to improve the productivity of image analysis.

Funding

National Institutes of Health (R01EY024378, R01EY034218); W. M. Keck Foundation; Carl Marshall and Mildred Almen Reeves Foundation; and Prevent Blindness/Dr. H. James and Carole Free Catalyst Award for Innovative Research Approaches for AMD.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results and the software presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. R. Williams, S. A. Burns, D. T. Miller, et al., “Evolution of adaptive optics retinal imaging [Invited],” Biomed. Opt. Express 14(3), 1307–1338 (2023). [CrossRef]  

2. J. I. W. Morgan, T. Y. P. Chui, and K. Grieve, “Twenty-five years of clinical applications using adaptive optics ophthalmoscopy [Invited],” Biomed. Opt. Express 14(1), 387–428 (2023). [CrossRef]  

3. D. Purves, G. Augustine, D. Fitzpatrick, et al., “Types of eye movements and their functions,” in Neuroscience, 2nd ed. (Sinauer Associates, Sunderland, MA, 2001).

4. R. M. Steinman, G. M. Haddad, A. A. Skavenski, et al., “Miniature eye movement,” Science 181(4102), 810–819 (1973). [CrossRef]  

5. D. Ott and W. Daunicht, “Eye movement measurement with the scanning laser ophthalmoscope,” Clinical Vision Sciences 7(6), 551 (1992).

6. C. R. Vogel, D. W. Arathorn, A. Roorda, et al., “Retinal motion estimation in adaptive optics scanning laser ophthalmoscopy,” Opt. Express 14(2), 487–497 (2006). [CrossRef]  

7. S. B. Stevenson and A. Roorda, “Correcting for miniature eye movements in high-resolution scanning laser ophthalmoscopy,” in Ophthalmic Technologies XV, (SPIE, 2005), 145–151.

8. S. B. Stevenson, A. Roorda, and G. Kumar, “Eye tracking with the adaptive optics scanning laser ophthalmoscope,” in Proceedings of the 2010 symposium on eye-tracking research & applications, (ACM, 2010), 195–198.

9. L. C. Sincich, Y. Zhang, P. Tiruveedhula, et al., “Resolving single cone inputs to visual receptive fields,” Nat. Neurosci. 12(8), 967–969 (2009). [CrossRef]  

10. Q. Yang, J. Zhang, K. Nozato, et al., “Closed-loop optical stabilization and digital image registration in adaptive optics scanning light ophthalmoscopy,” Biomed. Opt. Express 5(9), 3174–3191 (2014). [CrossRef]  

11. K. V. Vienola, B. Braaf, C. K. Sheehy, et al., “Real-time eye motion compensation for OCT imaging with tracking SLO,” Biomed. Opt. Express 3(11), 2950–2963 (2012). [CrossRef]  

12. D. W. Arathorn, Q. Yang, C. R. Vogel, et al., “Retinally stabilized cone-targeted stimulus delivery,” Opt. Express 15(21), 13731–13744 (2007). [CrossRef]  

13. Q. Yang, D. W. Arathorn, P. Tiruveedhula, et al., “Design of an integrated hardware interface for AOSLO image capture and cone-targeted stimulus delivery,” Opt. Express 18(17), 17841–17858 (2010). [CrossRef]  

14. C. K. Sheehy, Q. Yang, D. W. Arathorn, et al., “High-speed, image-based eye tracking with a scanning laser ophthalmoscope,” Biomed. Opt. Express 3(10), 2611–2622 (2012). [CrossRef]  

15. N. R. Bowers, A. E. Boehm, and A. Roorda, “The effects of fixational tremor on the retinal image,” J Vis 19(11), 8 (2019). [CrossRef]  

16. N. R. Bowers, J. Gautier, S. Lin, et al., “Fixational eye movements in passive versus active sustained fixation tasks,” J Vis 21(11), 16 (2021). [CrossRef]  

17. M. Zhang, E. Gofas-Salas, B. T. Leonard, et al., “Strip-based digital image registration for distortion minimization and robust eye motion measurement from scanned ophthalmic imaging systems,” Biomed. Opt. Express 12(4), 2353–2372 (2021). [CrossRef]  

18. M. Chen, R. F. Cooper, G. K. Han, et al., “Multi-modal automatic montaging of adaptive optics retinal images,” Biomed. Opt. Express 7(12), 4899–4918 (2016). [CrossRef]  

19. H. Li, J. Lu, G. Shi, et al., “Automatic montage of retinal images in adaptive optics confocal scanning laser ophthalmoscope,” Opt. Eng. 51(5), 1 (2012). [CrossRef]  

20. M. Chen, R. F. Cooper, J. C. Gee, et al., “Automatic longitudinal montaging of adaptive optics retinal images using constellation matching,” Biomed. Opt. Express 10(12), 6476–6496 (2019). [CrossRef]  

21. P. Bedggood and A. Metha, “De-warping of images and improved eye tracking for the scanning laser ophthalmoscope,” PLoS One 12(4), e0174617 (2017). [CrossRef]  

22. P. Bedggood and A. Metha, “Towards distortion-free imaging of the eye,” PLoS One 16(6), e0252876 (2021). [CrossRef]  

23. A. Dubra and Z. Harvey, “Registration of 2D images from fast scanning ophthalmic instruments,” in International Workshop on Biomedical Image Registration, (Springer, 2010), 60–71.

24. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification and Scene Analysis (Wiley, 1973), Vol. 3.

25. J. R. Buck, M. M. Daniel, and A. C. Singer, Computer Explorations In Signals and Systems using MATLAB (Prentice-Hall, Inc., 1997).

26. P. Stoica and R. L. Moses, Spectral Analysis of Signals (Pearson Prentice Hall, 2005), Vol. 452.

27. X. Wang, S. R. Sadda, M. Ip, et al., “In vivo longitudinal measurement of cone photoreceptor density in intermediate age-related macular degeneration,” Am. J. Ophthalmol. 248, 60–75 (2023). [CrossRef]  

28. Y. Zhang, X. Wang, M. E. Clark, et al., “Imaging of age-related macular degeneration by adaptive optics scanning laser ophthalmoscopy in eyes with aged lenses or intraocular lenses,” Trans. Vis. Sci. Tech. 9(8), 41 (2020). [CrossRef]  

29. A. Meadway, C. A. Girkin, and Y. Zhang, “A dual-modal retinal imaging system with adaptive optics,” Opt. Express 21(24), 29792–29807 (2013). [CrossRef]  

30. Y. Yu, T. Zhang, A. Meadway, et al., “High-speed adaptive optics for imaging of the living human eye,” Opt. Express 23(18), 23035–23052 (2015). [CrossRef]  

31. Y. Yu and Y. Zhang, “Dual-thread parallel control strategy for ophthalmic adaptive optics,” Chin. Opt. Lett. 12(12), 121202 (2014). [CrossRef]  

32. B. S. Reddy and B. N. Chatterji, “An FFT-based technique for translation, rotation, and scale-invariant image registration,” IEEE Trans. on Image Process. 5(8), 1266–1271 (1996). [CrossRef]  

33. M. Azimipour, R. J. Zawadzki, I. Gorczynska, et al., “Intraframe motion correction for raster-scanned adaptive optics images using strip-based cross-correlation lag biases,” PLoS ONE 13(10), e0206052 (2018). [CrossRef]  

34. K. Grieve, E. Gofas-Salas, R. D. Ferguson, et al., “In vivo near-infrared autofluorescence imaging of retinal pigment epithelial cells with 757 nm excitation,” Biomed. Opt. Express 9(12), 5946–5961 (2018). [CrossRef]  

35. J. Lewis, Fast Normalized Cross-Correlation (Industrial Light & Magic, 2011).

36. J. P. Lewis, “Fast template matching,” in Vision Interface (Quebec City, QC, 1995), 15–19.

37. X. Hu and Q. Yang, “Real-time correction of image rotation with adaptive optics scanning light ophthalmoscopy,” J. Opt. Soc. Am. A 39(9), 1663–1672 (2022). [CrossRef]  

Supplementary Material (4)

Visualization 1: Automatic registration and montage of continuously recorded AOSLO foveal images from a normal healthy eye.
Visualization 2: Automatic registration and montage of continuously recorded AOSLO foveal images acquired in the eye of a patient with AMD.
Visualization 3: Automatic registration and montage of AOSLO images from a normal human eye. The montage spans 22.6° × 4.7°, including 2298 frames out of 3284 frames.
Visualization 4: Automatic registration and montage of AOSLO images from the eye of a patient with AMD, extending 21.3° × 2.6° from 660 frames out of 1151 frames.


