This paper describes an endoscopic-inspired imaging system employing a micro-electromechanical system (MEMS) micromirror scanner to achieve beam scanning for optical coherence tomography (OCT) imaging. Miniaturization of a scanning mirror using MEMS technology can allow a fully functional imaging probe to be contained in a package sufficiently small for utilization in a working channel of a standard gastroesophageal endoscope. This work employs advanced image processing techniques to enhance the images acquired using the MEMS scanner to correct non-idealities in mirror performance. The experimental results demonstrate the effectiveness of the proposed technique.
© 2014 Optical Society of America
Early diagnosis of cancers, or other diseases, is paramount to the outcome and success of patient treatment. While there are many diagnostic tools available to doctors, including biopsy and other invasive methodologies, one of the most valuable is in vivo imaging. Medical endoscopic imaging uses an endoscope to view or image tissue inside the human body. This technique has been heavily utilized in the medical field as a diagnostic tool and has proven useful to nearly all fields of medicine; perhaps the first and most common application is imaging the gastrointestinal tract.
Many current endoscopic imaging systems use a charge-coupled device (CCD) imaging probe. This type of device is capable of very reliable, high-definition digital tissue imaging; however, several advantages can be realized using a different optical modality, such as optical coherence tomography (OCT). OCT has evolved as a noninvasive method of obtaining cross-sectional microstructure information from biological tissue, and this optical technique boasts longitudinal and lateral spatial resolutions on the order of microns. The main advantage of OCT over other traditional systems is that it offers the possibility of noninvasive, in vivo visualization of sub-surface tissue with high-resolution, three-dimensional images.
Mating endoscopic applications with the miniature and multi-functional nature of micro-electromechanical systems (MEMS) technology yields a very powerful diagnostic tool. MEMS technology utilizes processes developed by the semiconductor industry to realize complex mechanical systems on a very small scale. Feature sizes in MEMS devices are commonly in the range of tens to hundreds of micrometers. The versatility of MEMS technology allows the miniaturization of very complex systems.
Uniting these technologies, this work targets the application of a MEMS micromirror to endoscopic imaging using optical coherence tomography. MEMS micromirror scanners have been used in a vast array of applications. MEMS-based OCT systems have been previously demonstrated in handheld devices and in endoscopic implementations including electrothermal actuation [7–9], torsional angled vertical-comb actuation, electrostatic actuation with a bonded mirror plate, magnetic actuation, and dynamic focus-tracking [13,14]. This work discloses the combination of an electrostatically actuated, double-gimbaled MEMS mirror using serpentine springs with both time-domain and spectral-domain OCT systems for tissue imaging, intended for eventual integration into endoscopic applications. The electrostatic gap-closing (parallel-plate) actuation mechanism achieves a small device footprint and low actuation power. A computer-vision-based image de-warping technique is incorporated with this image acquisition device, for the first time to the best of our knowledge, to improve the image quality.
2. Device description and fabrication
The MEMS micromirror device employs a biaxial gimbal structure allowing two-dimensional scanning of the mirror. With a die size of less than 2 mm square, the usable mirror area is circular with a diameter of ~800 µm. The fabrication of these devices is similar to that of previously published three-dimensional scanning mirrors, but here a silicon-on-insulator (SOI) wafer is used. SOI processing technology achieves a uniform-thickness device layer that would otherwise be very challenging with the prior published fabrication process; the use of SOI also streamlines the fabrication process, making it faster, cheaper, and more feasible for integration with medical technology. SOI also improves the mirror surface quality by eliminating release etch holes. The fabrication process, depicted in Fig. 1, begins with a 100 mm diameter SOI wafer. A silicon nitride layer is deposited on both sides of the wafer using plasma-enhanced chemical vapor deposition (PECVD). This layer is patterned, and potassium hydroxide (KOH) is used to etch square holes, from the backside, through the silicon handle wafer and down to the buried silicon oxide layer. Following the silicon etch, the silicon nitride mask is removed by etching in phosphoric acid at 160 °C.
A Cr adhesion layer and an Au electrode layer are sputter-deposited on the top surface of the mirror. The thicknesses of the Au and Cr layers (75 nm and ~5 nm, respectively) are optimized to provide nearly 100% reflectivity while remaining as thin as possible to reduce the stress in the film. After metal deposition, the shape of the mirror is defined and etched through the metal layers with commercially available wet chemical etchants for Cr/Au. The same pattern is etched through the silicon device layer of the SOI wafer using deep reactive ion etching. Hydrofluoric acid (HF) is used to chemically etch the silicon oxide layer and fully release the devices, producing the micromirror shown under an optical microscope in Fig. 2.
To actuate the micromirror, four electrodes are placed beneath the mirror. These electrodes are fabricated on raised silicon pillars to decrease the distance between the electrodes and the mirror, increasing the force on the mirror for a given applied voltage. The detailed fabrication process of the pillar electrodes has been described previously.
3. Micromirror scanning characteristics
The micromirror devices are tested by applying a voltage waveform to one or more electrodes while the others are grounded. The angular deflection of the mirror is calculated from the spatial displacement of a laser spot reflecting off the micromirror. The scanning performance of the mirror is tested both with static voltages, Fig. 3(a), and with AC voltage signals tuned to the mechanical resonant frequency of the device, Fig. 3(b). The observed resonant AC result required a DC bias of 60 V; the DC bias ensured that the voltage applied to the mirror was always greater than zero and had the added benefit of reducing the AC voltage necessary for resonant scanning.
Under static actuation, the angular deflection is very non-linear with respect to voltage. However, when the applied voltage is tuned to the device’s resonant frequency, the response becomes very linear.
In addition to the measurement of deflection angle versus applied voltage, it is important to characterize the frequency response of the device. This is achieved by applying an AC signal of constant amplitude to the mirror. While measuring the deflection of the mirror, the frequency of the applied signal is swept through a range of frequencies. When the frequency of the applied voltage waveform matches the mechanical resonant frequency of the device, the magnitude of the deflection is maximized. A curve showing the deflection of the micromirror versus the frequency of the applied voltage waveform is displayed in Fig. 4.
As seen in Fig. 4, the fabricated micromirror has a mechanical resonant frequency of 472 Hz for the inner axis and 399 Hz for the outer axis. These resonant frequencies allow for faster readjustment of the position of the mirror, resulting in faster image acquisition and, ultimately, a higher frame rate.
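The peak-picking step of such a frequency-sweep characterization can be sketched as follows. The Lorentzian-style response below is synthetic, generated only to illustrate the procedure; it is not the measured data of Fig. 4.

```python
import numpy as np

def find_resonance(freqs_hz, deflection):
    """Locate the resonant peak in a frequency-sweep measurement:
    the drive frequency at which the optical deflection is maximized."""
    freqs = np.asarray(freqs_hz, dtype=float)
    defl = np.asarray(deflection, dtype=float)
    return freqs[np.argmax(defl)]

# Synthetic sweep: a damped-oscillator amplitude response peaked near
# 472 Hz (the measured inner-axis resonance), for illustration only.
f = np.linspace(300, 600, 601)
f0, q = 472.0, 50.0
response = 1.0 / np.sqrt((1 - (f / f0) ** 2) ** 2 + (f / (f0 * q)) ** 2)
print(find_resonance(f, response))  # near 472 Hz
```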
4. Imaging results
The fabricated micromirror devices enabled successful imaging using both time-domain and spectral-domain OCT systems. Due to the non-linear nature of the parallel-plate actuated devices driven at non-resonant frequencies, calibration of the driving waveforms is necessary for linear spatial scanning in two-dimensional imaging.
4.1 Calibration for device actuation
Calibration is completed by mapping the spatial displacement of the beam for a given array of input voltages. From the applied voltages and the corresponding spatial positions of the focused laser spot, the actuation waveform can be calculated using cubic interpolation. This also allows the voltages necessary for high-spatial-density images to be generated from much sparser calibration data. Figure 5 shows the original measured spatial pattern of the scanned laser beam, consisting of 100 by 100 pixels, before algorithmic calibration, followed by the densely linearized scan pattern. The dwell time of each pixel was around 0.01 seconds.
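This voltage-to-position inversion can be sketched in a few lines; the calibration pairs below are illustrative placeholders (not measured device data), and `scipy` is assumed to be available for the cubic interpolation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical sparse calibration: beam position (mm) measured at a
# handful of applied voltages (V); the non-linear shape mimics the
# static response of Fig. 3(a) but the numbers are invented.
volts = np.array([0.0, 20.0, 40.0, 60.0, 80.0, 100.0])
pos_mm = np.array([0.0, 0.1, 0.45, 1.0, 1.7, 2.5])

# Invert the map with a cubic spline (position -> voltage), then sample
# the voltages that produce a dense, spatially uniform scan pattern.
pos_to_volt = CubicSpline(pos_mm, volts)
uniform_positions = np.linspace(0.0, 2.5, 100)
drive_waveform = pos_to_volt(uniform_positions)
```

The spline passes exactly through the calibration points, so the end voltages of the dense waveform match the sparse measurement.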
After calibration, spatial imaging is completed using a time-domain OCT system. For testing, a structure composed of checkerboard patterns of various sizes was fabricated. The spatial images generated from the time-domain OCT system and the test structure are displayed in Figs. 6 and 7, where Fig. 6 shows the distorted pattern before calibration.
These images (~2 mm by ~2.5 mm) demonstrate the capability of the imaging system both to acquire large-field-of-view images and to resolve features as small as 50 µm. With an additional focusing lens, resolution as fine as 5 µm can be achieved, as described previously, at the expense of a reduced field of view. The slight skewing visible towards the center of both images is caused by a non-linearity introduced into the system by the digital-to-analog converter and rectifier system. This issue can be resolved through the use of analog output channels.
4.2 Optical coherence tomography (OCT) imaging
OCT imaging can be accomplished in two different modalities: time-domain and spectral-domain. In time-domain OCT, the depth data is contained in the signal at different times during acquisition. The utilization of a broadband, low-coherence light source enhances the axial resolution for imaging underlying structures. To image through the depth of the sample, the reference arm is physically scanned over a distance corresponding to the image's cross-sectional depth. This process dictates the time-sequential nature of time-domain OCT. The imaging protocol must also conduct this reference arm scanning for each individual lateral pixel along the surface of the sample to produce a three-dimensional image [3,16]. The time-domain OCT results reported in this work utilized a rapid-scanning optical delay (RSOD) line in the reference arm. The RSOD allows much faster scanning for real-time imaging compared to linearly translating the reference mirror. The incorporated RSOD is based on a grating-lens optical phase delay line, with an additional length of single-mode fiber (SMF) in the sample arm for dispersion management. Figure 8(a) shows the time-domain OCT layout utilized in this study.
Alternatively, in spectral-domain OCT, the depth data is conveyed by the spectral interference between different wavelengths of light. The reference and sample arm set-up is much like that of time-domain OCT; however, the optical path length of the reference arm is fixed. Additionally, instead of the detector and demodulation scheme, the interference is detected by a spectrometer consisting of a diffraction grating and a linear detector array that measures the intensity at each wavelength across a particular spectral band. The spectral sampling resolution of the detector or camera dictates the ranging distance, sometimes referred to as the imaging depth, of the OCT system. For each lateral point on the sample, the system instantaneously captures the spectral interferogram and processes the data through Fourier transformation to reveal the depth information from the sample. The spectral-domain system, shown in Fig. 8(b), does not need to sequentially scan through the sample depth as is required in the time-domain system [16,18].
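The Fourier-transform step can be illustrated with a minimal one-reflector simulation. The wavenumber grid, reflector depth, and fringe model below are assumptions made for illustration; practical details such as spectrometer calibration and k-linearization are omitted.

```python
import numpy as np

# A single reflector at optical path difference z produces a spectral
# fringe cos(2*k*z) over wavenumber k; a Fourier transform over k
# localizes the reflector in depth (the A-scan).
n = 2048
k = np.linspace(4.60e6, 4.95e6, n)   # wavenumber grid (1/m), illustrative
z = 300e-6                            # reflector depth: 300 um
interferogram = 1.0 + 0.5 * np.cos(2 * k * z)

# Remove the DC term, transform, and convert FFT frequency to depth.
# Since cos(2*k*z) = cos(2*pi*(z/pi)*k), depth = pi * (frequency in k).
a_scan = np.abs(np.fft.rfft(interferogram - interferogram.mean()))
dk = k[1] - k[0]
depth_axis = np.fft.rfftfreq(n, d=dk) * np.pi
peak_depth = depth_axis[np.argmax(a_scan)]   # close to 300e-6 m
```

The depth resolution of this toy reconstruction is set by the swept wavenumber range, here roughly 9 µm per bin.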
To demonstrate the MEMS mirror's capability across both the time-domain and spectral-domain OCT platforms, cross-sectional images of various samples have been generated. Shown in Fig. 9 is a two-dimensional cross-sectional image of several microscope cover glass slides (~600 μm by ~1 mm), stacked in a stair-step structure, obtained through time-domain OCT. The image, comprising 5 A-lines, was acquired at a rate of 2 mHz, and the OCT system used a center wavelength of 1325 nm with a bandwidth greater than 100 nm. For this experiment, the micromirror was not operated at resonance due to the limits of the data acquisition hardware; therefore, a low scan speed was used to accommodate the actuation non-linearity. With faster sampling and data acquisition to minimize the non-linear effect from the elastic mechanical torque of the tethers, the mirror can potentially be driven at resonance, approaching 500 Hz.
The staggered appearance of the boundaries between the glass slides is caused by the dissimilar refractive indices of glass and air. The higher refractive index of the glass increases the optical path length to a given boundary compared to the same geometric path in air, which the image acquisition algorithm interprets as a greater depth.
OCT imaging of tissue was completed using a spectral-domain OCT system with a center wavelength of 1310 nm, a bandwidth of 83 nm, and a 92 kHz InGaAs linescan camera. The original acquired image contained 1000 A-lines in each B-frame, obtained by scanning the mirror at 16 Hz (for the same reasons stated above), yielding an acquisition rate of 16,000 A-lines per second. Figure 10 shows the resulting OCT image; due to slightly non-linear spatial movement resulting from non-ideal actuation, it shows indications of motion artifacts. To refine the image and minimize the distortion, we apply a scanning-speed correction technique proposed by Liu et al. for freehand OCT systems, which utilizes the cross-correlation between adjacent A-scans to estimate their lateral displacement. This estimate enables interpolation to uniformly re-distribute the A-scans along the lateral axis. It is important to note that lateral scaling is skewed by this technique, so a horizontal scale bar is no longer meaningful. The processing time for one frame of 1000 A-scans was approximately 0.2 seconds when implemented in MATLAB®. With a further optimized processing scheme, Liu et al. achieved a rate of over 62,000 A-scans per second using a graphics processing unit (GPU) and parallel processing, demonstrating the compatibility of this algorithm with real-time imaging. Figure 11 displays the motion-corrected, cross-sectional view of a mouse ear.
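The resampling idea can be sketched with a simplified stand-in: here the decorrelation between adjacent A-scans is used as a proxy for their lateral spacing, and the frame is interpolated back onto a uniform lateral grid. The actual method of Liu et al. estimates displacement from the cross-correlation more directly; this sketch only illustrates the non-uniform-to-uniform resampling step.

```python
import numpy as np

def resample_bscan(bscan):
    """Redistribute A-scans (columns of a depth x N B-frame) onto a
    uniform lateral grid, using 1 - correlation between neighbouring
    A-scans as a crude proxy for their lateral spacing."""
    depth, n = bscan.shape
    spacing = np.empty(n - 1)
    for i in range(n - 1):
        r = np.corrcoef(bscan[:, i], bscan[:, i + 1])[0, 1]
        spacing[i] = max(1.0 - r, 1e-6)   # keep strictly positive
    x = np.concatenate([[0.0], np.cumsum(spacing)])   # non-uniform grid
    x_uniform = np.linspace(0.0, x[-1], n)
    out = np.empty_like(bscan)
    for row in range(depth):   # linear interpolation along each depth row
        out[row] = np.interp(x_uniform, x, bscan[row])
    return out

# Synthetic B-frame with smoothly varying columns, for demonstration.
d = np.linspace(0.0, 3.0, 64)
lat = np.linspace(0.1, 5.0, 40)
frame = np.cos(np.outer(d, lat)) + 0.01 * np.sin(np.outer(d, np.sqrt(lat)))
corrected = resample_bscan(frame)
```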
Although the system still requires further optimization, the image in Fig. 11 of the mouse ear demonstrates successful imaging of tissue using the fabricated micromirrors. In the image, the surface of the ear is clearly visible along with the internal cartilage layer of the ear.
5. Image processing
Though the proposed MEMS micromirror can capture images successfully, spatially non-uniform distortions often occur during imaging; the distortion can differ from region to region. This distortion mainly arises from the non-linearity of the device capacitance as the mirror tilts. In addition, resonant scanning for the proposed system demands higher frequencies, which introduce further distortion and divergence from theory. For the time-domain OCT platform, sinusoidal resonant scanning requires a higher operating frequency not only for the MEMS mirror but also for the RSOD, posing additional challenges to image quality. As shown in Fig. 12, the captured chessboard image, Fig. 12(a), is heavily distorted in comparison with the ground truth, Fig. 12(b). To correct such distortion, researchers in computer vision and digital imaging have presented global de-warping methods, e.g., the radial distortion model and the tangential distortion model. In these conventional methods, pairs of feature points are extracted automatically to build correspondence between the distorted image and the ground-truth image, so that the radial and tangential distortion parameters can be computed and applied to the distorted image globally. However, these global methods cannot handle the spatially non-uniform distortion observed in our captured images. To tackle this problem, a local transformation based method is introduced to correct the distortion in the calibration process.
There are four key steps in our proposed approach: 1) establish feature-point correspondences between the distorted image and the ground-truth image; 2) parameterize the distorted image and the ground-truth image, respectively, with the feature points; 3) compute the local warping parameters from the corresponding sub-region around each feature point; 4) de-warp the distorted image by applying the local warping parameters.
5.1 Feature-point correspondence between the distorted image and the ground-truth image
Given a chess-board image, it is easy to locate the feature points by using an interest-point detector. In our approach, a Harris corner detector is applied to extract the feature points in the distorted image and the ground-truth image. To correct misplacements of feature points and to improve the precision of the registration, an interactive interface is developed to add, delete, and adjust the feature points as well as their indices in this calibration process. Corresponding feature points (marked in red in Fig. 13) in the distorted image and the ground-truth image are assigned the same index. The interactive tool is shown in Fig. 13. Due to distortion, blurring, and other artifacts, it is difficult to label the feature points precisely, which may lead to errors in the de-warping process. To reduce the risk of mislabeling, our system provides a zoom-in function to increase labeling precision.
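A minimal Harris response can be written in plain NumPy; the constant k = 0.04 and the window size below are conventional choices, not the parameters used in this work, and the checkerboard is synthetic.

```python
import numpy as np

def harris_response(img, k=0.04, win=2):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is
    the structure tensor of image gradients, box-filtered over a
    (2*win+1)^2 window (with periodic wrap for simplicity)."""
    img = img.astype(float)
    iy, ix = np.gradient(img)
    def box(a):
        out = np.zeros_like(a)
        for dy in range(-win, win + 1):
            for dx in range(-win, win + 1):
                out += np.roll(np.roll(a, dy, 0), dx, 1)
        return out
    sxx, syy, sxy = box(ix * ix), box(iy * iy), box(ix * iy)
    return (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2

# Synthetic 64x64 checkerboard with 16-pixel squares: the strongest
# responses fall on the block corners.
y, x = np.mgrid[0:64, 0:64]
board = (((y // 16) + (x // 16)) % 2).astype(float)
resp = harris_response(board)
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

In practice one would follow this with non-maximum suppression and a threshold to obtain a discrete feature-point set.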
5.2 Image parameterization
After building the feature-point correspondence, we carry out 2D Delaunay triangulation based on the feature points and the four corners of the distorted image. The resulting triangular mesh is copied to the ground-truth image. Based on the triangular mesh, each feature point forms its corresponding sub-region from its one-ring neighborhood, as shown in Fig. 14. Each sub-region in the distorted image thus has a counterpart in the ground-truth image. For each pixel in an image, we compute its parameterized coordinates as follows:
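Assuming the parameterization expresses each pixel by its barycentric coordinates inside the containing Delaunay triangle (a standard choice consistent with the description; the paper's exact equation is not reproduced here), the step can be sketched with `scipy.spatial.Delaunay`. The point coordinates are illustrative.

```python
import numpy as np
from scipy.spatial import Delaunay

# Feature points: four image corners plus one interior point (invented).
pts = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [4, 6]], dtype=float)
tri = Delaunay(pts)

def barycentric(p):
    """Vertex indices and barycentric weights of pixel p in its triangle."""
    p = np.asarray(p, dtype=float)
    s = int(tri.find_simplex(p))
    if s < 0:
        raise ValueError("point lies outside the triangulation")
    t = tri.transform[s]          # affine map to barycentric coordinates
    b = t[:2].dot(p - t[2])
    return tri.simplices[s], np.append(b, 1.0 - b.sum())

verts, w = barycentric([3.0, 3.0])
# The pixel is recovered exactly as the weighted sum of its triangle's
# vertices -- the property the later de-warping step relies on.
recon = (pts[verts] * w[:, None]).sum(axis=0)
```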
5.3 Sub-region based local warping parameters computation
For the purpose of computing the local warping parameters for the feature points, we assume the feature point and its one-ring neighborhood in the triangular mesh share the same local warping parameters which can be expressed as a decomposition of translation, rotation, and scaling matrices as follows:
It is observed that one feature point may belong to different sub-regions. In order to ensure that the warping transformations, when applied to the feature point, are consistent with respect to one another, we require that the shared feature point be transformed to the same point:
In order to transform the distorted image into the ground-truth image while maintaining these consistency requirements, a minimization framework is formulated as follows to transform the feature points in the distorted image to those in the ground-truth image and enforce the consistency constraint:
The matrix norm is the Frobenius norm, i.e., the square root of the sum of the squared matrix elements. Substituting Eq. (2) into Eq. (5) yields a formulation of the problem in terms of the transformation coefficients between the distorted image and the ground-truth image. Solving for these coefficients in a least-squares sense corresponds to solving a simple system of linear equations, whose solution can be quickly computed by means of the LU decomposition.
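The per-sub-region estimation can be illustrated with an ordinary least-squares similarity fit (scaling, rotation, translation). This is a simplified sketch of a single sub-region and omits the consistency coupling between neighbouring sub-regions described above; the feature points are invented for the demonstration.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst,
    linearized with a = s*cos(theta), b = s*sin(theta):
        x' = a*x - b*y + tx,   y' = b*x + a*y + ty
    Returns the 2x2 matrix M = [[a, -b], [b, a]] and translation t."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    n = len(src)
    A = np.zeros((2 * n, 4))
    A[0::2, 0], A[0::2, 1], A[0::2, 2] = src[:, 0], -src[:, 1], 1.0
    A[1::2, 0], A[1::2, 1], A[1::2, 3] = src[:, 1], src[:, 0], 1.0
    a, b, tx, ty = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)[0]
    return np.array([[a, -b], [b, a]]), np.array([tx, ty])

# Recover a known warp: rotation 30 deg, scale 1.2, shift (2, -1).
th, s = np.deg2rad(30.0), 1.2
R = s * np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.3, 0.7]], float)
dst = src @ R.T + np.array([2.0, -1.0])
M, t = fit_similarity(src, dst)
```

Because the system is linear in (a, b, tx, ty), each sub-region's fit reduces to a small least-squares solve, consistent with the linear-system structure noted above.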
5.4 Distorted image de-warping
Once the sub-region based local warping parameters are computed, we can de-warp other captured distorted images by applying the local warping parameters to each pixel x, weighted by its parameterization with respect to the feature points, to obtain the new pixel x'. The results are shown in Figs. 15 and 16.
For better evaluation, we also perform global de-warping on the distorted image, in which the global de-warping parameters are computed by a straightforward linear regression of the transformation between all feature-point pairs in the distorted image and the ground-truth image. The root-mean-square error (RMSE), defined in Eq. (7), is computed for the distorted image, the result of global de-warping, and the result of local de-warping. The experimental results show that the proposed local de-warping method outperforms the global de-warping method.
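Since Eq. (7) is not reproduced here, the metric below uses one common definition of feature-point RMSE (root of the mean squared Euclidean distance between corresponding point pairs); this assumed form may differ in detail from the paper's.

```python
import numpy as np

def rmse(points_a, points_b):
    """Root-mean-square Euclidean error between corresponding
    feature-point pairs (assumed form of Eq. (7))."""
    d = np.asarray(points_a, float) - np.asarray(points_b, float)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))

# Pair distances 0 and 5 -> sqrt((0 + 25) / 2)
err = rmse([[0, 0], [3, 4]], [[0, 0], [0, 0]])
```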
The image processing algorithm runs on a platform with an Intel Core i7 2.7 GHz CPU and 8 GB RAM and is implemented in C#. In the calibration procedure, the time consumed is dominated by feature-point localization, because human-computer interaction is needed to verify the locations. However, it is worth noting that the calibration needs to be run only once. Once the de-warping parameters are determined, the distortion removal operation can be performed in real time. In our implementation, the time cost of the de-warping operation in Fig. 15(c) is 0.045 seconds, while that of Fig. 16(c) is 0.051 seconds.
We also applied the image processing algorithm to the spectral-domain OCT image of the mouse ear (Fig. 10). Figure 17 shows the corrected image. The de-warping parameters obtained from the results shown in Fig. 15 are used for this process.
6. Conclusion
A two-dimensional, electrostatically-actuated MEMS mirror was successfully paired with both rapid-scanning-optical-delay time-domain and spectral-domain OCT systems to produce cross-sectional images. To correct the distortion of the acquired images, a computer vision based image de-warping technique is developed, in which the local warping parameters are computed and applied to the distorted images. The experimental results demonstrate the effectiveness of the proposed approach, and this work demonstrates the applicability of MEMS technology in tissue imaging and affirms its candidacy for endoscopic applications. However, to incorporate this technology in eventual clinical applications, there are several critical next steps. As a core focus of this paper, distortions from system non-linearity have posed a significant challenge to obtaining high-quality images, and further device optimization to reduce non-linearity would help better refine image clarity. Additionally, design and fabrication optimization should yield more robust devices with smaller footprints. This will enable integration with miniature endoscopic packaging that is adaptable and adjustable to accommodate various medical procedures. Another important component of medical imaging is speed; by increasing the resonant frequency of the designed mirrors through control of the tether structures, higher OCT frame rates can be achieved for real-time diagnostic capability.
This work was supported by the National Institutes of Health (Grant # R01 EB007636). Device fabrication was primarily conducted at the Washington Nanofabrication Facility at the University of Washington. Ethan Keeler is supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1256082.
References and links
1. D. Provenzale, “Screening and surveillance of gastrointestinal cancers,” in Gastrointestinal Cancers: A Companion to Sleisenger and Fordtran's Gastrointestinal and Liver Disease, A. Rustgi and J. M. Crawford, eds. (Saunders Publishing, 2003), pp. 193–204.
3. D. Huang, E. A. Swanson, C. P. Lin, J. S. Schuman, W. G. Stinson, W. Chang, M. R. Hee, T. Flotte, K. Gregory, C. A. Puliafito, and J. G. Fujimoto, “Optical coherence tomography,” Science 254(5035), 1178–1181 (1991). [CrossRef] [PubMed]
4. M. C. Wu, O. Solgaard, and J. E. Ford, “Optical MEMS for lightwave communication,” J. Lightwave Technol. 24(12), 4433–4454 (2006). [CrossRef]
5. S. T. S. Holmstrom, U. Baran, and H. Urey, “MEMS laser scanners: a review,” J. Microelectromech. Syst. 23(2), 259–275 (2014). [CrossRef]
6. C. D. Lu, M. F. Kraus, B. Potsaid, J. J. Liu, W. Choi, V. Jayaraman, A. E. Cable, J. Hornegger, J. S. Duker, and J. G. Fujimoto, “Handheld ultrahigh speed swept source optical coherence tomography instrument using a MEMS scanning mirror,” Biomed. Opt. Express 5(1), 293–311 (2014). [CrossRef] [PubMed]
7. J. Sun, S. Guo, L. Wu, L. Liu, S.-W. Choe, B. S. Sorg, and H. Xie, “3D In Vivo optical coherence tomography based on a low-voltage, large-scan-range 2D MEMS mirror,” Opt. Express 18(12), 12065–12075 (2010). [CrossRef] [PubMed]
8. T. Xie, H. Xie, G. K. Fedder, and Y. Pan, “Endoscopic optical coherence tomography with new MEMS mirror,” Electron. Lett. 39(21), 9–10 (2003). [CrossRef]
10. A. D. Aguirre, P. R. Hertz, Y. Chen, J. G. Fujimoto, W. Piyawattanametha, L. Fan, and M. C. Wu, “Two-axis MEMS scanning catheter for ultrahigh resolution three-dimensional and en face imaging,” Opt. Express 15(5), 2445–2453 (2007). [CrossRef] [PubMed]
11. W. Jung, D. T. McCormick, J. Zhang, L. Wang, N. C. Tien, and Z. Chen, “Three-dimensional endoscopic optical coherence tomography by use of a two-axis microelectromechanical scanning mirror,” Appl. Phys. Lett. 88(16), 163901 (2006). [CrossRef]
12. K. H. Kim, B. H. Park, G. N. Maguluri, T. W. Lee, F. J. Rogomentich, M. G. Bancu, B. E. Bouma, J. F. de Boer, and J. J. Bernstein, “Two-axis magnetically-driven MEMS scanning catheter for endoscopic high-speed optical coherence tomography,” Opt. Express 15(26), 18130–18140 (2007). [CrossRef] [PubMed]
13. M. Strathman, Y. Liu, X. Li, and L. Y. Lin, “Dynamic focus-tracking MEMS scanning micromirror with low actuation voltages for endoscopic imaging,” Opt. Express 21(20), 23934–23941 (2013). [CrossRef] [PubMed]
14. B. Qi, P. A. Himmer, M. L. Gordon, V. X. D. Yang, D. L. Dickensheets, and A. I. Vitkin, “Dynamic focus control in high-speed optical coherence tomography based on a microelectromechanical mirror,” Opt. Commun. 232(1–6), 123–128 (2004). [CrossRef]
15. P. B. Chu, I. Brener, C. Pu, S. Lee, S. Member, J. I. Dadap, S. Park, K. Bergman, N. H. Bonadeo, T. Chau, M. Chou, R. A. Doran, R. Gibson, R. Harel, J. J. Johnson, C. D. Lee, D. R. Peale, B. Tang, D. T. K. Tong, M. Tsai, Q. Wu, W. Zhong, E. L. Goldstein, L. Y. Lin, and J. A. Walker, “Design and nonlinear servo control of MEMS mirrors and their performance in a large port-count optical switch,” J. Microelectromech. Syst. 14(2), 261–273 (2005). [CrossRef]
17. Y. Chen and X. Li, “Dispersion management up to the third order for real-time optical coherence tomography involving a phase or frequency modulator,” Opt. Express 12(24), 5968–5978 (2004). [CrossRef] [PubMed]
18. J. F. de Boer, “Spectral/Fourier domain optical coherence tomography,” in Optical Coherence Tomography: Technology and Applications, W. Drexler and J. G. Fujimoto, eds. (Springer, 2008), pp. 147–175.
19. X. Liu, Y. Huang, and J. U. Kang, “Distortion-free freehand-scanning OCT implemented with real-time scanning speed variance correction,” Opt. Express 20(15), 16567–16583 (2012). [CrossRef]
20. V.-F. Duma, K.-S. Lee, P. Meemon, and J. P. Rolland, “Experimental investigations of the scanning functions of galvanometer-based scanners with applications in OCT,” Appl. Opt. 50(29), 5735–5749 (2011). [CrossRef] [PubMed]
21. G. Bradski and A. Kaehler, Learning OpenCV (O’Reilly, 2008).
22. J. G. Fryer and D. C. Brown, “Lens distortion for close-range photogrammetry,” Photogramm. Eng. Remote Sensing 52, 51–58 (1986).
23. C. Harris and M. Stephens, “A combined corner and edge detector,” in Proceedings of Fourth Alvey Vision Conference, (University of Sheffield, 1988), pp. 147–151.
24. P. Cignoni, C. Montani, and R. Scopigno, “DeWall: A fast divide and conquer Delaunay triangulation algorithm in Ed,” Comput. Aided Des. 30(5), 333–341 (1998). [CrossRef]
25. T. A. Davis, “Umfpack version 4.1 user guide,” Tech. Rep., University of Florida, TR-03–008 (2003).