In this study, we developed a single-channel stereoscopic video imaging modality based on a transparent rotating deflector (TRD). Sequential two-dimensional (2D) left and right images were obtained through the TRD synchronized with a camera, and the components of the imaging modality were controlled by a microcontroller unit. The imaging modality was characterized by evaluating the stereoscopic video image generation, rotation of the TRD, heat generation by the stepping motor, and image quality and its stability in terms of the structural similarity index. The degree of depth perception was estimated and subjective analysis was performed to evaluate the depth perception improvement. The results show that the single-channel stereoscopic video imaging modality may: 1) overcome some limitations of conventional stereoscopic video imaging modalities; 2) be a potential economical compact stereoscopic imaging modality if the system components can be miniaturized; 3) be easily integrated into current 2D optical imaging modalities to produce a stereoscopic image; and 4) be applied to various medical and industrial fields.
© 2015 Optical Society of America
1. Introduction
There are two general types of three-dimensional (3D) images: computer-generated images, which rely on the construction of a virtual 3D object, and stereoscopic images. In stereoscopy, the depth perception in an image, whether acquired by an optical system or perceived by the human eye, is generated from two slightly different two-dimensional (2D) planes. The human brain perceives the depth information from 2D images based on several factors such as the light pattern, relative size, overlapping, shade, color, movement, and stereopsis. Inter-ocular distance, which is defined as the distance between the human eyes or the distance between two cameras, provides stereoscopic vision, leading to a more accurate understanding of an object’s depth, especially at relatively short distances.
Stereoscopic images can be generated after recording, processing, and superimposing two or more images of the same object taken from different view orientations. Stereoscopic imaging methods have found extensive applications in medical science, education, and cinema. Various improvements have been suggested to enhance the detail, precision, and usability of stereoscopic images, as well as clarity of perception and compliance.
Despite the various advantages of conventional dual-channel stereoscopic imaging modalities (SIMs), there are some limitations, such as device size, difficulty in adjusting the convergence angle, and location relative to the object. In dual-channel SIMs, two cameras at different locations simultaneously obtain disparity between left and right images of objects. Optical setups with two or more cameras have typically been used for professional stereoscopic video imaging [8,9]; however, the cameras may differ in temporal synchronization, geometrical calibration, and color balance characteristics. If such differences are neglected, the final stereoscopic image may induce physiological side effects in viewers. Such problems reveal the necessity for further development of a SIM based on a single camera, also known as a planar catadioptric stereo (PCS) system [6,11,12]. The PCS system has several advantages over conventional dual-channel SIMs: system parameters such as brightness, gain, and offset are convenient to adjust, and the hardware for camera synchronization can be simplified because only a single camera is used.
In previous studies, an optical biprism [12,14] or mirror system [11,15] has been placed at the entrance of an optical channel to simultaneously record the left and right images with a definite image disparity. Although this approach leads to smaller devices and an easier setup, further improvement is still needed because only one half of the camera resolution is used effectively. A single-channel SIM based on a transparent rotating deflector (TRD) has also been suggested as an alternative to generate lateral image disparity for depth perception and size measurement. The TRD is an optical window that generates image disparity. In this case, the image disparity is a function of the refractive index and the rotation angle of the TRD.
Utilizing the advantages of our previous study, this study introduces a single-channel stereoscopic video imaging modality based on TRD (SSVIM-TRD). The SSVIM-TRD was characterized by evaluating the generation of a stereoscopic video image, rotation performance of the TRD, heat generated by rotating components, and image quality and its stability. The structural similarity (SSIM) index was used as a standard evaluation method to compare image quality. It is a full-reference metric for objective image quality assessment that measures the similarity between two images, that is, the quality of the secondary processed image relative to the primary reference image. Image disparity was estimated in terms of simplified human stereovision. A subjective analysis was also performed to evaluate improvement of human depth perception.
2. Materials and methods
2.1 Setup of stereoscopic imaging modality
A microcontroller unit (MCU; AtMega 128, Atmel® AVR®) was used to efficiently manage and synchronize the camera and motor. A complementary metal-oxide semiconductor (CMOS) camera (acA2000-50gm NIR, Basler, Germany) equipped with a fixed focal lens (67-715, F/1.417; Edmund Optics, USA) was used to obtain stereoscopic video images. An achromatic doublet lens with a 75 mm focal length was used as an objective lens. A hardware image acquisition triggering function was used to obtain the images at specific times using the CMOS camera, which was set to 38 fps.
The TRD was made from an optical glass, SF10, with a refractive index of 1.7, a size of 8 × 12 mm², and a length of 20 mm. It was placed on the stepping motor and fixed on an aluminum TRD mount with an anodized surface to prevent further oxidization. It was located inside a closed housing box to avoid unwanted ambient light [Fig. 1(a)]. The TRD was continuously rotated by a two-phase unipolar stepping motor (A4K-M245; Autonics, Korea) that was controlled by a stepping motor driver (MAI-2MT-ST V2.1; M.A.I, Korea) and the MCU, which was also used to manipulate the rotation direction and speed of the stepping motor. Figure 1(a) shows a photograph of the SSVIM-TRD. Figure 1(b) illustrates how the camera, TRD, and objective lens constituting the SSVIM-TRD can generate two virtual cameras with converging optical axes.
2.2 Performance evaluation of stereoscopic imaging modality
The operation timing of the SSVIM-TRD components and the rotation angle and direction of the TRD are dependent on the input clock signals generated by the MCU. After TRD rotation, there is a short time delay to prevent any possible image distortion due to hunting oscillation by the stepping motor. During clockwise (CW) and counterclockwise (CCW) rotation, the left and right images are acquired at a specific time using the hardware triggering function of the camera and are displayed on a monitor in real-time.
In a typical two-phase unipolar stepping motor, the direction and rotation speed of the stepping motor are altered depending on the excitation input signals. The signals generated by the MCU for the motor rotation were evaluated using an oscilloscope (WaveRunner 6050; LeCroy, USA) to investigate their accuracy and precision. The stepping motor was set to rotate 9° from the optical axis with a 14 ms interval followed by a 6 ms time delay for both CW and CCW rotation.
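The timing described above can be illustrated with a minimal Python sketch. The 14 ms rotation interval and 6 ms delay come from the text, and the 2 ms exposure from Section 3.1. The names `Step` and `build_schedule` are hypothetical, and the placement of the camera trigger (so that the exposure ends together with the settling delay) is an assumption, not the documented MCU firmware behavior.

```python
from dataclasses import dataclass

ROTATE_MS = 14.0    # rotation interval given in the text
SETTLE_MS = 6.0     # delay after rotation given in the text
EXPOSURE_MS = 2.0   # exposure time reported in Section 3.1


@dataclass
class Step:
    index: int
    direction: str          # "CW" or "CCW", alternating
    rotate_start_ms: float  # when the motor starts turning
    trigger_ms: float       # assumed trigger time (exposure ends with the delay)


def build_schedule(n_steps, rotate_ms=ROTATE_MS, settle_ms=SETTLE_MS,
                   exposure_ms=EXPOSURE_MS):
    """Return a list of alternating CW/CCW steps with assumed trigger times."""
    if exposure_ms > settle_ms:
        raise ValueError("exposure does not fit inside the settling delay")
    period = rotate_ms + settle_ms  # one rotation-plus-delay cycle
    steps = []
    for i in range(n_steps):
        start = i * period
        direction = "CW" if i % 2 == 0 else "CCW"
        # Assumption: expose at the very end of the settling window.
        trigger = start + period - exposure_ms
        steps.append(Step(i, direction, start, trigger))
    return steps


if __name__ == "__main__":
    for step in build_schedule(4):
        print(step)
```

Under these assumptions each rotation-plus-delay cycle lasts 20 ms, so one left/right pair would take 40 ms.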
Heat generation by the stepping motor may result in malfunction during long-term operation. A thermographic camera (CS620; Flir Systems, USA) was used for a visual inspection of the heat distribution in the SSVIM-TRD. An infrared digital thermometer (DT8380; Cheerman, China) was also used to measure the surface temperatures of the stepping motor and motor driver. To minimize heat generation in the SSVIM-TRD, a heat sink and motor fan were employed as a heat reduction system (HRS). An evaluation of the heat generation was performed both with and without the HRS.
2.3 Stereoscopic display and image storage
Both active shutter and polarization 3D display methods were used because they provide full-resolution, brighter stereoscopic images and enable multiuser observation of the stereoscopic images [19,20]. Commercial active shutter glasses (P701; GEFORCE 3D VISION; NVIDIA, China) and polarization glasses (FPG-200F; LG, Indonesia) were used to observe a stereoscopic video image. After real-time stereoscopic image processing, the Frame-Sequential layout was selected to display live stereoscopic video images, in keeping with the sequential left and right image acquisition; continuous left and right images were sequentially displayed on a 3D monitor (CM22WS, Samsung, or 27MD53D, LG) using Stereoscopic Player (2.3.0). In addition, the Side-By-Side display layout was selected, as it is the most common standard method used for recorded videos. For this purpose, continuous left and right images were stored using the NI Vision toolkit in LabVIEW, and then an .avi stereoscopic video output file containing side-by-side left and right images was produced using MATLAB via stereoscopic image processing.
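The Side-By-Side layout can be illustrated with a short Python sketch. It is not the LabVIEW/MATLAB processing used in this study. Images are represented as lists of pixel rows, the function names are hypothetical, and the assumption that the stored sequence starts with a left image and alternates strictly is made only for illustration.

```python
def pair_frames(frames):
    """Group an alternating L, R, L, R, ... sequence into (left, right) pairs.

    Assumption: the first frame is a left image; a trailing unpaired frame
    is dropped.
    """
    usable = len(frames) - len(frames) % 2
    return [(frames[i], frames[i + 1]) for i in range(0, usable, 2)]


def side_by_side(left, right):
    """Concatenate two equally sized images row by row (left | right)."""
    if len(left) != len(right):
        raise ValueError("left and right images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


if __name__ == "__main__":
    left = [[1, 2], [3, 4]]
    right = [[5, 6], [7, 8]]
    for l_img, r_img in pair_frames([left, right, left]):
        print(side_by_side(l_img, r_img))
```

Each output frame is twice as wide as a single view, which is the layout a side-by-side player expects.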
2.4 Evaluation of light distribution
During the rotation of the TRD, the left and right images are deflected from an optical axis. This may introduce image distortions, which can be partly evaluated by calculating the coefficient of variation (CV) of the images. The CV has been used to analyze the light distribution over an illuminated surface. It was assumed that images taken under the same conditions would have equal CVs; however, if any differences in the light distribution were observed, then the rotation of the TRD could affect the image brightness.
A white diffuse reflectance target (SRT-99-100; Labsphere, USA) was imaged for CW and CCW rotation of the TRD from 0° to 40° with a 10° interval. The TRD was assumed to have a perfect shape without internal reflections. Images were taken under room lighting conditions and the CV was evaluated using MATLAB by calculating the averages and standard deviations of image columns. Each measurement was repeated 11 times.
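A CV calculation of this kind can be sketched in Python as follows. The text does not specify exactly how the column averages and standard deviations were combined, so this sketch assumes the CV is the population standard deviation of the column means divided by their mean; the function names are hypothetical.

```python
from statistics import mean, pstdev


def column_means(image):
    """Average intensity of each column; image is a list of pixel rows."""
    return [mean(column) for column in zip(*image)]


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return pstdev(values) / m


def image_cv_percent(image):
    """Assumed definition: CV of the column means, in percent."""
    return 100.0 * coefficient_of_variation(column_means(image))


if __name__ == "__main__":
    uniform = [[200, 200, 200], [200, 200, 200]]
    shaded = [[180, 200, 220], [180, 200, 220]]
    print(image_cv_percent(uniform))
    print(image_cv_percent(shaded))
```

A perfectly uniform image of the white target gives a CV of zero, and any left-to-right shading introduced by the deflector raises it.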
2.5 Evaluation of image quality
The SSIM index was used to evaluate the image quality under TRD rotation. Consecutive images were obtained from an object under a constant lighting condition. Two test image groups (TIGs), the left and right image groups, were obtained, and the images in each group were compared in pairs. A control image group (CIG) was obtained when the TRD was aligned along the optical axis of the camera and objective lens. The experiment was done for 45 different color image pairs of the TIGs and CIG, and SSIM indices were evaluated in terms of homogeneity and normal distribution (Kolmogorov-Smirnov test, p-value > 0.05). Then, an independent-samples t-test was used to compare rotation and non-rotation conditions in terms of SSIM. This evaluation determines whether any noise, vibration, or brightness and color changes caused by TRD rotation are detectable while sequential images are displayed.
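For orientation, the SSIM formula of Ref. [18] can be sketched in Python in a simplified, single-window form. The published index is computed over local windows and then averaged, so this global version is only an illustration and will not reproduce the values reported here. The constants K1 = 0.01 and K2 = 0.03 are the defaults given in Ref. [18]; the function name and the flat-list image representation are assumptions.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    x and y are flat lists of pixel intensities. This is a simplification:
    the standard index averages the same expression over local windows.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [10, 20, 30, 40]
    print(global_ssim(reference, reference))
    print(global_ssim(reference, [40, 30, 20, 10]))
```

Identical images give a value of 1, and any change in mean brightness, contrast, or structure lowers it.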
In addition, absolute values of pixel-by-pixel subtraction were calculated to achieve detailed information about image differences. The pixel subtractions were applied to each color channel and then averaged over each pixel. Results were presented in a distribution map. This method can evaluate the amounts and locations of all slight displacements and the changes in color, brightness, and contrast of images.
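The pixel-by-pixel subtraction described above can be sketched in Python as follows. Images are represented as rows of per-pixel channel tuples, and the function name is hypothetical; the MATLAB code used in this study is not reproduced here.

```python
def abs_difference_map(image_a, image_b):
    """Mean absolute per-channel difference for every pixel.

    Each image is a list of rows; each pixel is a tuple of channel values
    such as (R, G, B). The result is a 2D map of the same height and width.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same height")
    result = []
    for row_a, row_b in zip(image_a, image_b):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        out_row = []
        for pixel_a, pixel_b in zip(row_a, row_b):
            diffs = [abs(ca - cb) for ca, cb in zip(pixel_a, pixel_b)]
            out_row.append(sum(diffs) / len(diffs))  # average over channels
        result.append(out_row)
    return result


if __name__ == "__main__":
    a = [[(10, 20, 30), (0, 0, 0)]]
    b = [[(13, 20, 24), (0, 0, 0)]]
    print(abs_difference_map(a, b))
```

Non-zero entries in the map mark where two consecutive images differ, and their magnitude indicates how strongly.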
2.6 Depth perception simulation of human stereovision
To simulate the depth perception of human vision, the image disparity generated by the SSVIM-TRD was calculated and compared to human stereovision. Displacement of the entrance area of the TRD, D_TRD, is equal to:
where L_TRD is the length and α is the rotation angle of the TRD [Fig. 2 (top)]. The angular difference between the left and right imaging directions, θ, was assumed equal to that of human stereovision. To simplify the conditions, the objective lens was assumed to be ideal and its thickness was neglected. Thus, the simulated distance between eye and object based on the image disparity generated by the SSVIM-TRD can be calculated from Eq. (2), utilizing D_TRD, the focal length of the objective lens, and similarity of triangles [Fig. 2 (bottom)]:
where F is the focal length of the objective lens, D_human is the interocular distance of a human, and h_human is the distance between the eyes and object. Considering D_human ≈ 65 mm, h_human is computable for a given F.
2.7 Subjective analysis of depth perception
Subjective analysis was performed to evaluate the depth perception capability of observers in the SSVIM-TRD. Imaging targets were prepared using four cylindrical bars with different sizes and were placed on a flat plate at similar distances from the center. Then, the SSVIM-TRD was placed at a fixed working distance from the top of the imaging target [Fig. 3(a)]. Observers could only see the cross section and a very small part of the bodies of the bars [Fig. 3(b)]; therefore, it was assumed that they could not easily predict the height of the bars. Images were captured with TRD rotation for a stereoscopic image and without TRD rotation for a 2D image. Regardless of the capturing mode, after stereoscopic image processing, all images were displayed using the active 3D method. Stereoscopic image processing had no effect on the 2D images. The experiment was repeated 11 times with different bars in various situations. Real and sham stereoscopic images were shown randomly, and 15 blinded observers were asked to identify the real stereoscopic images and sort the bars in terms of height for each image.
Regarding the type of experiment design and imaging conditions, statistical improvement in correct answers related to real stereoscopic images may indicate depth perception capability. A paired-samples t-test was conducted to compare the correct scores of each group to see whether there was a significant difference.
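The paired-samples t statistic can be sketched in Python using only the standard library. The scores in the example are placeholder values invented for illustration and are not data from this study; the function name is hypothetical, and the p-value (which requires the t distribution) is left to a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences
    and sd is the sample standard deviation.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


if __name__ == "__main__":
    # Placeholder scores for illustration only (NOT the study's data).
    scores_2d = [0.1, 0.2, 0.0, 0.3, 0.1]
    scores_3d = [0.5, 0.4, 0.6, 0.9, 0.2]
    t, dof = paired_t_statistic(scores_2d, scores_3d)
    print(f"t = {t:.3f}, degrees of freedom = {dof}")
```

A negative t here means the first group scored lower on average than the second.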
3. Results
3.1 Performance evaluation of stereoscopic imaging modality
Figures 4(a) and (b) show the input signals for CW and CCW rotation of the TRD, respectively. The typical electrical signal of 5 V was generated by the MCU, and as programmed, a motor rotation and time delay series was executed in 20 ms.
An achromatic doublet lens with a 75 mm focal length was used as an objective lens, providing a final optical magnification of around 3x. The field of view was cropped to 15 × 12 mm² and the depth of field was about 4 mm. The exposure time was set to 2.000 ms.
Thermographic images of the SSVIM-TRD, recorded before and 20 min after operation, confirmed that heat was generated by the stepping motor and motor driver over time. Although a slight temperature increase of the CMOS camera was also observed, this temperature change was small compared to that of the stepping motor. Therefore, it was neglected, assuming that it would not cause malfunction in the SSVIM-TRD.
Temperatures of the stepping motor and motor driver followed a similar pattern in the temperature variation with and without the HRS: an exponential increase followed by steady-state convergence to a constant temperature. However, the maximum temperature with the HRS was much lower than that without the HRS; after 150 min of operation, the temperatures of the stepping motor and motor driver were 82 °C and 44 °C without the HRS, respectively, and approximately 42 °C and 26 °C with the HRS, respectively.
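The described pattern, an exponential rise that converges to a constant temperature, resembles a first-order thermal response. The Python sketch below assumes such a model for illustration only; no model was fitted in this study. The 82 °C steady-state value is the motor temperature without the HRS reported above, while the 25 °C starting temperature and the 30 min time constant are hypothetical placeholders.

```python
import math


def first_order_temperature(t_min, t_start_c, t_steady_c, tau_min):
    """Temperature after t_min minutes for an assumed first-order response.

    T(t) = T_steady - (T_steady - T_start) * exp(-t / tau)
    """
    if tau_min <= 0:
        raise ValueError("time constant must be positive")
    return t_steady_c - (t_steady_c - t_start_c) * math.exp(-t_min / tau_min)


if __name__ == "__main__":
    # 82 C is from the text; 25 C and tau = 30 min are placeholders.
    for minutes in (0, 30, 60, 150):
        temp = first_order_temperature(minutes, 25.0, 82.0, 30.0)
        print(f"{minutes:3d} min: {temp:.1f} C")
```

With a real time constant obtained from the measured curve, the same expression would indicate how long the system takes to approach its steady-state temperature.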
3.2 Light distribution
As illustrated in Fig. 5, the average CV was 3.3% and was independent of the rotation angle of the TRD. The slight difference at the rotation angles of 20° and 25° may be due to experimental error.
3.3 Evaluation of image quality
The sequential image pairs of TIGs [Figs. 6(a) and (b)] and CIG [Fig. 6(c)] were captured and analyzed. Figures 6(d)-6(f) show the distribution map of the absolute value of pixel differences for an example image pair. The SSIM indices for sequential image pairs of the CIG and TIGs were calculated and compared, as shown in Fig. 6(g). The results show no significant difference in image quality, in terms of stability of SSIM, during the rotation of the TRD.
3.4 Depth perception simulation of human stereovision
According to the results of the distance calculation, the current setup of the SSVIM-TRD, with L_TRD ≈ 20 mm, α ≈ 16°, and F = 75 mm, generates an angular difference between left and right images of θ ≈ 4.2°, which is similar to human vision at a distance of around 87 cm from the eyes.
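The human-vision side of this comparison can be checked with elementary geometry. The Python sketch below is not Eq. (1) or Eq. (2) of this paper; it only uses the standard relation between a baseline, a viewing distance, and the angle subtended between the two lines of sight, with the 65 mm interocular distance taken from Section 2.6. The function names are hypothetical.

```python
import math


def vergence_angle_deg(baseline_mm, distance_mm):
    """Angle between two lines of sight converging on a point straight ahead."""
    return math.degrees(2.0 * math.atan(baseline_mm / (2.0 * distance_mm)))


def distance_for_angle_mm(baseline_mm, angle_deg):
    """Viewing distance at which the given baseline subtends angle_deg."""
    return baseline_mm / (2.0 * math.tan(math.radians(angle_deg) / 2.0))


if __name__ == "__main__":
    interocular_mm = 65.0  # value used in Section 2.6
    print(f"angle at 870 mm: {vergence_angle_deg(interocular_mm, 870.0):.2f} deg")
    print(f"distance for 4.2 deg: {distance_for_angle_mm(interocular_mm, 4.2):.0f} mm")
```

Both directions of the calculation land close to the reported pairing of about 4.2° with a viewing distance of roughly 87 cm, allowing for rounding of the stated values.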
3.5 Subjective analysis of depth perception
Scores of each group were normally distributed (Kolmogorov-Smirnov test, p-value > 0.05). A paired-samples t-test was therefore used to compare the correct scores of respondents for 2D and stereoscopic images. There was a significant difference between correct scores of the 2D images group (M = 0.153, SD = 0.104) and the stereoscopic images group (M = 0.537, SD = 0.303); t = −4.410, p-value = 0.0017.
4. Discussion
When compared to conventional dual-channel SIMs, the SSVIM-TRD may have several advantages: reduction in the size and cost of the system components, adjustable depth perception, and agreement of image characteristics such as colors and brightness. Similar previous studies [5,16] and our previous studies [17,22] had some limitations, such as fixed parallel image disparity, limited adjustable display frequency, wire connection of active shutter glasses, size of the TRD, and motor oscillation. Most of these can be addressed using the SSVIM-TRD.
The SSVIM-TRD benefits from the use of a single objective lens that leads to a converging optical axes stereoscope. As a result, it can adjust the relative position of stereoscopic image and display screen by controlling the convergence and focus. Most stereoscopic displaying systems have side effects that can be minimized by ensuring the image disparity and relative position of the observed image are compatible.
A large image disparity between the left and right views resembles that experienced when observing objects that are very close to one’s eyes; moderate fatigue and nausea are therefore inevitable in long-term observation of a stereoscopic image. In the SSVIM-TRD, the image disparity can be adjusted by regulating the rotation angle of the TRD and the lens system for different working distances and magnifications [Fig. 3]. Therefore, in addition to image disparity sufficient to provide perfect stereovision, this feature can reduce the associated side effects. However, there are some limitations on the increase of image disparity in a SSVIM-TRD due to the practical limitations on the increase of the rotation angle and decrease of the working distance. Compared to the recent dual-aperture single-camera stereoscopic imaging system, the SSVIM-TRD may have a bigger lens system due to the TRD and motor, and a slightly lower frame rate; however, the adjustable image disparity is its main advantage. It should be noted that there is no spatial conflict in size and position of the two virtual apertures in the SSVIM-TRD, as well as less difficulty in blocking the light path and preserving color quality.
Regardless of the display methods, the location of the stereoscopic image as perceived by the brain is dependent on the convergence point of the imaging modality. When a parallel imaging mode is adopted, the stereoscopic image is observed in front of the display screen. However, when the left and right images are acquired by focusing at a certain point by converging two optical axes, as in the current imaging modality, the stereoscopic image may be partially or fully behind the display screen. It has been shown that the most comfortable zone for image formation is behind the screen, and this is present only in converging optical axes designs.
The other important advantage of a SSVIM-TRD is that the focusing area of the camera is always exactly at the convergence point of the optical axes because of the use of a single optical channel and camera. Hence, for different working distances, there is no need to readjust the imaging modality to put the final image in the comfortable zone. In fact, the output of such an approach would be placed in the best possible situation in the comfortable zone; however, users can ignore this feature and change it depending on the application and lens combination.
The objective lens of the SSVIM-TRD was selected from among achromatic doublets with a 50.8 mm diameter to minimize all possible aberrations. However, because of the oblique TRD direction and the incomplete symmetry of the incoming light pathway through the objective lens, a few distortions and aberrations may influence the image quality. Distortions and aberrations may be removed by either replacing the optics or image processing. Rectification has been performed previously in SIMs, including PCS systems, and may be applied to improve the image quality.
The heat generation by both the motor and motor driver resulted in a temperature rise less than the maximum temperature given by the product specifications: the insulation class type-B motor has a maximum operation temperature of 130 °C, and the SLA7062M driver has an operation temperature range of −20 °C to +85 °C. However, a more efficient HRS may be required for operation periods longer than the 150 min period used in this study.
The light distribution for the rotation angles of the TRD was quasi-consistent up to 40° for both CW and CCW rotations. The slight variations in the CV might be caused by internal reflection and surface reflectance of the TRD during the experiment. Despite the minor difference, it may be expected that the CV will remain consistent at larger rotation angles. In addition to the light distribution analysis, morphological image analysis and mathematical measurements may be beneficial to allow any possible distortions and aberrations to be eliminated, resulting in better image quality.
The hunting oscillation of the stepping motor may introduce image distortion in an unfocused image during image acquisition. However, this study found that there was no significant change in image quality in terms of stability of SSIM between consecutive TIGs and the CIG. It must be noted that the SSIM index is a full reference metric approach for image quality assessment and it cannot be expected to achieve absolute similarity, even in sequential images, because of possible electronic noise in the acquisition device or small mechanical movements of the device. However, the 0.965 SSIM for consecutive images indicates high similarity in both TIGs and CIG. Nonetheless, at a higher rotation speed or bigger rotation angle, replacing the stepping motor with a DC or servomotor may eliminate the possible hunting oscillation, lead to more accurate rotation, and enhance the motion profile.
As the most important goals of all SIMs, depth perception enhancement and stereoscopic vision were investigated in a subjective analysis, and the improvements were greater than expected. Although this study allowed 60 s for each test, the observations indicated that most respondents were able to answer the test much faster. Although stereoscopic image processing was applied to all images and all of them were shown in 3D mode, respondents were able to identify almost 91% of real and sham stereoscopic situations. Further studies are needed to define depth perception improvement by stereoscopic systems in various situations. It may be beneficial to design a more reliable and comprehensive experiment to compare different stereoscopic imaging methods and devices.
Some of the contributing factors can strongly affect the others. For example, it is obvious that a greater length of deflector can generate greater image disparity with a constant rotating angle. However, the length of the deflector is inversely proportional to the effective entrance area of the deflector, which in turn is directly proportional to image brightness and inversely proportional to image disparity. On the other hand, with a constant frequency, as the rotation angle increases, the image disparity and the required torque of the motor will increase and the effective entrance area of the deflector will decrease. Therefore, the following contributing factors in the SSVIM-TRD should be considered: the length, effective area, refractive index, and rotation angle of the TRD; rotor inertia and torque; image disparity and brightness; vibration noises; and motor size. In addition, the optimal location for stereoscopic image formation, frame rate sufficient for image display, and compatibility of the image disparity with human eyes should be considered to enhance stereoscopic image quality.
In addition to the limitations on adjusting the contributing factors, it should be noted that the SSVIM-TRD requires a faster camera, with twice the camera duty cycle of dual-camera stereo imaging systems. Consequently, the exposure time can be constrained by the rotation time; additional light sources and faster motors may therefore sometimes be needed.
5. Conclusion
In conclusion, the SSVIM-TRD, which is based on a single optical channel and detector, has been developed to obtain real-time live stereoscopic video images. Optimization of the aforementioned contributing factors, the size of the camera sensors, and various lens combinations provides an opportunity for more studies on the topic and development of commercial products with various applications. The SSVIM-TRD may be applied to a variety of fields from entertainment to medicine. Furthermore, it has the potential to become an economical compact SIM if the system components can be miniaturized.
References and links
1. B. Mendiburu, 3D Movie Making: Stereoscopic Digital Cinema from Script to Screen (Focal Press, 2009), pp. 2, 21, 25–28, 73–77.
2. B. G. Blundell and A. J. Schwarz, Volumetric Three-Dimensional Display Systems (John Wiley & Sons, 2000), pp. 12–16.
4. S. T. Barnard and M. A. Fischler, “Computational stereo,” Comput. Surv. 14(4), 553–572 (1982).
5. W. Choi, V. Rubtsov, and C. J. Kim, “Miniature flipping disk device for size measurement of objects through endoscope,” J. Microelectromech. Syst. 21(4), 926–933 (2012).
9. P. Merkle, K. Müller, and T. Wiegand, “3D video: Acquisition, coding, and display,” IEEE Trans. Consum. Electron. 56(2), 946–950 (2010).
10. F. L. Kooi and A. Toet, “Visual comfort of binocular and 3D displays,” Displays 25(2-3), 99–108 (2004).
11. W. B. Ng and Y. Zhang, “Stereoscopic imaging and computer vision of impinging fires by a single camera with a stereo adapter,” Int. J. Imaging Syst. Technol. 15(2), 114–122 (2005).
12. D. Lee and I. Kweon, “A novel stereo camera system by a biprism,” IEEE Trans. Robot. Autom. 16(5), 528–541 (2000).
13. H. H. P. Wu, “Rectification of stereoscopic video for planar catadioptric stereo systems,” IEEE Trans. Circ. Syst. Vid. 17(6), 686–698 (2007).
14. K. B. Lim and Y. Xiao, “Virtual stereovision system: new understanding on single-lens stereovision using a biprism,” J. Electron. Imaging 14(4), 043020 (2005).
15. S. A. Nene and S. K. Nayar, “Stereo with mirrors,” in Proceedings of the IEEE International Conference on Computer Vision, (Institute of Electrical and Electronics Engineers Inc., 1998), 1087–1094.
16. G. Chunyu and N. Ahuja, “A refractive camera for acquiring stereo and super-resolution images,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, (IEEE Computer Society, 2006), 2316–2323.
18. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process. 13(4), 600–612 (2004).
19. J. Hong, Y. Kim, H. J. Choi, J. Hahn, J. H. Park, H. Kim, S. W. Min, N. Chen, and B. Lee, “Three-dimensional display technologies of recent interest: principles, status, and issues [Invited],” Appl. Opt. 50(34), H87–H115 (2011).
20. I. Sexton and P. Surman, “Stereoscopic and autostereoscopic display systems,” IEEE Signal Process. Mag. 16(3), 85–99 (1999).
21. Z. Qin, K. Wang, F. Chen, X. Luo, and S. Liu, “Analysis of condition for uniform lighting generated by array of light emitting diodes with large view angle,” Opt. Express 18(16), 17460–17476 (2010).
22. W. H. Jang, H. Kang, T. Son, J. Park, E. Jun, and B. Jung, “A new concept of stereoscopic imaging system using single optical channel and a deflector: Pilot study,” in Proceedings of SPIE, Progress in Biomedical Optics and Imaging, 89491P (2014).
23. S. Y. Bae, R. J. Korniski, J. M. Choi, M. Shearn, P. Bahrami, H. Manohara, and H. K. Shahinian, “Development of a miniature single lens dual-aperture stereo imaging system towards stereo endoscopic imaging application,” Opt. Eng. 51(10), 103202 (2012).
24. Z. Wang and A. C. Bovik, “A universal image quality index,” IEEE Signal Process. Lett. 9(3), 81–84 (2002).