We investigated how the perception of three-dimensional information reconstructed numerically from digital holograms of real-world objects, and presented on conventional displays, depends on motion and stereoscopic presentation. Perceived depth in an adjustable random pattern stereogram was matched to the depth in hologram reconstructions. The objects in holograms were a microscopic biological cell and a macroscopic metal coil. For control, we used real physical objects in addition to hologram reconstructions of real objects. Stereoscopic presentation increased perceived depth substantially in comparison to non-stereoscopic presentation. When stereoscopic cues were weak or absent, e.g. because of blur, motion increased perceived depth considerably. However, when stereoscopic cues were strong, the effect of motion was small. In conclusion, to maximize the perceived three-dimensional information of holograms on conventional displays, it seems highly beneficial to combine motion and stereoscopic presentation.
© 2011 OSA
Digital holography is a technology for imaging three-dimensional (3D) objects. A digital hologram, recorded with a digital camera, results from the interference between a reference wave front and an object wave front reflected from or transmitted through an object. When the propagation directions of the reference and object waves are collinear (in-line geometry), several holograms can be combined to approximate the full Fresnel field (amplitude and phase) in the camera plane and therefore reconstruct the complex field in any object plane. However, when an angle is introduced (off-axis geometry), a single hologram can be sufficient to reconstruct the digital complex wave [2,3]. We refer to both off-axis single captures and full Fresnel fields (multiple captures combined through phase-shifting interferometry, for example) as digital holograms.
Digitally recorded holograms can be reconstructed numerically or, alternatively, optoelectronically. In principle, optoelectronic reconstruction allows one to view the depicted object or scene from different angles and accommodate (focus) to different depths in the reconstruction. Similarly, by using numerical reconstruction it is possible to compute different angular views of the hologram at different depths. At the time of writing, optoelectronic holographic display technology is in general not mature enough to permit affordable high-quality displays. Therefore, until this technology reaches that point, displaying numerical reconstructions on conventional stereo or non-stereo displays seems to be a useful alternative. There is a precedent for the preparation of optically captured 3D data for display on conventional stereo displays, from the field of confocal microscopy. It should be noted, however, that unlike holograms and real physical 3D scenes, conventional stereoscopic displays only have focus cues that specify the depth of the display surface but not other depth planes in the scene. Therefore, the natural correspondence between convergence of the eyes and focus is mostly disrupted, which may cause perceptual distortions, double images, and even eyestrain [6,7].
Any single numerical hologram intensity reconstruction can only contain monocular depth information, such as texture gradients, linear perspective, and shading. The use of multiple numerical reconstructions of different perspectives and reconstruction depths, however, allows one to extract much more of the 3D information from holograms. Since conventional stereo displays are far from being truly 3D, it is of interest to investigate the general question of how the information from multiple reconstructions should be presented on these displays to optimise 3D perception. For example, one way of combining hologram reconstructions is to generate two reconstructions that form a stereo pair so that one of them is near-focused and one far-focused. Because of binocular fusion, this approach gives a single perception of increased depth of focus combined with stereoscopic depth. In general, knowledge of the relationship between information presentation and perception will allow the design of better ways of mediating the depth information of hologram reconstructions on conventional displays.
1.1 Depth perception is affected by a number of factors
Monocular depth cues are properties of images that do not require the use of two eyes for the perception of depth. In ordinary natural views there are many such cues, e.g. linear perspective, texture gradients, shading, accommodation, atmospheric effects, and occlusion. Stereoscopic cues are based on the fact that a 3D object or view produces slightly different images in the two eyes. In these images, the local differences of corresponding points, i.e. disparities, code 3D shape and depth. Angular disparities increase as the depth differences in a view increase. Motion parallax is also an effective depth cue. It refers to the relative motion of images of object points at different distances when the observer or the object moves.
When one views complex natural objects or scenes, usually a multitude of depth cues are simultaneously available and can contribute to the perception of depth and 3D shape. The interaction of cues can occur in many possible ways [9,10] and result in more accurate perception of depth.
In this study, we consider the use of stimuli that combine stereoscopic presentation and motion from a set of digital hologram reconstructions. Therefore, the interaction of binocular disparity and motion is of particular interest. The findings of Nawrot and Blake suggested that information about stereopsis (binocular disparity) and information about structure from motion (motion parallax) are integrated in the brain. Tittle and Braunstein showed that stereoscopic and motion parallax cues co-operate facilitatively in supra-threshold judgement of depth. Bradshaw and Rogers found that there is considerable sub-threshold summation of motion parallax and binocular disparity. According to their study, the combination of both kinds of depth cues decreased thresholds by nearly a factor of 2.
Thus, in the basic research literature it is well established that motion parallax and binocular disparity can summate both at threshold level as well as in supra-threshold perception of depth. However, in the basic research literature, nearly all studies of cue interaction seem to have used synthetic – mainly computer generated – objects. In this study, we investigated the potential of applying these results to the perception of reconstructions from holograms of complex real-world objects.
The specific purpose of this study was to investigate the question of how the perception of depth in hologram reconstructions depends on the availability of stereoscopic and motion depth cues. We used non-stereoscopic and stereoscopic images with and without motion. The paper shows that using multiple numerical reconstructions presented as combinations of motion and stereo pairs indeed can significantly enhance the mediation of holographic depth information to the viewer. In addition, the effect of spatial frequency content of images on perceived depth was explored.
2.1 Stereoscopic depth matching tool
A random pattern stereoscopic software tool was developed for the estimation of perceived depth in objects and images (see Fig. 1). The principle of the random dot stereogram was utilized for producing stereoscopic depth. The tool consisted of a random pattern noise field with 256 grey levels and a size of 450 × 512 pixels. The random field had two rectangular areas, each of 450 × 128 pixels. The stereoscopic depth of the rectangular areas could be adjusted by using graphical sliders. The depth was produced by shifting the position of the right eye image of the rectangular area either to the right or to the left. A rightward shift (positive disparity) produced depth behind the display surface and a leftward shift (negative disparity) produced depth in front of the display surface. By means of two graphical sliders the observer adjusted the stereoscopic depth of the upper and lower rectangles so that they corresponded to the perceived depth of certain points (typically the near edge and far edge) in the images displayed beside the tool.
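The principle behind such a tool can be illustrated with a minimal sketch. It is not the software used in the study: the function name `make_stereo_pair`, the rectangle parameterisation, and the choice to refill the vacated strip with fresh noise are all assumptions made for illustration.

```python
import random


def make_stereo_pair(width, height, rect, shift_px, seed=0):
    """Build a random-pattern stereo pair as two lists of pixel rows.

    rect is (x0, x1, y0, y1), a hypothetical rectangle inside the field.
    shift_px > 0 moves the rectangle rightward in the right-eye image
    (depth behind the display); shift_px < 0 moves it leftward (in front).
    """
    rng = random.Random(seed)
    # Left-eye image: uniform random noise with 256 grey levels.
    left = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    x0, x1, y0, y1 = rect
    for y in range(y0, y1):
        # Refill the rectangle's original footprint with fresh noise so that
        # no monocular outline of the rectangle is left behind.
        for x in range(x0, x1):
            right[y][x] = rng.randrange(256)
        # Paste the rectangle's pattern at its horizontally shifted position.
        for x in range(x0, x1):
            xs = x + shift_px
            if 0 <= xs < width:
                right[y][xs] = left[y][x]
    return left, right


left, right = make_stereo_pair(64, 32, (16, 48, 8, 24), shift_px=3)
```

Neither image alone reveals the rectangle; it becomes visible only through the disparity between the two images when they are fused binocularly.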
2.2 Computing stereoscopic depth
The disparity (η) between the left and right eye images on the display surface is defined here as the difference in the horizontal positions of corresponding points in the images projected to the right (R′) and left eye (L′): η = R′ – L′. Thus, if the object is located behind the screen, η is positive (Fig. 2(A)), and if the object is located in front of the screen, η is negative (Fig. 2(B)). From similar triangles L′R′P and LRP in Fig. 2(A) we have

$$\frac{\eta}{\alpha} = \frac{\Delta d}{D + \Delta d}. \qquad (1)$$

Equation (1) also holds for Fig. 2(B), where the object plane is in front of the screen. In Eq. (1), η = R′ – L′ is the disparity on the display surface, α is the distance between the eyes measured individually for each observer (6 - 7 cm), D is the viewing distance, i.e. the distance between the observer and the display, and Δd is the depth relative to the display surface. Depth Δd can be solved from this equation as

$$\Delta d = \frac{\eta D}{\alpha - \eta}. \qquad (2)$$

Eq. (2) also applies to negative depths where the object ‘protrudes’ out of the display towards the observer.
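As a worked illustration, the similar-triangle relation η/α = Δd/(D + Δd) and its solution for Δd can be sketched in a few lines. The function names and the example interocular distance of 6.5 cm are assumptions chosen for illustration (the text gives only the range 6 - 7 cm).

```python
def depth_from_disparity(eta_cm, alpha_cm, viewing_distance_cm):
    """Depth relative to the display surface (cm) for screen disparity eta.

    Positive values lie behind the display, negative values in front of it.
    """
    if eta_cm >= alpha_cm:
        # Disparity equal to or larger than the eye separation would place
        # the point at or beyond infinity.
        raise ValueError("disparity must be smaller than the eye separation")
    return eta_cm * viewing_distance_cm / (alpha_cm - eta_cm)


def disparity_from_depth(depth_cm, alpha_cm, viewing_distance_cm):
    """Inverse relation: screen disparity (cm) that produces a given depth."""
    return alpha_cm * depth_cm / (viewing_distance_cm + depth_cm)


# Example with an assumed eye separation of 6.5 cm at the 90 cm viewing distance.
eta_rear = disparity_from_depth(8.0, 6.5, 90.0)    # object 8 cm behind the display
eta_front = disparity_from_depth(-5.3, 6.5, 90.0)  # object 5.3 cm in front
```

The same pair of functions covers both signs of depth, which mirrors the statement that the relation holds in front of and behind the display surface.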
2.3 Physical 3D objects
To validate the perceived depth measurements we used real (physical) objects placed on the surface of the display or at a short distance behind the display surface plane on the left side of the display (see Fig. 3(A)). The object on the display surface was a matchbox that was hanging by two threads so that its longest side was pointing directly out of the surface towards the observer. The side length was 5.3 cm. The object placed on the left side of the screen 8 cm behind the display surface plane was the back of a book for four observers and a rim of a table lamp shade for one observer.
2.4 Hologram sequences
One sequence of holograms was macroscopic (Fig. 3(B)) and one microscopic (Fig. 3(C)). In the macroscopic hologram sequence, the object was a small metal coil. The coil object was captured at different angles by rotating the object out-of-plane about the vertical axis from left to right in steps of 3 deg. The video was trimmed to 23 frames, shown at a rate of 15 frames per second, and looped back and forth continuously during the perceptual task. The amplitude of the rotational movement was 73 deg. The size of the reconstruction was 512 × 512 pixels, which corresponds to 13.8 × 13.8 cm² and 8.8 × 8.8 deg² of visual angle at a viewing distance of 90 cm.
The microscopic object was a K562 leukaemia cell captured by a transmission digital holographic microscope (DHM T1000® from Lyncée Tec). The cell, surrounded by a physiological medium containing other smaller pieces of matter, was maintained in a dielectric field cage and was rotated using dielectrophoretic motion. The recorded hologram, corresponding to the interference between a reference wave and the optical wave passing through the cell, was processed to reconstruct the cell depth map for each orientation [2,3]. The video was trimmed to 8 frames, shown at a rate of 15 frames per second, and looped back and forth continuously during the perceptual task. The amplitude of the rotational movement was 39 deg. A longer sequence was not used because in that case the adjacent matter would have overlapped the leukaemia cell. The size of the reconstruction was 539 × 412 pixels, corresponding to 14.5 × 11.1 cm² and 9.2 × 7.1 deg² at a viewing distance of 90 cm.
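The conversion from pixels to physical size and visual angle can be checked with a short sketch. It assumes the rounded pixel size of 0.027 cm reported for the display and the 90 cm viewing distance; the function names are illustrative. Because the pixel size is rounded, the computed sizes may differ from the reported ones by about 0.1 cm.

```python
import math

PIXEL_SIZE_CM = 0.027        # rounded pixel size of the display
VIEWING_DISTANCE_CM = 90.0   # viewing distance used in the experiments


def size_cm(pixels, pixel_size_cm=PIXEL_SIZE_CM):
    """Physical extent on the display of a run of pixels."""
    return pixels * pixel_size_cm


def visual_angle_deg(extent_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle subtended by an extent centred on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_cm / (2.0 * distance_cm)))


for label, pixels in [("coil", 512), ("cell width", 539), ("cell height", 412)]:
    extent = size_cm(pixels)
    print(f"{label}: {extent:.2f} cm, {visual_angle_deg(extent):.2f} deg")
```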
2.5 Stereoscopic video from non-stereo image sequence
Stereoscopic videos were prepared for a stereoscopic screen. The original videos depicted objects (coil and cell) that were rotating with constant speed about the vertical axis producing motion in depth. The vertical axes of rotation were approximately in the centre of the objects.
It was possible to use successive frames of reconstructed holograms as a stereo pair. The left and right eye images were taken pairwise from the sequence so that in motion from left to right the left image of a pair was presented to the right eye and the right image to the left eye. The right-hand image of the previous pair always became the left-hand image of the next pair.
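The pairing scheme can be sketched as follows. This is an illustration only: the helper names are hypothetical, frames are represented by their indices, and the handling of the back-and-forth loop (replaying the same pairs in reverse order) is an assumption rather than something stated in the text.

```python
def stereo_pairs(frames):
    """Overlapping pairs of successive frames: (f0, f1), (f1, f2), ...

    The second member of each pair is the first member of the next pair.
    Which member goes to which eye depends on the direction of rotation;
    that mapping is left to the caller.
    """
    return [(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def ping_pong(items):
    """One back-and-forth cycle of items without repeating the end points."""
    return items + items[-2:0:-1]


frames = list(range(23))          # e.g. the 23-frame coil sequence
pairs = stereo_pairs(frames)      # 22 stereo pairs
playback = ping_pong(pairs)       # assumed looping order for one cycle
```

A sequence of N frames therefore yields N - 1 stereo frames, so the stereoscopic video is one frame shorter than the original sequence.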
2.6 Stereoscopic display
The stereoscopic display (Hyundai W240 S) presented the left and right eye images as interlaced with respect to odd and even pixel rows. Odd and even pixel rows had opposite polarisation. The observer wore glasses where the left and right sides also had opposite circular polarisation. Thus, every other pixel row was seen by the left eye and the interleaving rows were seen by the right eye. The advantage of circular polarisation over linear polarisation was that such a system is insensitive to possible small tilt of the observer’s head. The size of the display was 24 inches diagonally. The resolution was 1920 × 1200 pixels. Because of interlacing the resolution at each eye was 1920 × 600 pixels. The pixel size was 0.027 cm.
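Row interlacing of this kind can be sketched as below. The function name is hypothetical, and which eye is assigned to the even rows is an assumption, since the text does not state the mapping for this display.

```python
def interlace_rows(left_img, right_img, left_on_even=True):
    """Combine two equally sized images (lists of rows) row by row.

    Rows are taken alternately from the left-eye and right-eye images, so
    each eye receives half of the vertical resolution.
    """
    if len(left_img) != len(right_img):
        raise ValueError("images must have the same number of rows")
    out = []
    for y, (left_row, right_row) in enumerate(zip(left_img, right_img)):
        use_left = (y % 2 == 0) == left_on_even
        out.append(list(left_row if use_left else right_row))
    return out


left = [["L"] * 4 for _ in range(6)]
right = [["R"] * 4 for _ in range(6)]
frame = interlace_rows(left, right)
```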
2.7 Luminance, gamma, and contrast
The stimuli were presented approximately in the middle of the display in the vertical direction. The display was at a distance of 90 cm from the observer. The photometrically (Minolta LS 110) estimated gamma was 1.83 in the middle of the display. When the observer wore polarising glasses the maximum luminance of the display was about 30 cd/m². The range of luminances in the coil stimulus was 0.22 - 10 cd/m². Thus, the Michelson contrast [cM = (Lmax – Lmin)/(Lmax + Lmin)] was equal to 0.96. The range of luminances and Michelson contrast for the cell object were 0.22 - 15.6 cd/m² and 0.97, respectively.
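The contrast values follow directly from the reported luminance ranges, as the short check below illustrates (the function name is illustrative).

```python
def michelson_contrast(l_max, l_min):
    """Michelson contrast (Lmax - Lmin) / (Lmax + Lmin)."""
    return (l_max - l_min) / (l_max + l_min)


coil = michelson_contrast(10.0, 0.22)   # coil stimulus, luminances in cd/m^2
cell = michelson_contrast(15.6, 0.22)   # cell stimulus, luminances in cd/m^2
print(f"coil: {coil:.2f}, cell: {cell:.2f}")  # prints "coil: 0.96, cell: 0.97"
```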
For all real physical objects (Fig. 3(A)) and hologram reconstructions of real objects (Figs. 3(B) and 3(C)) the observers estimated perceived depth by using the depth-matching tool, in which the ‘method of adjustment’ was utilized. The observers were required to adjust the stereoscopic depth in the random pattern so that the depth was subjectively equal to their perception of the depth in the test objects. The hologram reconstructions were presented in four different conditions: non-stereo (‘mono’) static, stereo static, non-stereo with motion, and stereo with motion. The purpose of using real physical objects in addition to reconstructions from holograms of real objects was to show that the stereoscopic depth-matching tool produces roughly correct results with little variability, that is, that the tool is valid and reliable (see results in Section 3).
For the matchbox object the task was to adjust stereoscopic depth to be perceptually equal to the depth of the side closest to the observer, which was 5.3 cm in front of the display surface. With the book object and lamp object the task was to produce depth behind the display surface (8 cm).
For the coil object, the task was to adjust stereoscopic depths equivalent to the perceived depth of the near and far edge of the coil when the coil was pointing towards the observer.
For the K562 leukaemia cell object, the task was to adjust stereoscopic depths equivalent to the perceived depths of the near and far surfaces of the cell. The cell was a phase-modulating object and in phase contrast reconstructions appeared partially transparent. Therefore, the positions of the small approximately geostationary satellite matter on each side in Fig. 3(C) depended on the orientation of the cell. Also, because of partial transparency, the structures behind the front surface of the cell could be seen when it was moving, since the structures at the near and far parts of the cell were moving in opposite directions. To some extent, this effect resembles the kinetic depth effect [11,16,17].
The mean of six matches was taken as the estimate of the perceived depth, corresponding to the perception of purely stereoscopic depth.
Altogether five observers participated in the experiments. Four observers were familiar with digital holography and 3D imaging. Three of them had limited or no experience of visual psychophysical experiments. One observer ('RN') had extensive experience of psychophysical experiments. One of the observers ('MH') was completely naïve as to the purpose of the study and did not have any knowledge of holography or experience of visual psychophysical experiments.
The validity of the stereoscopic depth matching tool was evaluated by using the physical 3D objects mentioned previously: the front object (a matchbox) was placed on the surface of the display, and the book object (a lamp object in the case of one observer) was placed behind the display plane on the left side of the display. The combined results of all five observers are shown at the bottom of Fig. 4. The mean of the matched distances averaged across observers was 8.74 cm (standard error = 0.27) and 5.61 cm (standard error = 0.79) for the rear object (true depth = 8 cm) and the front object (true depth = 5.3 cm), respectively. Thus, the estimates produced with the stereoscopic depth-matching tool were reasonably close to the real values. According to Johnston, depth perception of a cylinder pattern is nearly veridical at a distance close to one metre, while at viewing distances longer or shorter than that, depth is underestimated or overestimated, respectively. The viewing distance used in our study was 90 cm, at which, in agreement with Johnston’s findings, nearly veridical depth perception was observed.
Using the stereoscopic depth-matching tool the observers also produced matches for the holographic coil and cell objects. These results are also shown in Fig. 4. When the coil was presented as non-stereo and static, the matched stereoscopic depth was only 1.73 cm while the true depth of the depicted object was 9.9 cm. Thus, the matched stereoscopic depth was hugely underestimated. Adding motion to the non-stereo presentation of the same object increased the matched stereoscopic depth to 3.7 cm, which was still grossly underestimated. The difference between the depth estimates with and without motion was statistically significant (Mann-Whitney test: U = 396.5; p = 0.0204). Stereoscopic presentation of the coil object as a static image or with back and forth rotational motion produced somewhat overestimated matched stereoscopic depths on average. These were 12.4 cm for the static stereoscopic presentation and 11.3 cm for stereoscopic presentation with motion. The latter was closer to the real depth value of 9.9 cm. The difference between these results was quite small but statistically significant (U = 172.5; p = 0.0034).
For the cell object presented as non-stereo and with rotational back and forth motion the matched stereoscopic depth was 1.78 cm and when presented as stereo and with motion it was 4.26 cm. The true depth on the display was about 5.8 cm. The difference between static and motion presentation was statistically highly significant (U = 622; p<0.0001). Note that without motion depth estimates for the cell object, with or without stereoscopic presentation, could not be produced. Thus, the interaction of motion and stereo was particularly substantial in this case.
The cell object appears to be relatively blurred in comparison to the coil object. This can be seen in the amplitude spectra of the objects shown in Fig. 5. The spectrum of the cell object declines much more rapidly than the spectrum of the coil object. Thus, we hypothesize that the lack of high spatial frequencies could make the computation of stereo correspondence more difficult and, therefore, reduce stereoscopic depth information. The reduced stereoscopic information could have led to the inability to see depth in the static stereoscopic cell object. This might have been compensated for by motion parallax, which could explain the additive effect of motion on perceived depth in the presence of stereo. In order to test this hypothesis we performed an additional experiment where we measured perceived stereoscopic depth for the coil object blurred by various amounts. The blurring was produced by Fourier-domain filtering with a circularly symmetric Butterworth low-pass filter of various cut-off frequencies (see Fig. 6).
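For orientation, the textbook form of a circularly symmetric Butterworth low-pass transfer function, H(r) = 1 / (1 + (r/r0)^(2n)), can be sketched as follows. The filter order n and the frequency units used in the study are not given in the text, so the default order of 2 and the function names are assumptions.

```python
import math


def butterworth_lowpass(radius, cutoff, order=2):
    """Gain of a Butterworth low-pass filter at a radial frequency.

    The gain is 1 at zero frequency and falls to 0.5 at the cut-off.
    """
    return 1.0 / (1.0 + (radius / cutoff) ** (2 * order))


def transfer_grid(size, cutoff, order=2):
    """Square grid of gains with zero frequency at the centre element.

    Multiplying a centred 2D Fourier spectrum by this grid, element by
    element, and transforming back gives the blurred image.
    """
    centre = size // 2
    return [
        [butterworth_lowpass(math.hypot(u - centre, v - centre), cutoff, order)
         for u in range(size)]
        for v in range(size)
    ]


grid = transfer_grid(size=16, cutoff=4.0)
```

Lower cut-off frequencies remove more of the fine detail and hence produce stronger blur.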
The results for the low-pass filtered static stereo coil object and stereo coil object with motion are shown in Fig. 7. As Fig. 7 shows, considerable blurring had a clear effect on the matched stereoscopic depth for the static stereoscopic image. The effect became gradually less prominent as the cut-off frequency of the filter increased. However, when the stereoscopic presentation was combined with motion the effect of blur was considerably smaller. This result supports the hypothesis that the absence of high spatial frequencies may explain the additivity of stereo and motion in perceived depth for the cell object.
To examine more closely the role of spatial frequencies in the perception of depth in natural-looking 3D objects, we studied the perceived depth in coil images that were band-pass filtered to different spatial frequency bands. In this case we used a log-Gaussian band-pass filter in the Fourier domain (see Fig. 8).
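One common parameterisation of a log-Gaussian band-pass filter, a Gaussian on a log2 (octave) frequency axis, is sketched below. The exact form and bandwidth used in the study are not recoverable from the text, so the parameterisation, the function names, and the default width are assumptions.

```python
import math


def log_gaussian_bandpass(freq, centre_freq, sigma_octaves=1.0):
    """Gain of a filter that is Gaussian on a log2 frequency axis.

    The gain is 1 at the centre frequency and symmetric in octaves around it.
    """
    if freq <= 0.0:
        return 0.0  # no gain at zero frequency (the mean level is removed)
    octaves_from_centre = math.log2(freq / centre_freq)
    return math.exp(-(octaves_from_centre ** 2) / (2.0 * sigma_octaves ** 2))


def full_width_half_height_octaves(sigma_octaves):
    """Bandwidth in octaves between the two half-height frequencies."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma_octaves


width = full_width_half_height_octaves(1.0)
```

Expressing the width in octaves makes it directly comparable with bandwidth statements given in octaves at half height.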
The results are shown in Fig. 9. As Fig. 9 shows, the bandwidth of depth perception is quite wide, and even wider when the object is moving. For a wide range of spatial frequencies motion does not have any substantial additive effect in comparison to static stereo presentation. Motion increases depth perception mainly at the extreme spatial frequency ends where the amount of image information is low.
The results showed that in non-stereo presentation of hologram reconstructions motion could increase perceived depth in comparison to static image presentation. Further, stereoscopic presentation increased perceived depth considerably in comparison to non-stereoscopic presentation if stereoscopic cues were strong. Considerable image blur weakened depth perception so that perceived depth became small. Thus, very low spatial frequencies, at least relative to object size, do not seem to mediate stereo information efficiently. The results obtained by using band-pass filtered images showed that both at very low and very high spatial frequencies perceived depth was reduced. Motion seemed to alleviate this effect, however.
4.1 Spatial frequency effects
According to the present results, the spatial frequency bandwidth of supra-threshold perception of depth at half height was roughly four octaves. At both ends of the spatial frequency range the image features became weak and of low contrast. This reduced the visibility of monocular cues and probably also made the neural computation of binocular correspondence in the brain difficult, thereby reducing disparity information. Thus, there was less information available to tell the observer that the object was extended in depth, and at the lowest and highest spatial frequencies the object looked essentially flatter. Previous research on spatial scale and human stereovision suggests that stereo information is mediated by multiple spatial-frequency selective mechanisms. According to the present results, supra-threshold depth perception operates over a range of at least four octaves of spatial frequency for a natural-like object. When stereoscopic presentation is combined with motion the bandwidth of depth perception is further increased.
4.2 Summation of binocular disparity and motion cues
In the basic research literature, nearly all studies of cue interaction seem to have used synthetic computer-generated objects. Our study with natural-like complex objects confirms the existence of the interaction of stereo and motion. It also suggests that motion has a particularly strong relative additive effect on perceived depth (supra-threshold perception) when stereoscopic cues are weak or ambiguous, as in the case of blur or when only high spatial frequencies are present, but that when the stereoscopic cues are strong the additive effect is absent or small.
4.3 Underestimation of depth in non-stereo images
Depth was considerably underestimated in static non-stereo images. This may partly be due to the lack of depth cues, i.e. stereo and motion parallax. On the other hand, since the image was viewed using two eyes, there is explicit stereoscopic information that tells the observer that the image is actually flat. This effect may be accentuated when compared with stereoscopic depth, since stereoscopic cues produce a very vivid experience of depth.
4.4 Which setup of hologram reconstruction is most suitable for combining motion and stereo?
In this study we produced each stereoscopic presentation by using reconstructions from successive frames of a hologram sequence of a rotating scene, as detailed previously. This method is only suitable for hologram sequences of scenes whose objects change position monotonically, e.g. rotating scenes. An alternative to this setup would be to compute reconstructions from a single hologram corresponding to the views seen by the left and right eye, and therefore not require rotation of the object. Reconstructions of different areas of a hologram correspond to different views. Thus, the reconstruction of an area from the left half of a hologram could be used as the left eye image and the reconstruction of an area from the right half of the hologram could be used as the right eye image. This alternative setup would be suitable for hologram setups where there is sufficient propagation distance between the object (or image of the object) and sensor, such as a regular digital holography setup for macroscopic scenes and a Gabor holography setup for microscopic objects. As an extension to this alternative setup, the reconstruction could be done optoelectronically for naked eye viewing. If the reconstructed field covered both eyes of the observer, the captured scene would be perceived stereoscopically.
4.5 Possible effects of speckle noise in reconstructions
Just as with other types of noise, speckle noise reduces the visibility and recognizability of object information to human observers. Speckle noise deteriorates the perception of information at relatively high spatial frequencies in particular. In perception, the computation of stereo correspondence in the brain requires the detection of corresponding features in the left and right eye images. The detection and matching of corresponding features is clearly limited by speckle noise if its power spectral density is high enough in comparison to the internal neural noise in the visual system. Therefore, depth perception should be enhanced following the reduction of speckle noise to a level where the noise is no longer visible. In addition, if the left and right eye images have no hologram pixels in common (i.e. if no hologram pixel contributes to both left and right eye images), then the speckle patterns of the left and right eye images will be statistically independent. Therefore, the binocular summation of the left and right eye images will reduce perceived noise. Further, if the scene changes sufficiently between successive frames (micrometer movement or distortion would be sufficient for macroscopic diffusely reflective scenes) then the speckle between successive frames will be statistically independent. Assuming the frame rate of the video is high enough (e.g. 25 Hz or more), a further reduction of speckle will occur because of temporal low-pass filtering in the eye.
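The statistical benefit of combining independent speckle patterns can be illustrated with a small simulation. It is a sketch under simplifying assumptions: intensities are drawn from a negative exponential distribution (the usual model for fully developed speckle), and combination is modelled as a plain average, which is only a crude stand-in for binocular summation or temporal filtering in the visual system. The function names are illustrative.

```python
import random
import statistics


def speckle_contrast(intensities):
    """Speckle contrast: standard deviation divided by mean intensity."""
    return statistics.pstdev(intensities) / statistics.mean(intensities)


def averaged_speckle(n_patterns, n_pixels, seed=0):
    """Per-pixel average of n_patterns independent simulated speckle patterns."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n_patterns)) / n_patterns
        for _ in range(n_pixels)
    ]


single = speckle_contrast(averaged_speckle(1, 20000))
pair = speckle_contrast(averaged_speckle(2, 20000))
```

Under these assumptions the contrast of the average of N independent patterns falls roughly as 1/sqrt(N), so two independent patterns give a contrast near 0.71 compared with about 1 for a single pattern.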
4.6 Practical implications for viewing of holographic reconstructions of microscopic objects
The present study clearly demonstrates that the viewing method can have a great effect on depth perception of numerical hologram reconstructions presented on conventional displays. If depth cues are weak in holographic reconstructions, it will probably be highly beneficial for the observer to be able to use both motion and stereo presentation in order to maximise 3D information. This conclusion is particularly relevant in cases where the objects viewed are novel to the observer, as was the case for the microscopic cell object of this study, and, therefore, the interpretation of image features and monocular depth cues can be difficult. By extension, this would also be an important consideration when choosing effective and impressive content for current optoelectronic hologram reconstruction devices that have technical limitations due to unavailability of sufficiently high-resolution display components (compared to the resolution of conventional holographic plates).
The authors would like to thank Dr. Emmanouil Darakis for capturing the hologram sequence of the coil object. This research received funding from the European Community’s Seventh Framework Programme FP7/2007-2013 under grant agreement no. 216105 (“Real 3D”), Academy of Finland, and Science Foundation Ireland under the National Development Plan.
References and links
1. Y. Frauel, T. J. Naughton, O. Matoba, E. Tajahuerce, and B. Javidi, “Three-dimensional imaging and processing using computational holographic imaging,” Proc. IEEE 94(3), 636–653 (2006).
2. P. Marquet, B. Rappaz, P. J. Magistretti, E. Cuche, Y. Emery, T. Colomb, and C. Depeursinge, “Digital holographic microscopy: a noninvasive contrast imaging technique allowing quantitative visualization of living cells with subwavelength axial accuracy,” Opt. Lett. 30(5), 468–470 (2005).
3. T. Colomb, F. Montfort, J. Kühn, N. Aspert, E. Cuche, A. Marian, F. Charrière, S. Bourquin, P. Marquet, and C. Depeursinge, “Numerical parametric lens for shifting, magnification, and complete aberration compensation in digital holographic microscopy,” J. Opt. Soc. Am. A 23(12), 3177–3190 (2006).
5. I. J. Cox and C. J. R. Sheppard, “Digital image processing of confocal images,” Image Vis. Comput. 1(1), 52–56 (1983).
8. T. Lehtimäki and T. J. Naughton, “Stereoscopic viewing of digital holograms of real-world objects,” presented at Capture, Transmission and Display of 3D Video, article no. 39, Kos, Greece, 7–9 May 2007.
9. I. P. Howard and B. J. Rogers, Seeing in Depth, vol. 2 (I Porteous, 2002).
14. B. Julesz, “Binocular depth perception of computer generated patterns,” Bell Syst. Tech. J. 39(5), 1125–1162 (1960).
15. C. Reichle, T. Müller, T. Schnelle, and G. Fuhr, “Electro-rotation in octopole micro cages,” J. Phys. D Appl. Phys. 32(16), 2128–2135 (1999).
19. R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed. (Prentice-Hall, Inc., 2002).
21. M. Kempkes, E. Darakis, T. Khanam, A. Rajendran, V. Kariwala, M. Mazzotti, T. J. Naughton, and A. K. Asundi, “Three dimensional digital holographic profiling of micro-fibers,” Opt. Express 17(4), 2938–2943 (2009).