## Abstract

We provide a review of recent advances in 3D surface imaging technologies. We focus particularly on noncontact 3D surface measurement techniques based on structured illumination. The high-speed and high-resolution pattern projection capability offered by digital light projection technology, together with recent advances in imaging sensor technologies, may enable new-generation systems for 3D surface measurement applications that will provide much better functionality and performance than existing ones in terms of speed, accuracy, resolution, modularization, and ease of use. Performance indexes of 3D imaging systems are discussed, and various 3D surface imaging schemes are categorized, illustrated, and compared. Calibration techniques are also discussed, since they play critical roles in achieving the required precision. Numerous applications of 3D surface imaging technologies are discussed with several examples.

©2011 Optical Society of America

## 1. Introduction

The physical world around us is three-dimensional (3D); yet traditional cameras and imaging sensors are able to acquire only two-dimensional (2D) images that lack the depth information. This fundamental restriction greatly limits our ability to perceive and to understand the complexity of real-world objects. The past several decades have marked tremendous advances in research, development, and commercialization of 3D surface imaging technologies, stimulated by application demands in a variety of market segments, advances in high-resolution and high-speed electronic imaging sensors, and ever-increasing computational power. In this paper, we provide an overview of recent advances in surface imaging technologies by use of structured light.

The term “3D imaging” refers to techniques that are able to acquire true 3D data, i.e., values of some property of a 3D object, such as the distribution of density, as a function of the 3D coordinates $(x,y,z)$. Examples from the medical imaging field are computed tomography (CT) and magnetic resonance imaging (MRI), which acquire volumetric pixels (or voxels) of the measured target, including its internal structure.

By contrast, surface imaging deals with measurement of the $(x,y,z)$ coordinates of points on the surface of an object. Since the surface
is, in general, nonplanar, it is described in a 3D space, and the imaging problem is
called 3D surface imaging. The result of the measurement may be regarded as a map of the
depth (or range) *z* as a function of the position $(x,y)$ in a Cartesian coordinate system, and it may be expressed in the
digital matrix form $\{{z}_{ij}=z({x}_{i},{y}_{j}),\text{}i=1,2,\dots ,L,\text{}j=1,2,\dots ,M\}$. This process is also referred to as 3D surface measurement, range
finding, range sensing, depth mapping, surface scanning, etc. These terms are used in
different application fields and usually refer to loosely equivalent basic surface
imaging functionality, differing only in details of system design, implementation,
and/or data formats.

A more general 3D surface imaging system is able to acquire a scalar value, such as
surface reflectance, associated with each point on the nonplanar surface. The result is
a point cloud $\{{P}_{i}=({x}_{i},{y}_{i},{z}_{i},{f}_{i}),\text{}i=1,2,\dots ,N\}$, where ${f}_{i}$ represents the value at the *i*th surface point in the
data set. Likewise, a color surface image is represented by $\{{P}_{i}=({x}_{i},{y}_{i},{z}_{i},{r}_{i},{g}_{i},{b}_{i}),\text{}i=1,2,\dots ,N\}$, where the vector $({r}_{i},{g}_{i},{b}_{i})$ represents the red, green, and blue color components associated with
the *i*th surface point. Spectral surface properties may also be
described by vectors of larger dimension.

One principal method of 3D surface imaging is based on the use of “structured light,” i.e., active illumination of the scene with a specially designed, spatially varying 2D intensity pattern. As illustrated in Fig. 1, a spatially varying 2D structured illumination is generated by a special projector or a light source modulated by a spatial light modulator. The intensity of each pixel on the structured-light pattern is represented by the digital signal $\{{I}_{ij}=I(i,j),\text{}i=1,2,\dots ,I,\text{}j=1,2,\dots ,J\}$, where $(i,j)$ represent the $(x,y)$ coordinates of the projected pattern. The structured-light projection patterns discussed herein are 2D patterns.

An imaging sensor (a video camera, for example) is used to acquire a 2D image of the scene under the structured-light illumination. If the scene is a planar surface without any 3D surface variation, the pattern shown in the acquired image is similar to that of the projected structured-light pattern. However, when the surface in the scene is nonplanar, the geometric shape of the surface distorts the projected structured-light pattern as seen from the camera. The principle of structured-light 3D surface imaging techniques is to extract the 3D surface shape based on the information from the distortion of the projected structured-light pattern. Accurate 3D surface profiles of objects in the scene can be computed by using various structured-light principles and algorithms.

As shown in Fig. 1, the geometric relationship between an imaging sensor, a structured-light projector, and an object surface point can be expressed by the triangulation principle as

$$R=B\frac{\mathrm{sin}\,\theta }{\mathrm{sin}(\alpha +\theta )},$$

where *B* is the baseline between the projector and the camera, *θ* is the projection angle, *α* is the viewing angle of the camera, and *R* is the range from the camera to the object surface point.
The key for triangulation-based 3D imaging is the technique used to differentiate a single projected light spot from the acquired image under a 2D projection pattern. Various schemes have been proposed for this purpose, and this tutorial will provide an overview of various methods based on the structured-light illumination.
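Under the Fig. 1 geometry, the range follows from the law of sines once the baseline, the projection angle, and the camera viewing angle are known; a minimal sketch (function name and units are our own choices):

```python
import math

def triangulate_range(baseline, theta, alpha):
    """Range R from the camera to a surface point, via the law of sines
    in the projector-camera-point triangle: the angle at the surface
    point is pi - alpha - theta, so
    R / sin(theta) = baseline / sin(alpha + theta)."""
    return baseline * math.sin(theta) / math.sin(alpha + theta)

# Symmetric configuration: 100 mm baseline, both angles 45 degrees.
print(triangulate_range(100.0, math.radians(45), math.radians(45)))  # ~70.71
```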

In a more general sense, actively illuminated structured-light patterns may include spatial variations in all $(x,y,z)$ directions, thus becoming a true 3D structured-light projection system. For example, the intensity of projected light may vary along the optical path of the projected light owing to coherent optical interference. However, most structured-light 3D surface imaging systems use 2D projection patterns. Therefore, in this paper, we restrict our discussions of “structured light” to the uses of 2D structured-light patterns only.

Figure 2 represents a computer animation (Media 1) of a structured-light 3D imaging system to demonstrate its working principle. An arbitrary target 3D surface is illuminated by a structured-light projection pattern. In this particular case, the structured-light pattern is a spatially varying multiple-cycle color spectrum. A color imaging sensor acquires the image of the target 3D surface under the structured-light illumination. In the animation, we dynamically change the geometric shape of the 3D surface. The image captured by the imaging sensor varies accordingly. Based on the distortion of the structured-light pattern seen on the sensed image in comparison with the undistorted projection pattern, the 3D geometric shape of the target surface can be computed accurately.

Numerous techniques for surface imaging by structured light are currently available. In this review, we first classify all techniques into sequential (multiple-shot) or single-shot categories, as illustrated schematically in Fig. 3, which may be regarded as a road map for this technology. If the target 3D object is static and the application does not impose a stringent constraint on acquisition time, multiple-shot techniques can be used and often yield more reliable and accurate results. However, if the target is moving, single-shot techniques have to be used to acquire a snapshot 3D surface image of the 3D object at a particular time instant.

We further classify the single-shot techniques into three broad categories: techniques using continuously varying structured-light patterns, techniques using 1D encoding schemes (stripe indexing), and techniques using 2D encoding schemes (grid indexing). Each technique has its own advantages and disadvantages, depending on the specific application. Different techniques can also be combined to achieve particular benefits. The details of these techniques are provided in Sections 2–5.

Section 6 discusses issues related to performance evaluation of 3D surface imaging systems. Section 7 reviews camera and projector calibration techniques that are critical to the successful operation of any structured-light 3D surface imaging system. Section 8 provides a few examples of applications.

It would be an impossible task to cover all possible 3D surface imaging techniques in this paper. Instead, we have selected representative techniques and present them in a tutorial fashion that will help readers gain perspective of the entire field as well as understand fundamental technical principles and typical system characteristics.

## 2. Sequential Projection Techniques

#### 2.1. Binary Patterns and Gray Coding

Binary coding [1–4] uses black and white stripes to form a sequence of projection
patterns, such that each point on the surface of the object possesses a unique binary
code that differs from any other codes of different points. In general,
*N* patterns can code ${2}^{N}$ stripes. Figure 4 shows a
simplified 5-bit projection pattern. Once this sequence of patterns is projected onto
a static scene, there are 32 $(={2}^{5})$ unique areas coded with unique stripes. The 3D coordinates $(x,y,z)$ can be computed (based on a triangulation principle) for all 32
points along each horizontal line, thus forming a full frame of the 3D image.

The binary coding technique is very reliable and relatively insensitive to surface characteristics, since only binary values exist at each pixel. However, to achieve high spatial resolution, a large number of sequential patterns need to be projected, and all objects in the scene have to remain static during the projection. The entire duration of 3D image acquisition may therefore be longer than a practical application allows.
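The coding and decoding steps can be sketched in a few lines (a hypothetical 320-pixel-wide projector with 32 stripes; reflected Gray code is assumed, so adjacent stripes differ in only one bit, which makes decoding robust at stripe boundaries):

```python
def gray_code(i):
    """Reflected binary Gray code of integer i."""
    return i ^ (i >> 1)

def gray_decode(g):
    """Invert the Gray code by cascading XORs."""
    i = 0
    while g:
        i ^= g
        g >>= 1
    return i

def make_patterns(n_bits, width):
    """n_bits binary images, each a list of 0/1 values per pixel column.
    The stripe index of a column is column * 2**n_bits // width."""
    pats = []
    for bit in range(n_bits - 1, -1, -1):      # most significant bit first
        pats.append([(gray_code(x * 2**n_bits // width) >> bit) & 1
                     for x in range(width)])
    return pats

# Decode: collect the observed bit sequence at one pixel column.
pats = make_patterns(5, 320)
column = 200
bits = [p[column] for p in pats]
g = int("".join(map(str, bits)), 2)
print(gray_decode(g))  # 20 (= 200 * 32 // 320)
```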

#### 2.2. Gray-Level Patterns

To effectively reduce the number of patterns needed to obtain a high-resolution 3D
image, gray-level patterns are developed. For example, one can use *M*
distinct levels of intensity (instead of only two in the binary code) to produce
unique coding of the projection patterns. In this case, *N* patterns
can code ${M}^{N}$ stripes. Each stripe code can be visualized as a point in an
*N*-dimensional space in which each dimension has *M* distinct
values [4,5]. For example, if $N=3$ and $M=4$, then the total number of unique code
stripes is 64 $(={4}^{3})$. In comparison, 6 patterns are needed to produce 64 stripes
with a binary code. The design of binary and gray-level coding patterns can be
optimized; the goal is to maximize some type of distance measure among all unique code
words [6]. For practical 3D imaging
applications, it is important to be able to distinguish adjacent stripes. Figure 5 (bottom) shows an example of gray-level coding
patterns optimized in Hilbert space [7].

#### 2.3. Phase Shift

Phase shift is a well-known fringe projection method for 3D surface imaging. A set of sinusoidal patterns is projected onto the object surface (Fig. 6). The intensities for each pixel $(x,y)$ of the three projected fringe patterns are described as

$${I}_{1}(x,y)={I}_{0}(x,y)+{I}_{mod}(x,y)\mathrm{cos}[\varphi (x,y)-\theta ],$$
$${I}_{2}(x,y)={I}_{0}(x,y)+{I}_{mod}(x,y)\mathrm{cos}[\varphi (x,y)],$$
$${I}_{3}(x,y)={I}_{0}(x,y)+{I}_{mod}(x,y)\mathrm{cos}[\varphi (x,y)+\theta ],$$

where ${I}_{1}(x,y)$, ${I}_{2}(x,y)$, and ${I}_{3}(x,y)$ are the intensities of the three fringe patterns, ${I}_{0}(x,y)$ is the DC component (background), ${I}_{mod}(x,y)$ is the modulation signal amplitude, $\varphi (x,y)$ is the phase, and *θ* is the constant phase-shift angle.

The wrapped phase can be retrieved from the intensities of the three fringe patterns:

$$\varphi \prime (x,y)=\mathrm{arctan}\left[\sqrt{3}\frac{{I}_{1}(x,y)-{I}_{3}(x,y)}{2{I}_{2}(x,y)-{I}_{1}(x,y)-{I}_{3}(x,y)}\right]$$

for a phase-shift angle $\theta =2\pi /3$. Phase unwrapping is the process that converts the wrapped phase $\varphi \prime (x,y)$ to the absolute phase $\varphi (x,y)$:

$$\varphi (x,y)=\varphi \prime (x,y)+2k\pi ,$$

where *k* is an integer representing the projection period. Note that unwrapping methods only provide a relative unwrapping and do not solve for the absolute phase. The 3D $(x,y,z)$ coordinates can be calculated based on the difference between the measured phase $\varphi (x,y)$ and the phase value from a reference plane [9]; Fig. 8 illustrates a simple case.
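The wrapped-phase computation can be sketched numerically. The sketch below synthesizes three fringe images for a known phase map, assuming the common phase-shift angle $\theta =2\pi /3$, and recovers the phase with a two-argument arctangent (all variable names are ours):

```python
import numpy as np

theta = 2 * np.pi / 3                       # constant phase-shift angle

# Synthesize the three fringe images for a known phase map.
phi_true = np.linspace(-3.0, 3.0, 500)      # wrapped phase, inside (-pi, pi)
I0, Imod = 0.5, 0.4                         # background and modulation
I1 = I0 + Imod * np.cos(phi_true - theta)
I2 = I0 + Imod * np.cos(phi_true)
I3 = I0 + Imod * np.cos(phi_true + theta)

# Wrapped-phase recovery: I0 and Imod cancel out of the ratio, since
# sqrt(3)*(I1 - I3) = 3*Imod*sin(phi) and 2*I2 - I1 - I3 = 3*Imod*cos(phi).
phi = np.arctan2(np.sqrt(3) * (I1 - I3), 2 * I2 - I1 - I3)

print(np.max(np.abs(phi - phi_true)))       # effectively zero
```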

#### 2.4. Hybrid Method: Phase Shift + Gray Coding

As we discussed in Subsection 2.3, there are two major problems with phase-shift techniques: the unwrapping methods only provide a relative unwrapping and do not solve for the absolute phase, and if two surfaces have a discontinuity of more than $2\pi $, then no method based on unwrapping will correctly unwrap these two surfaces relative to each other. These problems, often called “ambiguity,” can be solved by using a combination of gray-code projection and phase-shift techniques. Figure 9 shows an example of combining gray-code projection with phase shift in a 32-stripe coding sequence. The gray code determines the absolute phase range without any ambiguity, while the phase shift offers subpixel resolution beyond the number of stripes provided by the gray code [10,12]. However, hybrid methods require a greater number of projections and do not lend themselves well to 3D imaging of dynamic objects.

#### 2.5. Photometrics

Photometric stereo, pioneered by Woodham [13], is a variant approach to shape from shading. It estimates local surface orientation by using a sequence of images of the same surface taken from the same viewpoint but under illumination from different directions [14–16] (Fig. 10). It thus solves the ill-posed problems in shape from shading by using multiple images. Photometric stereo requires all light sources to be point sources and only estimates the local surface orientation (gradients $p,q$). It assumes continuity of the 3D surface and needs a “starting point” (a point on the object surface whose $(x,y,z)$ coordinates are known) for its 3D reconstruction algorithms.
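For a Lambertian surface, the per-pixel computation reduces to a small linear system: with the known light directions stacked in a matrix $L$ and the observed intensities in a vector $I$, the albedo-scaled normal $g=\rho n$ satisfies $Lg=I$. A minimal sketch with synthetic data (the light directions and pixel values are our assumptions):

```python
import numpy as np

# Known illumination directions (unit vectors), one per image; three
# non-coplanar lights are the minimum for a unique solution.
L = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.866],
              [0.0, 0.5, 0.866]])

# Synthetic Lambertian pixel: albedo rho and true unit normal n.
rho = 0.8
n = np.array([0.2, 0.1, 0.9747])
n /= np.linalg.norm(n)
I = rho * (L @ n)                # observed intensities (all positive here)

# Recover the albedo-scaled normal g = rho * n by least squares.
g, *_ = np.linalg.lstsq(L, I, rcond=None)
rho_est = np.linalg.norm(g)      # ~0.8
n_est = g / rho_est              # ~n (unit surface normal)
print(rho_est, n_est)
```

The gradients $(p,q)$ mentioned in the text follow directly as $p=-{n}_{x}/{n}_{z}$, $q=-{n}_{y}/{n}_{z}$.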

## 3. Full-Frame Spatially Varying Color Pattern

Major drawbacks of the sequential projection techniques include the inability to acquire 3D images of objects in motion or of live subjects such as human body parts. We now present a few single-shot 3D surface imaging techniques that take advantage of color information or a unique encoding scheme in the projection pattern and require only one acquired image of the object under the color pattern illumination to derive the full frame of the 3D image with $(x,y,z)$ coordinates of each visible point in the scene.

#### 3.1. Rainbow 3D Camera

Figure 11 illustrates the basic concept of the
Rainbow 3D Camera [17–25]. Unlike conventional stereo, which must extract corresponding
features from a pair of stereo images to calculate the depth value, the Rainbow 3D
camera projects a spatially varying wavelength illumination onto the object surface.
The fixed geometry of the rainbow light projector establishes a one-to-one
correspondence between the projection angle, *θ*, of a plane of light
and a particular spectral wavelength *λ*, thus providing
easy-to-identify landmarks on each surface point. With a known baseline *B* and a known
viewing angle *α*, the 3D range values corresponding to each
individual pixel can be computed by using a straightforward triangulation principle,
and a full frame of the 3D range image can be obtained in a single snapshot at the
camera’s frame rate (30 frames/s or faster).

#### 3.2. Continuously Varying Color Coding

It is possible to compose various continuously varying color patterns to encode the spatial location information [24]. For example, we can construct an intensity variation pattern for each color channel of a projector such that, when added together, these patterns in individual color channels form a continuously varying color pattern. Figure 12 shows an example of intensity variation patterns for three additive primary color channels. When they are added together, a rainbow-like color projection pattern is formed. Note that this type of color pattern does not necessarily follow a linear variation relationship in color spectrum (wavelength). However, since the ratios among the contributions from each color channel are known, the decoding scheme is easy to derive and implement.
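To make the decoding idea concrete, here is a toy continuously varying pattern of our own design (not the specific pattern of [24]): red falls linearly, blue rises linearly, and green forms a triangle. Because the ratio $b/(r+b)$ equals the normalized projector coordinate and is invariant to per-pixel albedo scaling, decoding is a one-line computation:

```python
import numpy as np

W = 240                                    # projector columns (our choice)
x = np.arange(W) / (W - 1)                 # normalized coordinate in [0, 1]

# Hypothetical channel profiles: their ratios encode x.
pattern = np.stack([1.0 - x,               # red: linear fall
                    1.0 - np.abs(2*x - 1), # green: triangle
                    x], axis=1)            # blue: linear rise

# Simulate observation with unknown per-pixel surface albedo.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.3, 1.0, size=(W, 1))
observed = albedo * pattern

# Decode: b/(r+b) = albedo*x / (albedo*(1-x) + albedo*x) = x.
r, b = observed[:, 0], observed[:, 2]
x_est = b / (r + b)
print(np.max(np.abs(x_est - x)))           # effectively zero
```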

## 4. Stripe Indexing (Single Shot)

Stripe indexing is necessary to achieve robust 3D surface reconstruction because the order in which the stripes are observed is not necessarily the same as the order in which they are projected. This is due to the parallax inherent in triangulation-based 3D surface imaging systems and to the possibility of stripes missing from the acquired image because of occlusion of 3D surface features. We now present a few representative stripe indexing techniques.

#### 4.1. Stripe Indexing Using Colors

Color image sensors usually have three independent acquisition channels, each
corresponding to a spectrum band. The linear combination of the values of these color
components can produce a large number of distinct colors: three 8-bit channels can
represent $2^{24}$ different colors. Such rich color information can be used to
enhance 3D imaging accuracy and to reduce acquisition time. For example, use of color
for stripe indexing in the projection patterns (Fig. 13) can help alleviate the ambiguity problem faced by phase-shift or
multiple-stripe techniques using monochromic patterns [26,27]. This type of
color-coded system can achieve real-time 3D surface imaging capability. It is also
possible to encode multiple patterns into a single color projection image, each
pattern possessing a unique color value in the color space. To reduce the decoding
error rate, one can select a color set in which each color has a maximum distance
from any other color in the set. The maximum number of colors in the set is limited
by the smallest distance between colors that still produces negligible cross talk in
the acquired images.

#### 4.2. Stripe Indexing Using Segment Pattern

To distinguish one stripe from others, one can add some unique segment patterns to each stripe (Fig. 14) such that, when performing 3D reconstruction, the algorithm can use the unique segment pattern of each stripe to distinguish them. This indexing method, proposed in [28], is intriguing and clever, but it only applies to a 3D object with a smooth and continuous surface when the pattern distortion due to surface shape is not severe. Otherwise, it may be very difficult to recover the unique segment pattern, owing to deformation of the pattern and/or discontinuity of the object surface.

#### 4.3. Stripe Indexing Using Repeated Gray-Scale Pattern

If more than two intensity levels are used, it is possible to arrange the intensity
levels of the stripes such that any group of stripes (a sliding window of
*N* stripes) has a unique intensity pattern within the length of a
period [29]. For example, if three gray levels are used (black *B*, gray *G*, and
white *W*), a pattern can be designed (Fig. 15) in which successive stripe triplets,
such as *WGB*, *GWB*, etc., each occur only once within a period.

#### 4.4. Stripe Indexing Based on De Bruijn Sequence

A De Bruijn sequence [30] of rank
*n* on an alphabet of size *k* is a cyclic word in
which each of the ${k}^{n}$ words of length *n* appears exactly once as we
travel around the cycle. A simple example of a De Bruijn circle with $n=3$ and $k=2$ (the alphabet is {0, 1}) is shown in Fig. 16. As we travel around the cycle (either clockwise or
counterclockwise), we will encounter each of the $2^{3}=8$ three-digit patterns 000,
001, 010, 011, 100, 101, 110, 111 exactly once. There is no repeated three-digit
pattern in the sequence. In other words, no subsequence coincides with any other in
the De Bruijn sequence. This
unique feature of the De Bruijn sequence can be used in constructing a stripe pattern
sequence that has unique local variation patterns that do not repeat
themselves [31–33]. Such uniqueness makes the pattern decoding an easier task.
The graph associated with a De Bruijn sequence is called a De Bruijn graph [34]. Now we show an example of using binary
combinations of $(R,G,B)$ colors to produce a color-indexed stripe based on De Bruijn
sequence. The maximum number of combinations of three colors is eight (=2${}^{3}$). Since we do not intend to use (0,0,0), we have only seven
possible colors. This can be accommodated by constructing a De Bruijn sequence with $k=7$, $n=3$, which results in a sequence with 343 stripes. If the number of
stripes is too large, one can use a reduced set of a De Bruijn sequence by setting $k=5$, $n=3$ [35]. The number of stripes
in this case is reduced to 125. There is an important constraint in constructing a
color-indexed stripe sequence using the De Bruijn technique: all neighboring stripes
must have different colors. Otherwise, some stripes with double or triple width would
occur, confusing the 3D reconstruction algorithms. This constraint can be easily
applied by using an XOR operation. Figure 17 shows a set of results with an actual color-indexed stripe pattern. In
this stripe sequence, all neighboring stripes have different colors. Various
variations on the implementation of De Bruijn techniques can be used to generate
unique color-indexed, gray-scale-indexed, or other types of projection patterns for
3D surface imaging applications.
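A De Bruijn sequence can be generated with the classic FKM (Lyndon-word concatenation) algorithm; the sketch below (function and variable names are ours) builds $B(5,3)$ with 125 symbols, matching the reduced stripe set mentioned above, and verifies the sliding-window uniqueness property:

```python
def de_bruijn(k, n):
    """B(k, n) via the FKM algorithm: concatenate, in lexicographic
    order, the Lyndon words over {0..k-1} whose length divides n."""
    a = [0] * (k * n)
    seq = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return seq

s = de_bruijn(5, 3)
print(len(s))          # 125 stripes: one per length-3 word over 5 symbols

# Every cyclic window of length 3 is unique.
windows = {tuple((s + s[:2])[i:i + 3]) for i in range(len(s))}
print(len(windows))    # 125
```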

## 5. Grid Indexing: 2D Spatial Grid Patterns

The basic concept of 2D grid pattern techniques is to uniquely label every subwindow in the projected 2D pattern, such that the pattern in any subwindow is unique and fully identifiable with respect to its 2D position in the pattern.

#### 5.1. Pseudo-random Binary Array (PRBA)

One grid indexing strategy is to use a pseudo-random binary array (PRBA) to produce grid locations that can be marked by dots or other patterns, such that the coded pattern of any subwindow is unique. A PRBA is defined by an ${n}_{1}\times {n}_{2}$ array encoded using a pseudo-random sequence, such that any ${k}_{1}\times {k}_{2}$ subwindow sliding over the entire array is unique and fully defines the subwindow’s absolute coordinate $(i,j)$ within the array. The coding pattern of the binary array is generated based on a pseudo-random binary sequence using the primitive polynomial modulo ${2}^{n}$ method [36–40], where ${2}^{n}-1={2}^{{k}_{1}{k}_{2}}-1$, ${n}_{1}={2}^{{k}_{1}}-1$, ${n}_{2}=({2}^{n}-1)/{n}_{1}$. Figure 18 shows an example of a generated PRBA, where ${k}_{1}=5$, ${k}_{2}=2$, and thus ${n}_{1}=31$, ${n}_{2}=33$.
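The construction can be illustrated at small scale with the classical m-sequence folding of MacWilliams and Sloane (our choice of construction, not necessarily that of [36–40]): for ${k}_{1}={k}_{2}=2$ we have $n=4$; a maximal-length sequence of period ${2}^{4}-1=15$ from the primitive polynomial ${x}^{4}+x+1$ is written along the diagonal of a $3\times 5$ array $({n}_{1}=3$, ${n}_{2}=5)$, and every $2\times 2$ cyclic subwindow comes out distinct:

```python
# m-sequence from the primitive polynomial x^4 + x + 1:
# recurrence s[i] = s[i-3] XOR s[i-4], period 2**4 - 1 = 15.
s = [0, 0, 0, 1]
while len(s) < 15:
    s.append(s[-3] ^ s[-4])

# Fold diagonally into a 3 x 5 array; gcd(3, 5) = 1 covers every cell.
n1, n2 = 3, 5
A = [[0] * n2 for _ in range(n1)]
for i, bit in enumerate(s):
    A[i % n1][i % n2] = bit

# Collect all 2 x 2 cyclic subwindows.
wins = {(A[r][c], A[r][(c + 1) % n2],
         A[(r + 1) % n1][c], A[(r + 1) % n1][(c + 1) % n2])
        for r in range(n1) for c in range(n2)}
print(len(wins))  # 15 distinct windows; only the all-zero window is absent
```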

#### 5.2. Mini-patterns Used as Code Words

Instead of using a pseudo-random binary array, a multivalued pseudo-random array can be used. One can represent each value with a mini-pattern as a special code word, thus forming a grid-indexed projection pattern [41]. Figure 19 shows an example of a three-valued pseudo-random array and a set of mini-pattern code words (shown at the lower right of the figure). Using the specially defined code words, a multivalued pseudo-random array can be converted into a projection pattern with unique subwindows.

#### 5.3. Color-Coded Grids

Another grid indexing strategy is to color code both vertical and horizontal stripes so that a 2D grid indexing can be achieved [42–44]. The vertical and horizontal stripe encoding schemes can either be the same or totally different, depending on the application (Fig. 20). Uniqueness of the subwindows is not guaranteed, but the colored stripes in both directions help the decoding establish the correspondence in most situations. The thin grid lines may not be as reliable in pattern extraction as other patterns (dots, squares, etc.).

#### 5.4. 2D Array of Color-Coded Dots

There are alternative methods of generating the pseudo-random array. In [45,46] a brute-force algorithm was proposed to generate an array that preserves the uniqueness of subwindows, although it may not exhaust all possible subwindow patterns. The method is relatively intuitive to implement in computer algorithms. For example, Fig. 21 (left) shows a $6\times 6$ array with a subwindow size of $3\times 3$ using three code words (R, G, B). The computing procedure is as follows: first fill the upper left corner of the $6\times 6$ array with a randomly chosen pattern. Then, add a three-element column on the right with random code words; the uniqueness of the subwindow is verified before such a column is accepted. Keep adding columns until all columns are filled with random code words and subwindow uniqueness is verified. Similarly, add random rows in the downward direction from the initial subwindow position. Afterwards, add new random code words along the diagonal direction. Repeat these procedures until all dots are filled with colors. Again, this computational procedure may not guarantee the generation of a pseudo-random array for all array sizes and code words, but good results have been achieved for many cases. Figure 21 (right) shows an example of a pseudo-random array with $20\times 18$ dimensions.
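A simplified version of this fill-and-verify idea can be sketched as a randomized loop with a restart cap (a simplification of the row/column/diagonal growth described above; the array size, colors, and retry limit are our choices):

```python
import random

def unique_windows(grid, k=3):
    """True if every k x k subwindow of the grid is distinct."""
    rows, cols = len(grid), len(grid[0])
    wins = [tuple(grid[r + dr][c + dc] for dr in range(k) for dc in range(k))
            for r in range(rows - k + 1) for c in range(cols - k + 1)]
    return len(wins) == len(set(wins))

def make_dot_array(rows=6, cols=6, colors="RGB", k=3, tries=200, seed=1):
    """Randomly fill the array and keep the first fill whose k x k
    subwindows are all unique; retry up to `tries` times."""
    rng = random.Random(seed)
    for _ in range(tries):
        grid = [[rng.choice(colors) for _ in range(cols)]
                for _ in range(rows)]
        if unique_windows(grid, k):
            return grid
    raise RuntimeError("no valid array found; increase tries")

for row in make_dot_array():
    print("".join(row))
```

With three colors and a $6\times 6$ array, collisions among the sixteen $3\times 3$ subwindows are rare, so the loop almost always succeeds within a few attempts.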

#### 5.5. Hybrid Methods

There are many opportunities to improve specific aspect(s) of 3D surface imaging system performance by combining more than one encoding scheme discussed above. Figure 22 shows an example.

## 6. Performance Evaluation of 3D Surface Imaging Systems

There are many factors that characterize the technical performance of a 3D surface imaging system. From an application point of view, the following three aspects are often used as the primary performance indexes for evaluating 3D imaging systems:

- (1) **Accuracy.** Measurement accuracy denotes the maximum deviation of the measurement value obtained by a 3D surface imaging system from the ground truth of the actual dimension of the 3D object. Quite often, a 3D imaging system may have different accuracies in different $(x,y,z)$ directions because of the inherent design properties of the system. Also, different manufacturers may use different ways to characterize accuracy: for example, average (mean) error, uncertainty, ±error, RMS, or other statistical values. Therefore, when comparing different systems, one has to understand the exact meaning of any performance claims and compare them in the same framework.
- (2) **Resolution.** In most of the optical literature, optical resolution is defined as the ability of an optical system to differentiate individual points or lines in an image. Similarly, 3D image resolution denotes the smallest portion of the object surface that a 3D imaging system can resolve. However, in the 3D imaging community, the term “image resolution” sometimes also denotes the maximum number of measurement points a system is able to obtain in a single frame. For example, a 3D sensor with $640\times 480$ pixels may be able to generate 307,200 measurement points in a single-shot acquisition. Given the field of view, standoff distance, and other factors, these two definitions of image resolution can be converted into each other.
- (3) **Speed.** Acquisition speed is important for imaging moving objects (such as the human body). For single-shot 3D imaging systems, the frame rate represents their ability to repeat the full-frame acquisition in a short time interval. For sequential 3D imaging systems (e.g., laser scanning systems), in addition to the frame rate, another issue needs to be considered: the object moves while the sequential acquisition is performed, so the obtained full-frame 3D image may not represent a snapshot of the 3D object at a single instant. Instead, it becomes an integration of measurement points acquired at different time instants, and the measured 3D shape may therefore be distorted from the original shape of the 3D object. There is also a distinction between acquisition speed and computation speed: for example, some systems are able to acquire 3D images at 30 frames/s, but the acquired images need to be postprocessed at a much slower rate to generate 3D data.

The above-mentioned three key performance indexes can be used to compare 3D imaging systems. Figure 23 illustrates a primary performance space in which each 3D imaging method may occupy a spot, and multiple 3D imaging systems can then be compared intuitively. Of course, the price/performance ratio and reliability of a system are also important considerations when evaluating a 3D surface imaging system for practical installations.

In addition to the primary performance indexes, there is a virtually unlimited number of performance indexes that can be used to characterize various specific aspects of 3D imaging systems. For example, there is the depth of field of the 3D imaging system, which refers to the range of standoff distance within which accurate 3D measurement can be obtained. Ultimately, these types of system properties are reflected in the primary performance indexes (e.g., measurement accuracy, resolution, and speed).

Field of view, baseline, and standoff distance may also be used to characterize the behavior of 3D imaging systems. Structured-light 3D imaging systems usually have a limited standoff distance because of the limited energy of the light projection, while time-of-flight sensors that rely on scanning a single laser beam can reach distances of miles.

Each type of 3D imaging technique has its own pros and cons, and we should judge a system by its overall performance for intended applications.

## 7. Camera and Projector Calibration Techniques

An essential part of the 3D imaging technology is camera and projector calibration techniques, which play a critical role in establishing the measurement accuracy of 3D imaging systems. Camera calibration is a well-known problem in computer vision. However, surprisingly, this key aspect of 3D imaging technology has not received sufficient attention in many reviews, research papers, and application articles on 3D imaging.

Since most 3D imaging systems use 2D optical sensors, the camera calibration procedures establish the relationship between a pixel on a 2D image (in camera coordinates) and a straight line in 3D space (world coordinates) along which the object point is located, taking lens distortion into consideration. Usually, a simplified camera model and a set of intrinsic parameters are used to characterize the relationships. Several approaches and accompanying toolboxes are available [47–49]. These procedures typically require images of a known calibration object taken at several angles and distances. A planar checkerboard pattern is a frequently used calibration object because it is very simple to produce, can be printed with a standard printer, and has distinctive corners that are easy to detect. An example image involving such a pattern is shown in Fig. 24. From the images of the calibration pattern, 2D-to-3D correspondences are constructed.

#### 7.1. Camera Calibration Algorithms

Assume the plane of the planar calibration board in world coordinates to be $Z=0$; then each point on the calibration board becomes $M={[X,Y,0,1]}^{T}$. Therefore, an object point *M* and its image point *m* are related by a homographic matrix *H*:

$$s\tilde{m}=H\tilde{M},\qquad H=K[{r}_{1}\ \ {r}_{2}\ \ t],$$

where *s* is an arbitrary scale factor, *K* is the matrix of intrinsic camera parameters, ${r}_{1}$ and ${r}_{2}$ are the first two column vectors of the rotation matrix *R*, and *t* is the translation vector. Since the column vectors ${r}_{1},{r}_{2},{r}_{3}$ of *R* are orthonormal, we therefore have

$${h}_{1}^{T}{K}^{-T}{K}^{-1}{h}_{2}=0,\qquad {h}_{1}^{T}{K}^{-T}{K}^{-1}{h}_{1}={h}_{2}^{T}{K}^{-T}{K}^{-1}{h}_{2}.$$

Each homography can thus provide two constraints on the intrinsic parameters. As $B={K}^{-T}{K}^{-1}$ in the equations above is a symmetric matrix, it can be defined with a 6D vector

$$b={[{B}_{11},{B}_{12},{B}_{22},{B}_{13},{B}_{23},{B}_{33}]}^{T}.$$

Let the *i*th column vector of *H* be ${h}_{i}={[{h}_{i1},{h}_{i2},{h}_{i3}]}^{T}$. Then we have ${h}_{i}^{T}B{h}_{j}={v}_{ij}^{T}b$, where ${v}_{ij}={[{h}_{i1}{h}_{j1},{h}_{i1}{h}_{j2}+{h}_{i2}{h}_{j1},{h}_{i2}{h}_{j2},{h}_{i3}{h}_{j1}+{h}_{i1}{h}_{j3},{h}_{i3}{h}_{j2}+{h}_{i2}{h}_{j3},{h}_{i3}{h}_{j3}]}^{T}$. The two constraints can then be rewritten as a homogeneous equation in *b*. In order to solve for *b*, at least three images from different viewpoints are needed. In practice, more images are used to reduce the effect of noise, and a least-squares error solution is obtained with singular value decomposition. Finally, the result can be refined by minimizing the reprojection error

$$\sum _{i}\sum _{j}{\Vert {m}_{ij}-\hat{m}(K,{R}_{i},{t}_{i},{M}_{j})\Vert }^{2},$$

where $\hat{m}(K,{R}_{i},{t}_{i},{M}_{j})$ is the projection of point ${M}_{j}$ in image *i* according to the estimated parameters.
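The linear step of this estimation can be sketched in numpy: fabricate homographies from a known synthetic camera, stack the ${v}_{ij}$ constraint rows into a homogeneous system, and recover $B={K}^{-T}{K}^{-1}$ up to scale from its SVD null space (the synthetic camera, rotations, and all names below are our own assumptions, not part of any toolbox):

```python
import numpy as np

def v(H, i, j):
    """Constraint row v_ij from columns h_i, h_j of homography H (1-based)."""
    h_i, h_j = H[:, i - 1], H[:, j - 1]
    return np.array([h_i[0]*h_j[0],
                     h_i[0]*h_j[1] + h_i[1]*h_j[0],
                     h_i[1]*h_j[1],
                     h_i[2]*h_j[0] + h_i[0]*h_j[2],
                     h_i[2]*h_j[1] + h_i[1]*h_j[2],
                     h_i[2]*h_j[2]])

def rot(axis, ang):
    c, s = np.cos(ang), np.sin(ang)
    R = {"x": [[1, 0, 0], [0, c, -s], [0, s, c]],
         "y": [[c, 0, s], [0, 1, 0], [-s, 0, c]]}[axis]
    return np.array(R)

# Hypothetical intrinsic matrix of the synthetic camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 780.0, 240.0],
              [0.0, 0.0, 1.0]])

# Three synthetic views of the Z = 0 plane: H = K [r1 r2 t].
views = [(rot("x", 0.3), np.array([0.1, -0.2, 2.0])),
         (rot("y", 0.4), np.array([-0.3, 0.1, 2.5])),
         (rot("x", 0.5) @ rot("y", -0.3), np.array([0.2, 0.2, 1.8]))]
Hs = [K @ np.column_stack([R[:, 0], R[:, 1], t]) for R, t in views]

# Two constraint rows per homography; solve V b = 0 via SVD null space.
V = np.vstack([row for H in Hs
               for row in (v(H, 1, 2), v(H, 1, 1) - v(H, 2, 2))])
b = np.linalg.svd(V)[2][-1]

# Compare with the ground-truth B = K^{-T} K^{-1}, up to scale and sign.
Bt = np.linalg.inv(K.T) @ np.linalg.inv(K)
b_true = np.array([Bt[0, 0], Bt[0, 1], Bt[1, 1], Bt[0, 2], Bt[1, 2], Bt[2, 2]])
cos = abs(np.dot(b / np.linalg.norm(b), b_true / np.linalg.norm(b_true)))
print(cos)  # ~1.0
```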

#### 7.2. Projector Calibration

The calibration of the projector is twofold: as an active light source, the intensity of the projector needs to be calibrated in order to recover the linearity of its illumination intensity, and as an inverse camera, it needs to be geometrically calibrated like ordinary cameras.

### 7.2a. Intensity Calibration of Projector

To enhance contrast, the intensity curve of the projector is often altered with a gamma transformation by the projector vendor. When the projector is used in a 3D imaging system as an active light source, calibration is required to recover the linearity of the illumination intensity. To do so, several test patterns are projected, and the projected patterns are captured by the imaging sensor. The relationship between the actual intensity of the projected pattern and the image pixel value can then be established and fitted with a high-order polynomial function. The inverse function is calculated and used to rectify the pattern to be projected in the 3D imaging process (Fig. 25).
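As an illustration, the fitting-and-inversion step might look like the following (Python with NumPy; the gamma value of 2.2 and the function names are assumptions for this synthetic example, since the real response must be measured from captured test patterns):

```python
import numpy as np

# Synthetic projector response: assume (for illustration) a gamma of 2.2.
# In practice `measured` comes from camera images of projected test patterns.
commanded = np.linspace(0.0, 1.0, 64)   # gray levels sent to the projector
measured = commanded ** 2.2             # observed relative intensities

# Fit the response curve with a high-order polynomial, as described above.
coeffs = np.polyfit(commanded, measured, deg=5)

def rectify(desired, coeffs, n=4096):
    """Pre-distort desired intensities by inverting the fitted response."""
    grid = np.linspace(0.0, 1.0, n)
    response = np.polyval(coeffs, grid)
    response = np.maximum.accumulate(response)  # enforce monotonicity for interp
    return np.interp(desired, response, grid)   # inverse lookup

# Projecting rectify(pattern, coeffs) instead of `pattern` yields a
# (nearly) linear effective intensity response.
```

The dense lookup table is one simple way to invert the polynomial; any monotone root-finding scheme would serve equally well.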

### 7.2b. Geometric Calibration of Projector

Consider the projector as an inverse camera: the optical model of the projector is the same as that of the camera, and the only difference between them is the direction of projection. The inverse model makes it difficult to relate a pixel on a 2D image (in projector coordinates) with a straight line in 3D space (world coordinates), because the projector cannot observe the scene: we cannot tell where a given point in 3D space will appear on the projector's image plane. The key issue in projector calibration is therefore how this correspondence is established. Once the correspondence is established, the projector can be calibrated by using standard camera calibration algorithms.

Projector calibration is performed by using a precalibrated camera and a calibration plane. First, the pose of the calibration plane is recovered in the camera coordinate system. Then the calibration pattern (Fig. 26) is projected onto the plane and captured by the camera. The 3D coordinates of the corner points of the checkerboard pattern formed on the calibration plane can be determined by reprojecting the corner points in the captured image onto the planar plate, since the spatial relationship between the camera and the plate has already been recovered. Finally, the projector can be calibrated by using the acquired point correspondences. This method is straightforward in theory and relatively easy to implement. However, its calibration accuracy depends heavily on the accuracy of the camera precalibration.
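The reprojection step, intersecting camera rays with the recovered plane, can be sketched as follows (Python with NumPy; the function name is illustrative, and lens distortion is assumed to have been corrected already):

```python
import numpy as np

def backproject_to_plane(K, pixels, n, d):
    """Intersect camera rays through `pixels` with the plane n . X = d.

    K: intrinsic matrix of the precalibrated camera.
    n, d: the calibration plane recovered in camera coordinates.
    Returns an array of 3D points (camera coordinates); paired with the
    projected corner positions in the projector's pattern, these form the
    2D-3D correspondences used to calibrate the projector.
    """
    Kinv = np.linalg.inv(K)
    points = []
    for u, v in pixels:
        ray = Kinv @ np.array([u, v, 1.0])  # ray direction from camera center
        s = d / (n @ ray)                   # depth at which the ray meets the plane
        points.append(s * ray)
    return np.array(points)
```

With these 3D points and the known pattern coordinates on the projector's image plane, the projector is then calibrated exactly as a camera would be.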

## 8. Application Examples of 3D Surface Imaging Technologies

We provide several illustrative examples of interesting applications of 3D imaging technologies. These examples are by no means exhaustive, and there are numerous applications that we are not able to include here because of space limitations.

#### 8.1. 3D Facial Imaging

Human body parts are ideal objects for 3D imaging. Every person's body parts are different, and there is no digital CAD model of the human body; thus each body part needs to be modeled by 3D imaging technology. Figure 27 shows an example of a 3D facial image taken by a 3D camera developed by the author. Both the shaded model and the wire-frame model of the 3D surface data are shown. There exist numerous applications of 3D facial images, ranging from 3D facial recognition and plastic surgery to personalized gifts made from the 3D face of the owner.

#### 8.2. 3D Dental Imaging

Figure 28 shows a few examples of 3D dental images taken by a 3D camera developed by the author. Usually, a single 3D image (left) covers only a small section of a dental arch. Multiple 3D images are therefore taken to cover the entire surface of the arch, and 3D mosaicing software is used to stitch these pieces seamlessly together into a 3D model of the entire arch (right).

#### 8.3. 3D Imaging Techniques for Plastic Surgery

3D imaging systems enable plastic surgeons to capture and display a 3D surface profile of the patient’s breast(s) for assessment, presurgery treatment planning, posttreatment verification, patient communication, and documentation. A 3D camera is able to capture all 3D surface data $(x,y,z)$ coordinates and associated 2D image texture data (color overlay). Figure 29 shows an example of 3D surgical planning by using 3D imaging techniques. The patient-specific 3D profile of breasts acquired by a 3D camera can be used by doctor and patient to examine simulated outcomes based upon implant choice, assisting in determining the appropriate implant given volume measurements and possible asymmetry.

#### 8.4. 3D Model of Ear Impression for a Custom-Fit Hearing Aid

More than 28 million Americans suffer from some degree of hearing impairment, according to statistics from the National Institute on Deafness and Other Communication Disorders (NIDCD). The current process of manufacturing custom-fit hearing aids is labor intensive and suffers a return-repair-remake rate of about one third. 3D imaging technology could replace the traditional physical impression, eliminating the cost and time associated with such an error-prone and uncomfortable process. Digital impressions enable hearing aid manufacturers to take advantage of the latest breakthroughs in computer-aided design (CAD) and manufacturing (CAM) technologies and to produce a custom-fit hearing aid within a one-day time frame (Fig. 30). More importantly, digital impression technology could improve the quality of fit, thus enhancing hearing functionality for impaired people.

#### 8.5. 3D Imaging for Reverse Engineering

Many ergonomic products are prototyped by a manual process so that the designer can get the touch and feel of the shape profile and optimize it until it “feels good.” Such a manually made prototype can be converted into a 3D CAD file by using a 3D camera system. Figure 31 shows an example of a 3D CAD file of a mouse design that was digitized by using a 3D camera built by the author.

#### 8.6. 3D Imaging System for Airbag Analysis

High-speed 3D surface and volume measurement and tracking capabilities are very important for characterizing the dynamic behavior of airbags in order to optimize airbag design for ensuring driver and passenger safety. Traditionally, obtaining 3D surface and volume measurements during airbag deployment has been very difficult. Thanks to advances in 3D imaging technology, acquisition of accurate sequential 3D surface and volume data of the airbag is now possible. Figure 32 shows an example of a dynamic 3D data sequence obtained during a test airbag deployment. These types of 3D data facilitate quantitative analysis of the airbag’s behavior and provide a crucial means for optimizing airbag designs.

#### 8.7. High-Speed 3D Imaging System for Vehicle Crash Tests

The most detrimental aspect of offset impacts, where only a portion of the vehicle’s front structure is loaded, is the compartment intrusion and violation of the occupant’s survival space. Injury consequences of offset impacts are more closely related to timing, location, and velocity of intrusion relative to the affected body region than to simply the degree of final intrusion measured after the completion of the impacts. Intrusion is also a detrimental factor in side impacts, rollovers, and other crash modes. The National Highway Traffic Safety Administration (NHTSA) is evaluating an offset impact test as a possible future requirement for frontal protection. To effectively and quantitatively evaluate such impacts, a dynamic intrusion sensing system is needed that can perform dynamic measurement of compartment intrusion during staged crash tests. High-speed 3D imaging technology is able to provide accurate and quantitative measurement data of structural deformation of various parts of the vehicle during crash tests. Figure 33 shows an example of full-frame 3D images of a door panel under a test with various 3D data visualization modalities.

#### 8.8. 3D Imaging Technology for Accident Scene Investigation

Traffic accident scenes can be very complex, are open to arguable interpretations, and are difficult for legal representation to communicate with precision. Disputes may arise that often lead to costly and time-consuming legal proceedings. 3D imaging technology can help accurately document an accident scene, thus providing an effective tool for legal and insurance cases. 3D accident scene reconstruction can illustrate the facts clearly and effectively. An accident scene can be reenacted from any camera angle from a virtually unlimited number of possible vantage points (Fig. 34).

## 9. Conclusions

This tutorial provided a review of recent advances in structured-light 3D surface imaging technologies and examples of their applications in a variety of fields. We established a classification framework to organize the vast variety of 3D imaging techniques and presented them in a systematic manner. Representative techniques were briefly described, with illustrative figures to help readers grasp their basic concepts. Performance indexes of 3D imaging systems were reviewed, calibration techniques for both camera and projector were presented, and selected application examples were given.

There is a reason why so many 3D imaging techniques have been developed to date: no single technique can be applied to every application scenario. Each 3D imaging technique has its own set of advantages and disadvantages. When selecting a 3D imaging technique for a specific application, readers are encouraged to make careful trade-offs among their specific application requirements and to consider key performance indexes such as accuracy, resolution, speed, cost, and reliability. Sometimes, multiple-modality sensor systems will be needed to address demands that cannot be met by a single modality.

3D imaging is an interdisciplinary technology that draws contributions from optical design, structural design, sensor technology, electronics, packaging, and hardware and software. Traditionally, the 3D imaging research activities from these disciplines are more or less independently pursued with different emphases. The recent trend in 3D imaging research calls for an integrated approach, sometimes called the “computational imaging” approach, in which optical design, the sensor characteristics, and software processing capability are taken into consideration simultaneously. This new approach promises to significantly improve the performance and price/cost ratio of future 3D imaging systems and is a worthwhile direction for future 3D imaging technology development.

The field of 3D imaging technology is still quite young compared with its 2D counterpart, which has developed over several decades with multibillion-dollar investments. It is our hope that our work in developing and applying 3D imaging technologies to a variety of applications will attract more talented researchers, from both theoretical and applied backgrounds, to this fascinating field of research and development.

## References and Notes

**1. **I. Ishii, K. Yamamoto, K. Doi, and T. Tsuji,
“High-speed 3D image acquisition using coded structured light
projection,” in IEEE/RSJ International Conference on
Intelligent Robots and Systems, 2007. IROS 2007
(IEEE, 2007), pp.
925‒ 930.

**2. **K. Sato and S. Inokuchi,
“Range-imaging system utilizing nematic liquid crystal
mask,” in Proceedings of International Conference on
Computer Vision (IEEE Computer Society
Press, 1987), pp. 657‒
661.

**3. **R. J. Valkenburg and A. M. McIvor, “Accurate 3D measurement
using a structured light system,” Image Vision
Comput. **16**(2), 99‒ 110
(1998). [CrossRef]

**4. **J. L. Posdamer and M. D. Altschuler, “Surface measurement
by space-encoded projected beam systems,” Comput. Graph.
Image Processing **18**(1), 1‒ 17
(1982). [CrossRef]

**5. **S. Inokuchi, K. Sato, and F. Matsuda,
“Range-imaging for 3-D object recognition,” in
International Conference on Pattern Recognition
(International Association for Pattern
Recognition, 1984), pp.
806‒ 808.

**6. **D. Caspi, N. Kiryati, and J. Shamir,
“Range imaging with adaptive color structured
light,” IEEE Trans. Pattern Anal. Mach. Intell. **20**(5), 470‒ 480
(May 1998). [CrossRef]

**7. **W. Krattenthaler, K. J. Mayer, and H. P. Duwe, “3D-surface measurement with coded light approach,” in Proceedings of the 17th Meeting of the Austrian Association for Pattern Recognition on Image Analysis and Synthesis (R. Oldenbourg Verlag, 1993), Vol. 12, pp. 103–114.

**8. **E. Horn and N. Kiryati,
“Toward optimal structured light patterns,”
Image Vision Comput. **17**(2), 87‒ 97
(1999). [CrossRef]

**9. **H. Sagan, *Space Filling Curves* (Springer, 1994).

**10. **P. S. Huang and S. Zhang,
“A fast three-step phase shifting algorithm,”
Appl. Opt. **45**(21), 5086‒ 5091
(2006). [CrossRef] [PubMed]

**11. **S. Zhang and S. T. Yau, “High-resolution,
real-time 3D absolute coordinate measurement based on a phase-shifting
method,” Opt. Express **14**, 2644‒ 2649 (2006).
[CrossRef] [PubMed]

**12. **S. Siva Gorthi and P. Rastogi,
“Fringe projection techniques: whither we are?,”
Opt. Lasers Eng. **48**(2), 133‒ 140
(2010). [CrossRef]

**13. **R. Woodham,
“Photometric method for determining surface orientation from
multiple images,” Opt. Eng. **19**(1), 134‒ 140
(1980).

**14. **R. Basri and D. Jacobs,
“Photometric stereo with general, unknown
lighting,” in 2001 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR 2001) (IEEE
Computer Society, 2001), pp.
374‒ 381.

**15. **A. Treuille, A. Hertzmann, and S. M. Seitz, “Example-based stereo
with general BRDFs,” in Computer Vision—ECCV 2004: 8th
European Conference on Computer Vision, Part II
(Springer, 2004), pp.
457‒ 469.

**16. **T. Higo, Y. Matsushita, N. Joshi, and K. Ikeuchi, “A hand-held photometric stereo camera for 3-D modeling,” in 2009 IEEE 12th International Conference on Computer Vision (IEEE, 2009), pp. 1234–1241.

**17. **Z. J. Geng, “Rainbow three-dimensional
camera: new concept of high-speed three-dimensional vision
systems,” Opt. Eng. **35**(2), 376‒ 383
(1996). [CrossRef]

**18. **J. Geng,
“Color ranging method for high speed low-cost 3D surface profile
measurement,” U.S. patent
5,675,407 (7 Oct. 1997).

**19. **J. Geng,
“High speed three dimensional imaging method,”
U.S. patent 6,028,672 (22 Feb. 2000).

**20. **J. Geng,
“High speed three dimensional imaging method,”
U.S. patent 6,147,760 (14 Nov. 2000).

**21. **J. Geng,
“3D surface profile imaging method and apparatus using single
spectral light condition,” U.S. patent
6,556,706 (29 Apr. 2003).

**22. **J. Geng,
“Three-dimensional dental imaging method and apparatus having a
reflective member,” U.S. patent
6,594,539 (15 July 2003).

**23. **J. Geng,
“High speed laser three-dimensional imager,”
U.S. patent 6,660,168 (29 July 2003).

**24. **J. Geng,
“Method and apparatus for 3D imaging using light pattern having
multiple sub-patterns,” U.S. patent
6,700,669 (2 March 2004).

**25. **C. L. Heike, K. Upson, E. Stuhaug, and S. M. Weinberg, “3D digital
stereophotogrammetry: a practical guide to facial image
acquisition,” Head Face Med. **6**(1), 18 (2010). [CrossRef] [PubMed]

**26. **K. L. Boyer and A. C. Kak, “Color-encoded
structured light for rapid active ranging,” IEEE Trans.
Pattern Anal. Mach. Intell. **9**(1), 14‒ 28
(1987). [CrossRef]

**27. **S. Fernandez, J. Salvi, and T. Pribanic,
“Absolute phase mapping for one-shot dense pattern
projection,” 2010 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW), San
Francisco, Calif., June 3–18,
2010.

**28. **M. Maruyama and S. Abe,
“Range sensing by projecting multiple slits with random
cuts,” IEEE Trans. Pattern Anal. Mach. Intell. **15**(6), 647‒ 651
(1993). [CrossRef]

**29. **N. G. Durdle, J. Thayyoor, and V. J. Raso, “An improved structured light technique for surface reconstruction of the human trunk,” in IEEE Canadian Conference on Electrical and Computer Engineering, 1998 (IEEE, 1998), Vol. 2, pp. 874–877.

**30. **F. J. MacWilliams and N. J. A. Sloane, “Pseudorandom
sequences and arrays,” Proc. IEEE **64**(12), 1715‒ 1729
(1976). [CrossRef]

**31. **H. Fredricksen,
“A survey of full length nonlinear shift register cycle
algorithms,” Soc. Industr. Appl. Math. Rev. **24**(2), 195‒ 221
(1982).

**32. **H. Hügli and G. Maïtre,
“Generation and use of color pseudo-random sequences for coding
structured light in active ranging,” Proc. SPIE **1010**, 75‒ 82 (1989).

**33. **T. Monks and J. Carter,
“Improved stripe matching for colour encoded structured
light,” in Computer Analysis of Images and
Patterns (Springer,
1993), pp. 476‒
485.

**34. **T. Pajdla, “Bcrf—binary-coded illumination range finder reimplementation,” Technical Report KUL/ESAT/MI2/9502 (Katholieke Universiteit Leuven, 1995).

**35. **L. Zhang, B. Curless, and S. M. Seitz, “Rapid shape acquisition
using color structured light and multi-pass dynamic programming,”
in First International Symposium on 3D Data Processing Visualization and
Transmission, 2002. Proceedings (IEEE,
2002), pp. 24‒
36.

**36. **J. Le Moigne and A. M. Waxman, “Multi-resolution grid
patterns for building range maps,” in Vision ’85, Applied
Machine Vision Conference (ASME) (Society of
Manufacturing Engineers, 1985), pp.
22‒ 39.

**37. **H. Morita, K. Yajima, and S. Sakata,
“Reconstruction of surfaces of 3-D objects by M-array pattern
projection method,” in Second International Conference on
Computer Vision (IEEE Computer Society,
1988), pp. 468‒
473.

**38. **J. Le Moigne and A. M. Waxman, “Structured light
patterns for robot mobility,” IEEE J. Robot.
Automat. **4**(5), 541‒ 548
(1988). [CrossRef]

**39. **P. Payeur and D. Desjardins, “Structured light stereoscopic imaging with dynamic pseudo-random patterns,” in *Image Analysis and Recognition*, Lecture Notes in Computer Science Vol. 5627 (Springer, 2009), pp. 687–696.

**40. **A. Osman Ulusoy, F. Calakli, and G. Taubin,
“One-shot scanning using De Bruijn spaced grids,”
in 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV
Workshops) (IEEE,
2009), pp. 1786‒
1792.

**41. **P. M. Griffin, L. S. Narasimhan, and S. R. Yee, “Generation of uniquely encoded light patterns for range data acquisition,” Pattern Recog. **25**(6), 609–616 (1992). [CrossRef]

**42. **E. M. Petriu, Z. Sakr, H. J. W. Spoelder, and A. Moica, “Object recognition using pseudo-random color encoded structured light,” in Proceedings of the 17th IEEE Instrumentation and Measurement Technology Conference, 2000. IMTC 2000 (IEEE, 2000), Vol. 3, pp. 1237–1241.

**43. **J. Pagès, J. Salvi, and C. Matabosch, “Robust segmentation and decoding of a grid pattern for structured light,” in *Pattern Recognition and Image Analysis*, Lecture Notes in Computer Science Vol. 2652 (Springer, 2003), pp. 689–696.

**44. **A. Osman Ulusoy, F. Calakli, and G. Taubin,
“Robust one-shot 3D scanning using loopy belief
propagation,” in 2010 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW)
(IEEE Computer Society,
2010), pp. 15‒
22.

**45. **P. Payeur and D. Desjardins, “Structured light stereoscopic imaging with dynamic pseudo-random patterns,” in *Image Analysis and Recognition*, Lecture Notes in Computer Science Vol. 5627 (Springer, 2009), pp. 687–696.

**46. **D. Desjardins and P. Payeur,
“Dense stereo range sensing with marching pseudo-random
patterns,” in Fourth Canadian Conference on Computer and
Robot Vision (CRV ’07) (IEEE Computer
Society, 2007), pp. 216‒
226.

**47. **R. Y. Tsai, “A versatile camera
calibration technique for high accuracy 3D machine vision metrology using
off-the-shelf TV cameras and lenses,” IEEE J. Robotics
Automat. **3**(4), 323‒ 344
(1987). [CrossRef]

**48. **Z. Zhang, “Flexible camera calibration by viewing a plane from unknown orientations,” in Seventh International Conference on Computer Vision (ICCV’99) (IEEE Computer Society, 1999), Vol. 1, p. 666.

**49. **J. Heikkil and O. Silven,
“A four-step camera calibration procedure with implicit image
correction,” in 1997 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition, 1997. Proceedings
(IEEE Computer Society,
1997), pp. 1106‒
1112.

**Jason Geng** received his doctoral degree in Electrical Engineering from the
George Washington University in 1990. Since then, he has led a variety of research,
development, and commercialization efforts on 3D imaging technologies. He has published
more than 90 academic papers and one book, and is an inventor on 22 issued patents. He
received prestigious national honors, including the Tibbetts Award from the U.S. Small
Business Administration and the “Scientist Helping America” award from the Defense
Advanced Research Projects Agency, and was ranked 257 in *INC.*
magazine’s “INC. 500 List.” Dr. Geng currently serves as the vice president for the IEEE
Intelligent Transportation Systems Society (ITSS). He is also leading the Intelligent
Transportation System standard efforts by serving as the chairman of the standard
committee for IEEE ITSS.