## Abstract

A channel model of the volume holographic correlator (VHC) is proposed and demonstrated to improve the accuracy in the scene matching application with the multi-sample parallel estimation (MPE) algorithm. A quantity related to the space-bandwidth product is used to describe the recognition ability in the scene matching system by MPE. A curve is given to optimize the number of samples with the required recognition accuracy. The theoretical simulation and the experimental results show the validity of the channel model. The proposed model provides essential theoretical predictions and implementation guidelines for using the multi-sample parallel estimation method to achieve the highest accuracy.

© 2011 OSA

## 1. Introduction

Built on high-density holographic storage technology, volume holographic correlators (VHCs) offer high speed, high parallelism, and multichannel processing [1–3]. The VHC has many potential applications, especially those that require high-speed, real-time, large-capacity correlation calculations [4,5]. One important application is fast scene matching.

Scene matching, which locates a target image within a reference remote sensing image [6,7], is widely used in areas such as space exploration, guided cruise, and target tracking [8]. In scene matching, the correlations between the target image and all of the template images are calculated, and the template image with the maximum inner product value is taken as the match for the target image. On a conventional computer this process is serial, requires high-power microprocessors, and is very time consuming, which conflicts with the real-time, high-speed working conditions of scene matching. The multichannel VHC, by contrast, extracts the inner product values between the target and all of the template images in parallel, and the template image corresponding to the brightest correlation spot, i.e., the greatest inner product, is taken as the match. Because this parallel process can be carried out by the VHC instantly, the VHC is well suited to fast scene matching.

Previously, the multi-sample parallel estimation (MPE) method was proposed to improve the recognition accuracy of scene matching with the VHC [1]. MPE uses the multiple correlation spots around the brightest spot to estimate the location of the target image. The method takes advantage of the VHC's high speed, high parallelism, and multichannel processing capability, as well as the stationary random characteristics of the remote sensing image. Compared with using only the brightest spot, the method uses the information in more correlation spots to determine the location of the target image and therefore improves the accuracy of scene matching to the pixel level.

To optimize the recognition accuracy, we further analyze the parameters in MPE scene matching. According to the stationary random characteristics of remote sensing images, the correlation function *R* of the reference image can be expressed as [9,10]

$$R(x,y)=a\,\mathrm{exp}(-\alpha |x|-\beta |y|)+b,\qquad (1)$$

where *a* and *b* are constants determined by the variance and mean of the image grayscale; *α* and *β* are also constants, whose reciprocals are called the correlation lengths and are determined by the horizontal and vertical spatial grayscale distributions; and *x* and *y* are the horizontal and vertical coordinate differences between the target image and the template images, respectively.
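As a minimal illustration of the correlation model, the Python sketch below evaluates an exponential correlation function of the form $a\,\mathrm{exp}(-\alpha|x|-\beta|y|)+b$. This form is an assumption inferred from the parameter definitions above, the function name `correlation_model` is hypothetical, and the parameter values are those reported for the reference image in Section 5.

```python
import math

def correlation_model(x, y, a, b, alpha, beta):
    """Assumed exponential correlation model: a*exp(-alpha*|x| - beta*|y|) + b."""
    return a * math.exp(-alpha * abs(x) - beta * abs(y)) + b

# Parameters reported for the reference image in Section 5.
a, b, alpha, beta = 0.453, 0.491, 0.122, 0.160

# Zero offset between target and template: the model reduces to a + b.
peak = correlation_model(0, 0, a, b, alpha, beta)

# One horizontal correlation length (1/alpha) away: the a-term falls to a/e.
one_length = correlation_model(1 / alpha, 0, a, b, alpha, beta)

print(f"R(0, 0)       = {peak:.3f}")
print(f"R(1/alpha, 0) = {one_length:.3f}")
```

Under this assumed form, the correlation lengths set how quickly the correlation falls off with the offset between target and template.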

To use the MPE method, the following steps are necessary: image preprocessing, template image preparation, and establishment of the estimation equations. Image preprocessing can be used to adjust the correlation lengths (1/*α*, 1/*β*) of the image and to reduce the redundant correlation between the target image and the template images. Template image preparation is mainly based on the choice of the segmentation intervals (*t*_{x}, *t*_{y}). In this process, the reference image is divided vertically and horizontally into a set of template images: each template has the same number of pixels as the target image, and the templates are separated by the same horizontal and vertical segmentation intervals, as shown in Fig. 1. The segmentation intervals (*t*_{x}, *t*_{y}) are chosen to allow overlap between different template images. The number of estimation equations is determined by the number of sample correlation spots, and more samples can help to improve the recognition accuracy. Thus the correlation length, the segmentation interval, and the sample number are the main factors in the MPE method.

Although the MPE method can improve the recognition accuracy, some problems remain. The three main factors all contribute to the recognition accuracy, but the extent of each contribution is unknown, the relationship among them is unclear, and their combination is not well understood. An inappropriate combination usually results in low recognition accuracy and makes MPE less efficient. To ensure that the MPE method works well, these three parameters must be optimized. In this paper, we analyze the channel model of the VHC for the scene matching application, propose the correlation space-bandwidth product to optimize the above-mentioned factors, and provide essential theoretical predictions and implementation guidelines for the MPE method to achieve the best performance.
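The template preparation step described earlier in this section can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `template_origins` and `cut_template` are hypothetical, the reference image is a small synthetic array rather than a remote sensing image, and templates are assumed to lie on a regular grid with steps (*t*_{x}, *t*_{y}).

```python
import numpy as np

def template_origins(ref_shape, tmpl_shape, t_x, t_y):
    """Top-left (row, col) corners of templates cut from the reference image
    on a regular grid with horizontal step t_x and vertical step t_y (pixels)."""
    ref_h, ref_w = ref_shape
    tmpl_h, tmpl_w = tmpl_shape
    rows = range(0, ref_h - tmpl_h + 1, t_y)
    cols = range(0, ref_w - tmpl_w + 1, t_x)
    return [(r, c) for r in rows for c in cols]

def cut_template(reference, origin, tmpl_shape):
    """Extract one template of size tmpl_shape starting at origin."""
    r, c = origin
    h, w = tmpl_shape
    return reference[r:r + h, c:c + w]

# Small synthetic reference image (hypothetical sizes, for illustration only).
rng = np.random.default_rng(0)
reference = rng.random((60, 80))
tmpl_shape = (48, 64)

origins = template_origins(reference.shape, tmpl_shape, t_x=3, t_y=3)
templates = [cut_template(reference, o, tmpl_shape) for o in origins]

print(len(templates), "templates of shape", templates[0].shape)
```

Because the step is much smaller than the template size, neighboring templates overlap heavily, which is what gives several correlation spots around the brightest one.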

## 2. Correlation space-bandwidth product

The VHC scene matching system can be treated as a channel system. In an optical channel system, one fundamental parameter is its space-bandwidth product (SBP) related to its information capacity. The SBP is defined as the product of the spatial area and the spatial-frequency bandwidth. The space-bandwidth product of a system is a measure of its complexity [11,12]. The ability of an optical system to accurately handle inputs and outputs relies on its space-bandwidth product. Therefore, SBP is a measure of the system performance and is directly related to the quality of the system.

In a scene matching system, only the template images within the correlation area 4/(*αβ*) have significant correlations with each other. As shown in Fig. 2, the correlation lengths (1/*α*, 1/*β*) describe the characteristics of the remote sensing image, and the segmentation intervals (*t*_{x}, *t*_{y}) are limited by the capacity of the VHC. If we regard (1/*t*_{x}, 1/*t*_{y}) as the sampling frequencies, then, by analogy with the SBP in the sampling theorem, we can define the correlation space-bandwidth product (CSBP) as the product of the spatial correlation area and the sampling frequency bandwidth, which is

$$M=\frac{16}{\alpha \beta {t}_{x}{t}_{y}}.\qquad (2)$$

The CSBP describes the recognition ability of the VHC scene matching system and can help to optimize the parameters used in the MPE method. It is also a measure of the performance of the system and is related to its recognition accuracy. In the following sections, the channel model is deduced and the relationship between the recognition accuracy and the CSBP is derived.
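A quick numerical check of the CSBP is sketched below. It assumes the form $M = 16/(\alpha\beta t_x t_y)$, which is the form implied by the relation $p=\alpha t=4/\sqrt{M}$ used in Section 4; the function name `csbp` is hypothetical, and the inputs are the experimental values reported in Section 5.

```python
def csbp(alpha, beta, t_x, t_y):
    """Correlation space-bandwidth product, assuming M = 16 / (alpha*beta*t_x*t_y).

    The factor 16 is inferred from p = alpha * t = 4 / sqrt(M) (Section 4),
    not taken from a displayed equation.
    """
    return 16.0 / (alpha * beta * t_x * t_y)

# Experimental values from Section 5: alpha = 0.122, beta = 0.160, t_x = t_y = 3 pixels.
M = csbp(alpha=0.122, beta=0.160, t_x=3, t_y=3)
print(f"M = {M:.1f}")
```

With these inputs the result rounds to 91, matching the CSBP quoted for the experiment.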

## 3. Channel analysis of the VHC for the scene matching

The purpose of establishing the channel model is to derive expressions for the horizontal and vertical recognition errors ${\sigma}_{x}$ and ${\sigma}_{y}$ in terms of the system parameters. Generally, the smaller the recognition error, the higher the recognition accuracy. Experimental measurements show that it is reasonable to use a Gaussian distribution with standard deviation ${\sigma}_{\text{VHC}}$ to describe the error in measuring the correlation intensity with the VHC [13–15]. The differentials *dx* and *dy* can be calculated from Eq. (1). Replacing *dx*, *dy*, and *d*(*f*(*x*,*y*)) with ${\sigma}_{x}$, ${\sigma}_{y}$, and ${\sigma}_{\text{VHC}}$, respectively, the recognition error can be expressed as

According to signal estimation theory [16], if the noise of a signal follows a Gaussian distribution, so does the estimation result, regardless of the estimation method. In this paper, we use the Least Squares Method to estimate the location of the target image. We first derive the recognition error for the simple one-dimensional two-sample system and then for the more complicated two-dimensional multi-sample system.

#### 3.1 One-dimensional two-sample system

Suppose there are two sample spots with intensities *z*_{1} and *z*_{2}, i.e., the target is correlated with two templates to certain degrees, and the segmentation interval is *t*, as shown in Fig. 3(a). According to the Least Squares Estimation, the condition for the residual error *S* to be the smallest is:

*dx* is given by

As is known, a linear superposition of normally distributed random variables is itself normally distributed. For example, if a variable *w* is related to the other variables *w*_{1}, *w*_{2}, …, *w*_{n} as follows:

Suppose the noise in the sample intensities *z*_{1} and *z*_{2} is independent and identically distributed. Then their standard deviations *σ*_{1} and *σ*_{2} satisfy ${\sigma}_{1}={\sigma}_{2}={\sigma}_{\text{VHC}}$, and according to Eq. (7), we have

*x* = *t*/2.

The closer the target is to the template image, the higher the accuracy is. The highest accuracy is achieved with a minimum *σ _{x}*.

When the location of the target is in the middle of two sample templates, the lowest accuracy occurs, and the maximum *σ _{x}* can be written as:

This value is $1/\sqrt{2}$ of that given by Eq. (3), indicating that the recognition error is reduced by using two samples. The value $\mathrm{max}({\sigma}_{x})$ can be regarded as the recognition error of the whole system, as shown in Fig. 3(a).
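The $1/\sqrt{2}$ reduction for two samples can be checked with a small Monte Carlo experiment. The sketch below is illustrative only and rests on several assumptions: the exponential correlation model $z = a\,\mathrm{exp}(-\alpha|x|)+b$ in one dimension, Gaussian intensity noise with standard deviation ${\sigma}_{\text{VHC}}$, and a least-squares estimator applied to the log-linearized model. The estimator and all variable names are choices made for this example, not necessarily the authors' exact derivation.

```python
import numpy as np

# Model and noise parameters (values reported in Section 5).
a, b, alpha = 0.453, 0.491, 0.122
t = 3.0                      # segmentation interval in pixels
sigma_vhc = 0.08 / 3         # intensity noise, from 3*sigma_VHC = 0.08
x_true = t / 2               # worst case: target midway between two templates

rng = np.random.default_rng(1)
n_trials = 200_000

# Noisy correlation intensities for the templates at distance x and t - x.
z1 = a * np.exp(-alpha * x_true) + b + rng.normal(0.0, sigma_vhc, n_trials)
z2 = a * np.exp(-alpha * (t - x_true)) + b + rng.normal(0.0, sigma_vhc, n_trials)

# Single-sample estimate: invert the model for the first template only.
x_one = -np.log((z1 - b) / a) / alpha

# Two-sample estimate: least squares on ln(z_i - b), which gives
# x = t/2 - [ln(z1 - b) - ln(z2 - b)] / (2 * alpha).
x_two = t / 2 - (np.log(z1 - b) - np.log(z2 - b)) / (2 * alpha)

ratio = x_two.std() / x_one.std()

# First-order (linearized) prediction for the two-sample error at x = t/2.
predicted_two = sigma_vhc * np.exp(alpha * t / 2) / (np.sqrt(2) * a * alpha)

print(f"std ratio (two / one sample): {ratio:.3f}")
print(f"two-sample std: {x_two.std():.3f} px, linearized prediction: {predicted_two:.3f} px")
```

In this sketch the ratio of the two standard deviations comes out close to $1/\sqrt{2}$, since the two-sample estimate combines two independent noise terms with equal weight.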

#### 3.2 Two-dimensional multi-sample system

Suppose the segmentation intervals are *t*_{x} and *t*_{y}, respectively. In a system with four sample correlation spots, the worst situation for recognition is when the target is in the center of the template images (*x* = *t*_{x}/2, *y* = *t*_{y}/2), as shown in Fig. 3(b). Therefore, the error in this worst-case scenario can be defined as the error of the system. According to Eq. (10), for the two-dimensional four-sample system, we have

By using Eq. (11), the error of the system is

To analyze the recognition accuracy under the two-dimensional multi-sample condition, for convenience, we assume that the sample number is *n* × *n* (*n* is even), that *t*_{x} = *t*_{y} = *t*, and that the target image is in the center of the system. According to the conclusion above and the symmetry of the system, the coordinate system shown in Fig. 3(c) shows that the recognition error of the *n* × *n* sample spots is half of that of the *N* = (*n*/2) × (*n*/2) sample spots in the first quadrant. Therefore, it is sufficient to analyze the recognition accuracy of the *N* sample spots in the first quadrant. Because these spots are in the same quadrant, the absolute-value signs in Eq. (1) can be removed. Suppose the sample correlation intensities are *z*_{1}, *z*_{2}, …, *z*_{N} and the standard deviations of their errors satisfy ${\sigma}_{{z}_{1}}$ = ${\sigma}_{{z}_{2}}$ = … = ${\sigma}_{{z}_{N}}$ = ${\sigma}_{\text{VHC}}$. For these *N* sample spots, according to the Least Squares Estimation and the sum of the normal distributions, we have

For the sample number *n* × *n*, the error is half of that of the *N* sample spots, that is

## 4. Simulations based on the channel model

According to Eq. (17), the recognition accuracy can be improved by increasing the sample number (*n*), decreasing the segmentation interval (*t*), optimizing the image preprocessing (*α*, *β*), and improving the accuracy of the VHC (${\sigma}_{\text{VHC}}$). The accuracy of the VHC is determined by the VHC system and is a fixed value. The other three factors all affect the recognition accuracy of the system, and further simulation is necessary to clarify their relationship. Generally, *α* ≈ *β*. To simplify the expressions, we introduce two new parameters, *w* = ${\sigma}_{x}\text{/}{\sigma}_{\text{VHC}}$ and $p=\alpha t=4/\sqrt{M}$. Since the horizontal and vertical errors have a fixed relationship, the simulation is shown only for the horizontal error, which can be written as

When the sample number approaches infinity, Eq. (18) can be written as

Because the parameters *a* and *t* contribute to *w* only as scaling factors, while the contributions of *n* and *p* are more complex, we fix *a* and *t* (*a* = 0.5, *t* = 1) and analyze *w* as a function of *n* and *p*, as shown in Fig. 4(a). For each given *p*, *w* decreases as *n* increases, but not indefinitely. This agrees with the intuition that using more samples increases the recognition accuracy. However, for large values of *p*, the impact of *n* on *w* is less significant.

On the other hand, it is important to notice that for each given *n* there exists a *p* that minimizes the error *w*. When *n* changes from 2 to 20 (*n* being the square root of the sample number), the minimum error *w* and the corresponding *p* are as shown in Table 1. According to Table 1 and Eq. (19), the blue line and the black line are drawn in Fig. 4(b). The blue line is called the optimization curve: each point on it indicates the highest achievable accuracy for a given sample number. The black curve is the limit as the sample number approaches infinity, that is, the upper bound on the achievable accuracy for a given system parameter *p*. As *p* decreases, the optimization curve approaches the upper bound curve, indicating that the optimized accuracy has nearly reached the limit of the system. For a given tolerable recognition error *w* specified by the user, the parameters on the optimization curve represent the system that requires the minimum number of sample points and therefore the least post-processing. In other words, the optimization curve gives the best accuracy achievable for a given value of *n* and is our optimization goal.

A useful expression of the optimization line (blue line) can be obtained by a curve fitting, which leads to

and the relationship between *n* and *p* can also be found as

Thus the relationship between the recognition error (*w*) and the CSBP (*M*) is derived. If the tolerable recognition error (*w*) is given, the required system CSBP (*M*) can be calculated using Eq. (20), and the minimum sample number (*n*^{2}) can be determined using Eq. (21). For example, according to Table 1, to achieve a required accuracy corresponding to a recognition error of 2.28${\sigma}_{\text{VHC}}$, the minimum sample number should be about 100–144 (*n* = 10–12) and the system parameter *p* should be about 0.35–0.39 (*M* = 105–131).

Based on the above results, the guidelines for choosing the parameters can be described as follows: (1) Segmentation interval (*t*_{x}, *t*_{y}) determination. The interval is determined according to the storage capacity of the VHC. (2) CSBP (*M*) determination. The error ratio *w* is derived from the recognition accuracy required by the user and the accuracy of the VHC, and the CSBP is then derived from Eq. (20). (3) Correlation length (1/*α*, 1/*β*) determination. The correlation length is determined from the segmentation interval and the CSBP using Eq. (2), and appropriate preprocessing of the target image should be performed to match this correlation length. (4) Sample number determination. According to Eq. (21) and Fig. 4(b), choose the minimum number of samples (on the blue line) that achieves the required system performance. The accuracy can be further improved by increasing the sample number, within the limit of the black line.
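Step (3) of the guidelines can be sketched numerically. The snippet below assumes *α* ≈ *β* and *t*_{x} = *t*_{y} = *t*, so that $p=\alpha t=4/\sqrt{M}$ can be inverted for *α*; the function name `required_alpha` is hypothetical, and the inputs are the experimental values from Section 5.

```python
import math

def required_alpha(M, t):
    """Decay constant alpha (1/pixels) for a target CSBP M and segmentation
    interval t, from p = alpha * t = 4 / sqrt(M), assuming alpha ~ beta
    and t_x = t_y = t."""
    return 4.0 / (t * math.sqrt(M))

# Experimental operating point from Section 5: M about 91, t = 3 pixels.
alpha_req = required_alpha(M=91, t=3)
correlation_length = 1.0 / alpha_req

# For comparison: geometric mean of the reported alpha and beta.
alpha_geo = math.sqrt(0.122 * 0.160)

print(f"required alpha     = {alpha_req:.4f} per pixel")
print(f"correlation length = {correlation_length:.2f} pixels")
print(f"sqrt(alpha*beta)   = {alpha_geo:.4f} per pixel")
```

The required value agrees closely with the geometric mean of the reported *α* and *β*, which is what the symmetric approximation *α* ≈ *β* would suggest.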

## 5. Experimental results

Figure 5 shows the experimental setup, in which the light source is a diode-pumped solid-state laser (DPSSL, *λ* = 532 nm). The holograms are stored in an Fe:LiNbO_{3} crystal using angle fractal multiplexing, and a CCD camera (MINTRON MTV-1881EX) is used to detect the correlation spots. The thickness of the recording medium is 15 mm, and the thickness of the volume grating in the recording medium is about 6 mm.

The speckle modulation and interleaving methods were proposed in our previous work to improve the accuracy of the VHC [17,18]. A diffuser is placed behind the SLM, and image interleaving preprocessing is applied to the stored images and the target images. The value of σ_{VHC} can be measured prior to the experiment as follows. Some testing images are stored in the VHC. Another testing image is then input into the VHC *N* times, where *N* should be larger than 1000. The *N* correlation values are acquired, and a Gaussian distribution curve of the *N* values is obtained. The parameter σ_{VHC} is determined by fitting the Gaussian curve. In this experiment, the normalized error 3σ_{VHC} of the VHC is equal to 0.08.
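The calibration procedure above can be sketched in a few lines. This is a minimal sketch with synthetic data standing in for measured correlation values; it uses the sample mean and sample standard deviation (the maximum-likelihood Gaussian fit) rather than a fit to a histogram, and all variable names and the synthetic mean of 0.5 are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for N repeated, normalized correlation readings of the
# same test image. In a real calibration these would come from the CCD.
rng = np.random.default_rng(42)
true_sigma = 0.08 / 3        # chosen so that 3*sigma matches the reported 0.08
N = 2000                     # the text recommends N > 1000
readings = rng.normal(loc=0.5, scale=true_sigma, size=N)

# Maximum-likelihood Gaussian parameters: sample mean and standard deviation.
mu_hat = readings.mean()
sigma_hat = readings.std(ddof=1)

print(f"estimated mean        = {mu_hat:.4f}")
print(f"estimated 3*sigma_VHC = {3 * sigma_hat:.4f}")
```

A histogram-based curve fit, as described in the text, should give a similar value when *N* is large; the sample standard deviation is used here only because it needs no binning choices.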

The experimental results of the scene matching are shown in Fig. 6. A remote sensing reference image of 2492 × 1677 pixels, shown in Fig. 6(a), is used to test the channel model, and the template image has a size of 640 × 480 pixels. The parameters in the estimation function, Eq. (1), are derived from the autocorrelation of the reference image: *α* = 0.122, *β* = 0.160, *a* = 0.453, and *b* = 0.491. With the segmentation intervals *t*_{x} and *t*_{y} both chosen to be 3 pixels, the template images are stored in the VHC. Thus the CSBP (*M*) is about 91. Then 400 different target images are input into the VHC, and the coordinates of the targets are computed using the MPE method. The correlation spots detected by the CCD when a white image is input into the VHC are shown in Fig. 6(b), and those detected when a target image is input are shown in Fig. 6(c). The horizontal and vertical errors obtained with 16 sample correlation spots are shown in Fig. 7. The ratio between the horizontal error and the vertical error is determined by the ratio between the horizontal and vertical correlation lengths. According to Eq. (17), the theoretical results are *w* = 9.88, 3${\sigma}_{x}$ = 0.79 and 3${\sigma}_{y}$ = 0.60. As seen from Fig. 7, the experimental results are 3${\sigma}_{x}$ = 0.8 and 3${\sigma}_{y}$ = 0.6, in good agreement with the theoretical values. In statistical theory [16], the probability of finding a value between −3*σ* and 3*σ* is 99.7%, so the value 3*σ* can be regarded as the error of the VHC. Thus, the recognition error is about $\pm 0.8$ pixel, less than 1 pixel.
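The quoted theoretical values can be cross-checked with simple arithmetic. The sketch below takes *w* = 9.88 as given (it is not recomputed from Eq. (17)) and assumes that the vertical error scales from the horizontal one by the ratio of the correlation lengths, i.e. ${\sigma}_{y}={\sigma}_{x}\,\alpha/\beta$, which is one reading of the statement above about the error ratio.

```python
# Reported values from this section.
w = 9.88                  # theoretical error ratio sigma_x / sigma_VHC
three_sigma_vhc = 0.08    # normalized 3*sigma error of the VHC
alpha, beta = 0.122, 0.160

# Horizontal error: sigma_x = w * sigma_VHC.
three_sigma_x = w * three_sigma_vhc

# Vertical error, assuming sigma_x / sigma_y = (1/alpha) / (1/beta) = beta / alpha.
three_sigma_y = three_sigma_x * alpha / beta

print(f"3*sigma_x = {three_sigma_x:.2f} pixels")
print(f"3*sigma_y = {three_sigma_y:.2f} pixels")
```

Both values round to the figures quoted in the text, 0.79 and 0.60 pixels.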

As shown in Fig. 4, the recognition error decreases as the sample number increases, but not indefinitely. As the CSBP decreases (*p* increases), the impact of the sample number on the recognition error becomes less significant. In further experiments, we adjusted *p* by changing the segmentation intervals to verify this conclusion. The segmentation intervals were set to 1 pixel, 3 pixels, and 5 pixels, respectively. Using the MPE method, sample numbers of 16, 36, 64, and 100 were used to estimate the location of the target. To compare the results from the three segmentation intervals, the error was normalized by the corresponding segmentation interval. The results are shown in Table 2 and Fig. 8.

As shown in Fig. 8, the error decreases greatly with increasing *n* when the value of *p* is low (*p* = 0.12, 1 pixel). When *p* increases to 0.36 (3 pixels), the accuracy no longer improves significantly with increasing *n*. When *p* reaches about 0.60 (5 pixels), the accuracy hardly improves at all as the sample number increases. These results agree well with the theoretical analysis in Fig. 4.

As demonstrated by the theoretical simulations and the experimental results, the CSBP (*M*, or equivalently *p*) is a key parameter in the channel model of the VHC scene matching application; *M* is therefore a good measure of the recognition ability of scene matching with the multichannel VHC and the MPE algorithm. The system parameter *M* is related to the ratio between the correlation length and the segmentation interval. The correlation length describes the characteristics of the remote sensing reference image, and the segmentation interval is determined by the volume holographic storage capacity. The CSBP thus combines the characteristics of the VHC with those of the remote sensing image and determines the sample number that results in the highest accuracy of the system. According to the theoretical prediction, the best recognition accuracy of our experimental system can be as high as 0.2 pixels.

## 6. Conclusion

A channel model of the VHC for scene matching has been analyzed, and an optimization equation for the parameters used in the MPE method has been provided. The CSBP introduced in this paper expresses the recognition ability of the scene matching system with MPE, and an optimization curve for achieving the best recognition accuracy is derived. These results provide the essential theoretical predictions and implementation guidelines for using the MPE method. The experiments show that the recognition error can reach $\pm 0.8$ pixels. However, VHC scene matching is a complex system, and further research is needed to address other practical issues, such as operation in a real-time environment and target and reference images acquired from different sources.

## Acknowledgment

This work is supported by the National Basic Research Program of China (2009CB724007), the National High-Tech R&D Program (863 Program) (2009AA01Z112), and the National Natural Science Foundation of China (60807005).

## References and links

1. S. L. Wang, Q. F. Tan, L. C. Cao, Q. S. He, and G. F. Jin, “Multi-sample parallel estimation in volume holographic correlator for remote sensing image recognition,” Opt. Express **17**(24), 21738–21747 (2009).

2. G. W. Burr, F. H. Mok, and D. Psaltis, “Large-scale volume holographic storage in the long interaction length architecture,” Proc. SPIE **2297**, 402–414 (1994).

3. Y. Takashima and L. Hesselink, “Media tilt tolerance of bit-based and page-based holographic storage systems,” Opt. Lett. **31**(10), 1513–1515 (2006).

4. J. Joseph, A. Bhagatji, and K. Singh, “Content-addressable holographic data storage system for invariant pattern recognition of gray-scale images,” Appl. Opt. **49**(3), 471–478 (2010).

5. E. Watanabe, A. Naito, and K. Kodate, “Ultrahigh-speed compact optical correlation system using holographic disc,” Proc. SPIE **7442**, 1–8 (2010).

6. A. Heifetz, J. T. Shen, J. K. Lee, R. Tripathi, and M. S. Shahriar, “Translation-invariant object recognition system using an optical correlator and a superparallel holographic random access memory,” Opt. Eng. **45**(2), 025201 (2006).

7. J. Capon, “A probabilistic model for run-length coding of pictures,” IEEE Trans. Inf. Theory **5**(4), 157–163 (1959).

8. F. Saitoh, “Image template matching based on edge-spin correlation,” Electr. Eng. **153**, 1592–1596 (2005).

9. S. D. Wei and S. H. Lai, “Robust and efficient image alignment based on relative gradient matching,” IEEE Trans. Image Process. **15**(10), 2936–2943 (2006).

10. T. S. Huang, “PCM picture transmission,” IEEE Spectr. **2**, 57–63 (1965).

11. M. A. Neifeld, “Information, resolution, and space-bandwidth product,” Opt. Lett. **23**(18), 1477–1479 (1998).

12. J. W. Goodman, *Introduction to Fourier Optics* (McGraw-Hill, 1966).

13. P. M. Lundquist, C. Poga, R. G. Devoe, Y. Jia, W. E. Moerner, M.-P. Bernal, H. Coufal, R. K. Grygier, J. A. Hoffnagle, C. M. Jefferson, R. M. Macfarlane, R. M. Shelby, and G. T. Sincerbox, “Holographic digital data storage in a photorefractive polymer,” Opt. Lett. **21**(12), 890–892 (1996).

14. M. R. Vant, R. W. Herring, and E. Shaw, “Digital processing techniques for satellite-borne SAR,” Can. J. Rem. Sens. **5**, 67 (1979).

15. M.-P. Bernal, H. Coufal, R. K. Grygier, J. A. Hoffnagle, C. M. Jefferson, R. M. Macfarlane, R. M. Shelby, G. T. Sincerbox, P. Wimmer, and G. Wittmann, “A precision tester for studies of holographic optical storage materials and recording physics,” Appl. Opt. **35**(14), 2360–2374 (1996).

16. A. H. Jazwinski, *Stochastic Processes and Filtering Theory* (Academic Press, 1970).

17. C. Ouyang, L. C. Cao, Q. S. He, Y. Liao, M. X. Wu, and G. F. Jin, “Sidelobe suppression in volume holographic optical correlators by use of speckle modulation,” Opt. Lett. **28**(20), 1972–1974 (2003).

18. K. Ni, Z. Y. Qu, L. C. Cao, P. Su, Q. S. He, and G. F. Jin, “Improving accuracy of multichannel volume holographic correlators by using a two-dimensional interleaving method,” Opt. Lett. **32**(20), 2973–2974 (2007).