Design of zero reference codes using cross-entropy method

Jung-Chieh Chen

doi:10.1364/OE.17.022163

1. Introduction

With the development of microelectronic fabrication and nanotechnology, the demand for a high-resolution grating measurement system is becoming more and more important in these areas. In a grating measurement system, a zero reference signal is necessary to achieve absolute position, an origin of a coordinate, or a machine home position. In principle, the zero reference signals can be generated from the autocorrelation of two identical zero reference codes (ZRCs). However, the considerable computational complexity for the required search through a high-dimensional vector space is a potential problem for the design of ZRCs, which generate optimum zero reference signals, especially for a large ZRC. As a result, the computational complexity of the design of an optimum ZRC limits the size of the ZRC to about 30 elements [1].

To design optimum ZRCs, various methods have been proposed [1]–[7]. Among these methods, the performance of the genetic algorithm (GA) [1] is the best. However, although the GA has the design capabilities for obtaining the size of the ZRCs up to 100 elements, its performance is still not good enough. Moreover, the complexity of the GA is still high. In this paper, we consider using autocorrelation properties to design optimum ZRCs for optical applications based on the Cross-Entropy (CE) method [8]. In the proposed CE method, we first define the secondary maximum of the autocorrelation signal as the score function. The score function is then translated into a stochastic approximation problem that can be solved effectively. The computer simulation results show that compared with the conventional GA, the proposed CE method obtains better performance with low computational complexity.

2. Description and problem definition

The problem definition and assumption of this paper basically follows [1]. Let us consider a ZRC comprised of a group of unequally spaced transparent and opaque slits, where the transparent and opaque regions in the ZRC are assumed integer multiples of the width of a single slit. Mathematically, a ZRC can be described by a binary sequence with length n as follows:

c = [c_{1}, c_{2}, \dots, c_{n}], c_{i} \in {0, 1},

where c_i=1 if a transparent slit is located at the ith position, and c_i=0 elsewhere. In order to calculate the reference signal, we assume a parallel ray beam and neglect the distance between the two ZRCs, as done in [1]. Accordingly, we can use the autocorrelation approximation to design a good optical reference signal. The autocorrelation function provides a measure of how closely the sequence matches a copy of itself as the copy is shifted k units in time. Hence, the autocorrelation of sequence (1) can be expressed as

S_{k} (c) = \sum_{i = 1}^{n - k} c_{i} c_{i + k},

where the maximum value of the autocorrelation signal occurs at the origin S ₀=∑ⁿ _i=1 c ² _i, that is, the number of the transparent slits in the code.

To construct a good zero reference signal, the secondary maximum of the autocorrelation signal, σ=max{S ₁,…,S _n-1}, must be as low as possible. Given a fixed length of the ZRC, n, and the number of transparent slits, n ₁, the objective of the optical reference signal design is to search for an optimal ZRC that minimizes the second maximum of the autocorrelation signal σ. Consequently, we can formulate the optimal ZRC design as a combinatorial optimization problem as

ω^{*} = \arg min_{ω_{q} \in Ω} C_{sel} (ω_{q}),

where

C_{sel} (ω_{q}) = max {S_{1} (ω_{q}), \dots, S_{n - 1} (ω_{q})},

ω ^* denotes the global optimum of the objective function, and ω_q is the indicator of the selected subset of the feasible sequences, which can be defined by

ω_{q} = {I_{i}}_{i = 1}^{n}, I_{i} \in {0, 1}, q = 1, 2, \dots, Q

where i is the index of the binary sequence and the indicator function I_i shows whether a transparent slit is located at the ith position. Moreover, we denote the set of all $Q = ({\overset{n}{n}}_{1})$ possible subsets as Ω={ω ₁,…,ω_Q}, where $(\frac{a}{b})$ denotes the binomial coefficient, a!/[b! (a-b)!].

The simplest idea for the optimum solution of (3) is to perform an exhaustive search through all combinations of $({\overset{n}{n}}_{1})$ to determine the optimal ZRCs. It turns out that the complexity of optimal ZRC problem is $𝒪 (n^{n_{1}})$ , which is too much for practical applications. Therefore, one has to resort to suboptimal solutions such as the GA [1]. Although the GA can obtain a better code, the height of the secondary maximum of the GA has space for improvement. Therefore, in the next section, we will propose a novel implementation of ZRC design based on the CE algorithm.

Before starting to design a ZRC, we have to consider the limitations of a ZRC design, which are as follows: 1) the period that the grating measurement system limits the width of the slits of ZRC; 2) the diameter of the light beam that limits the length of the code; and 3) the lower bound of the number of slits n ₁ that is restricted to the sensitivity of an optoelectronic receiver. In practical application, this electronic requirement establishes a minimum of the signal-to-noise ratio of zero-reference signals. Further, this requirement determines the minimum number of slits n ₁ of ZRC. According to the above-mentioned working requirements, we have n and n ₁ predetermined. In this case, we can make sure the design of ZRC is workable. However, we do not consider other imperfect factors, such as diffraction and signal noise, in the proposed optical system in this study. The focus of our work is to introduce an efficient CE method to minimize the second maximum of the signal under the same scenario of previous works. Accordingly, the issue of imperfect factors is beyond the scope of this paper.

3. Design methodology

The CE method, a principle adaptive importance sampling, was first proposed by Rubinstein [8] to solve rare event estimation problems and was eventually successfully applied to solve both combinatorial and continuous optimization problems. The basic idea behind the CE method is to transform the original optimization problem (3) to an associated stochastic optimization problem, and then to tackle the stochastic problem efficiently using an adaptive sampling algorithm. The general scheme for the CE method involves the following two iterative phases: 1) Generate random samples according to a specified sampling distribution generated from the previous iteration; and 2) Update the parameters on the basis of the best scoring samples in order to produce better scoring samples in the next iteration. For a concrete understanding of the CE method, the reader should refer to [8]. In the following, we employ the CE method to search the optimal ZRC to minimize the second maximum of the autocorrelation signal. The procedure of the proposed CE-based ZRC design is described as follows:

• Step 1) Set the iteration counter t :=1 and initialize probability vector $P^{(0)} = {p_{i}^{(0)}}_{i = l}^{n},$ , $p_{i}^{(0)} = \frac{1}{2}$ , where the probability vector p=[p ₁,…, p_n] is the only parameter needed for the proposed CE method, and p_i indicates the probability of a transparent slit is located at the ith position.

• Step 2) Use the density function f(ω_q,p ^(t-1)) to generate randomsamples ${ω_{q}^{(m)}}_{m = 1}^{M},$ , where M is the total number of samples and each element of ω_q is modeled as an independent Bernoulli random variable with the probability distribution P(I_i(ω_q)=1)=1-P(I_i(ω_q)=0)=p_i, i=1,…,n. The probability distribution is defined as

f (ω_{q}, p) = \prod_{i = 1}^{n} p_{i}^{I_{i} (ω_{q})} {(1 - p_{i})}^{1 - I_{i} (ω_{q})} .

• Step 3) Calculate the fitness function according to (4) to obtain a set of performance values {C_sel(ω.^(m,t) _q)}^M _m=1.

• Step 4) Order the performance values from the smallest to the biggest, C ⁽¹⁾ _sel≤⋯≤C ^(M) _sel, and select the best M^elite elite performance values according to the predetermined quantile parameter r(t)=C ^(⌈ρM⌉) _sel, where ρ denotes the fraction of the best samples and ⌈ρM⌉ is the integer part of ρM.

• Step 5) Calculate the parameter p ^(t) according to

p_{i}^{(t)} = \frac{\sum_{m = 1}^{M} I_{{C_{sel} (ω_{q}^{(m, t)}) \leq r^{(t)}}} I_{i} (ω_{q}^{(m, t)})}{\sum_{m = 1}^{M} I_{{C_{sel} (ω_{q}^{(m, t)}) \leq r^{(t)}}}} .

where $I_{{C_{sel} (ω_{q}^{(m, t)}) \leq r^{(t)}}}$ is an indicated variable defined by

I_{{C_{sel} (ω_{q}^{(m, t)}) \leq r^{(t)}}} = {\begin{matrix} 1, & if C_{sel} (ω_{q}^{(m, t)}) \leq r^{(t)} \\ 0, & otherwise \end{matrix} .

… Step 6) Update the parameter p ^(t) smoothly via

p^{(t)} = λ \times p^{(t)} + (1 - λ) \times p^{(t - 1)},

where λ is the smoothing parameter, with 0<λ≤1. It is obvious that we have the original updating rule for λ=1.

… Step 7) Repeat step 2 to step 6 for t:=t+1 until the stopping criterion is met. Here, the stopping criterion is the predefined number of iterations.

4. Results and discussion

Simulations were conducted to verify the performance improvement of the ZRC introduced by the proposed algorithm. For comparison, we also tested the conventional GA [1], which was run with crossover probability P_c=0.6, mutation probability P_m=0.1, 500 generations, and a population of 100 individuals. It should be noted that the simulation parameters described above for GA is exactly the same as that in [1]. Furthermore, in order to clarify the performance differences among the algorithms, we also numerically evaluated the theoretical lower bound for the second maximum. This lower bound was established by Li [4] and is expressed as

σ \geq \frac{(2 n + 1) - \sqrt{{(2 n + 1)}^{2} - 4 n_{1} (n_{1} - 1)}}{2} .

Regarding the parameters used in the proposed CE algorithm, the smoothing parameter λ is 0.7, M=100, ρ=0.1, and the algorithm is stopped when the iteration number exceeds 500.

Fig. 1. Comparison of the second maximum reached with the GA, the CE method, and the theoretical lower bound. The code length is 200, and n ₁ varies from 1 to 199.

Download Full Size | PDF

Experiment 1:We fix the number of elements, n=200, and determine how the second maximum s of various algorithms are affected by a variable number of slits, n ₁, in the interval from 1 to 199. Fig. 1 shows the value of the second maximum reached using the GA, the CE method, and the theoretical lower bound. It is observed that 1) the height of the second maximum increases when the number of slits increases¹; 2) when the number of slits is small, it is possible for the GA and the CE method to reach the theoretical lower bound; and 3) the proposed algorithm is superior to the GA. In order to make it more easy for readers to distinguish the improvement of the proposed CE method as compared to GA, Fig. 2 shows the improved percentage of CE as compared to GA versus the number of slits n ₁, where the improved percentage with the number of slits n ₁ is defined as

improved percentage (n_{1}) = \frac{σ_{GA} (n_{1}) - σ_{CE} (n_{1})}{σ_{GA} (n_{1})} \times 100 %,

where σ_GA (n ₁) and σ_CE (n ₁) are the value of the second maximum of GA with the number of slits n ₁ and the value of the second maximum of CE with the number of slits n ₁, respectively. As shown in Fig. 2, we can see that the proposed CE method has better improvement in most cases. In a fewer cases, the proposed CE method at least provides the same performance as that of GA.

Next, let us explain the reason for the improvement of the proposed CE method. As compared to conventional evolution algorithms, such as GA, the CE method has two special characteristics. First, instead of maintaining a simple solution candidate in each time step for conventional optimization algorithms, the main idea of the CE method is to maintain a distribution of possible solutions, which makes evolutionary computing with the distribution representation have a better characteristic in terms of solutions diversity. Second, the update rules are very simple. Instead of selection, crossover, and mutation operators in GA, CE adaptively updates this distribution according to Kullback-Leibler distance, that is, cross entropy, between the associated density and the optimal importance sampling density. As a result, one constructs a random sequence of solutions which converges (probabilistically) to the optimal or, at the least, to a reasonable solution.

Fig. 2. The improved percentage of CE as compared to GA versus the number of slits n ₁.

Download Full Size | PDF

Experiment 2: We consider another scenario for large codes with n=1000 elements and n ₁=100. In Fig. 3, we show one of the resulting ZRCs provided by the proposed algorithm for n=1000 elements and n ₁=100. Fig. 4 shows the optimal autocorrelation signal with respect to the relative displacement between the codes. In Fig. 4, the value of the second maximum is 11 after performing the proposed CE method. However, with the same scenario, the value of the second maximum of the GA is 13 [1]. It is obvious that the proposed algorithm outperforms the GA. It should be noted that if no design was done, the value of the second maximum is 15². However, after performing GA and the proposed CE method, the second maximums can be further reduced to 13 and 11, respectively. In this case, there are 13% and 27% reductions in the second maximum value for GA and CE, respectively, as compared to that without design.

Finally, let us consider the complexity issue of the GA and the proposed CE algorithm. Inasmuch as GA and CE are population-based search methods, we may therefore fix the number of samples to determine the suboptimal solutions with low complexity. In this case, the complexity for GA and CE can be expressed in terms of the number of sample. Accordingly, the number of samples for GA and CE is the population size times the maximum number of generations (iterations). Fig. 5 shows the cost function in (4) in terms of the average height of the second maximum as the function of iteration for the simulated parameters mentioned in Experiment 2. When the number of iteration reaches around 350, the cost is not improved anymore, and CE converges close to a global minimum, where the second maximum is 11. This means the CE requires 100×350=35,000 samples to obtain σ=11 for a ZRC with n=1000 elements and n ₁=100. However, as shown in [1], the GA required 100×500=50,000 samples to obtain σ=13 for a ZRC with n=1000 elements and n ₁=100. Therefore, the proposed CE method can offer better ZRC while keeping a low complexity.

Fig. 3. A ZRC found by the proposed CE method with n=1000 and n ₁=100.

Download Full Size | PDF

Fig. 4. Autocorrelation signal generated with an optimum code. The length of the code is 1000, and n ₁=100. The value of the second maximum is 11.

Download Full Size | PDF

Fig. 5. Average height of the second maximum versus iterations.

Download Full Size | PDF

5. Conclusion

This paper presented a CE-based method used to design optimum ZRC. We formulated the design of the ZRC as a particular combination optimization problem. We then applied the CE method to search for a good ZRC to obtain the desirable zero reference signal. Compared with the conventional GA scheme, the simulation results showed that the performance of the proposed CE method not only obtained a better σ but also enjoyed complexity advantages.

Footnotes

¹

¹If the number of slits is less, the second maximum is less. However, if the number of slits is even lesser, the corresponding maximum value of the autocorrelation signal is also less. In this case, the signal registered in the photodiode will be decreased. This is because such signal is the autocorrelation of the transmittances of ZRC. As a result, it is difficult for a photodiode to detect a small signal amplitude since we have to take dark noise and environmental noise into consideration. In general, the lower bound of the number of slits is restricted to the sensitivity of a optoelectronic receiver. In practical application, this electronic requirement establish a minimum of the signal-to-noise ratio of the zero reference signal. This requirement determines the minimum number of slits of ZRC. Based on the above-mentioned requirement, we have n and n ₁ predetermined, and we have to minimize the second maximum of the signal, σ.

²

²If no design was done, we may randomly generate a binary ZRC c=[c ₁,c ₂,…,c_n], where c_i∊{0,1}. In this case, however, we still need an extra operation to fit the problem definition. This is because ZRC c is a binary vector subject to constraint, which is the restricted number of the transparent slits n ₁. Therefore, we need an extra operation to fix the number of 1s in the binary vector c; that is, the restricted number of the transparent slits n ₁. This can be carried out by the restricted search operator [9], which randomly adds or removes the necessary 1s to meet the number of the transparent slits. Assume the number of 1s in c and the restricted number are n_p and n ₁, respectively. If n_p<n ₁, the restricted search operator adds (n ₁-n_p) 1s randomly; and if n_p>n ₁, the restricted search operator randomly selects (n_p-n ₁) 1s and removes them from the binary vector c. After performing the restricted search operation, a randomly generated ZRC will meet the problem constraint. However, the secondary maximum of the autocorrelation of a randomly generated ZRC with extra operation is almost equal to 15 according to our numerical experiments.

References and links

1. J. Saez-Landete, S. Salcedo-Sanz, M. Rosa-Zurera, J. Eusebio, and A. Bernabeu, “Optimal design of optical reference signals using a genetic algorithm,” Opt. Lett. 30, 2734–2736 ( 2005). [CrossRef]

2. X. Yang and C. Yin, “A new method for the design of zero reference marks for grating measurement systems,” J. Phys. E, Sci. Instrum. 19, 34–37 ( 1986). [CrossRef]

3. Y. Li, “Autocorrelation function of a bar code system,” J. Mod. Opt. 34, 1571–1575 ( 1987). [CrossRef]

4. Y. Li, “Optical valve using bar codes,” Optik 79, 67–74 ( 1988).

5. Y. Li, “Characterization and design of bar code systems for accurate alignment,” Appl. Opt. 27, 2612–2620 ( 1988). [CrossRef] [PubMed]

6. Y. Li and F. T. S. Yu, “Design of bar code systems for accurate alignment: a new method,” Appl. Opt. 29, 723–725 ( 1990). [CrossRef] [PubMed]

7. J. Saez-Landete, J. Alonso, and E. Bernabeu, “Design of zero reference codes by means of a global optimization method,” Opt. Express , 13, 195–201 ( 2005). [CrossRef] [PubMed]

8. R. Y. Rubinstein and D. P. Kroese, The Cross-Entropy Method. (Berlin, Germany, Springer, 2004).

9. S. Salcedo-Sanz, G. Camps-Valls, F. Perez-Cruz, J. Sepulveda-Sanchis, and C. Bousono-Calzon, “Enhancing genetic feature selection through restricted search and Walsh analysis,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev. 34, 398–406 ( 2004). [CrossRef]

Design of zero reference codes using cross-entropy method

Abstract

1. Introduction

2. Description and problem definition

3. Design methodology

4. Results and discussion

5. Conclusion

Footnotes

References and links

Cited By

Figures (5)

Equations (11)

Optics Express