Optica Publishing Group

Dual-pixel tracking of the fast-moving target based on window complementary modulation

Open Access

Abstract

Real-time tracking of fast-moving targets is used in various fields. However, the tracking performance of image-based systems for fast-moving targets is still limited by their huge data throughput and computational load. In this study, an image-free target tracking system utilizing a digital micromirror device (DMD) is proposed. The system combines dual-pixel measurement with window complementary modulation, and an alternating interpolation Kalman filter is implemented to fully exploit the DMD and maximize the update rate of the system. Experimental results show that the accuracy of the proposed system at the maximum update rate of 22.2 kHz reaches 0.1 pixels. By evaluating the system when tracking fast-moving targets with different maximal velocities, we further demonstrate experimentally that the accuracy remains within 0.3 pixels at 22.2 kHz for a maximal velocity of 2 × 10⁴ pixel/s.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Real-time tracking of fast-moving targets has many applications in biomedicine [1–3], remote sensing [4,5], and industrial measurement [6,7]. Image-based systems [8–11] have been widely used in target tracking with the development of image sensor technology. In image-based systems, sequential images of the target are first captured by image sensors, and then image processing and analysis algorithms are employed to extract the position of the target from these images. However, the tracking performance of image-based systems for fast-moving targets is still limited by the fact that high temporal resolution and high spatial resolution cannot be satisfied simultaneously. To achieve fast-moving target tracking with high temporal and spatial resolution, the target tracking system needs the ability to acquire and process data efficiently to obtain an accurate target trajectory.

For the tracking of fast-moving targets, high-performance high-speed cameras have been designed to capture high-resolution images at high frame rates. However, they are typically accompanied by huge data throughput for storage and transmission, and the computational cost of image processing and analysis algorithms is typically prohibitive. Moreover, the short exposure time of high-speed cameras also results in a low signal-to-noise ratio and reduces tracking accuracy. To solve these problems, some researchers utilize the window readout mode of CMOS sensors to detect only a selected region and reduce the subsequent computation [8,9]. Besides, the electronic rolling shutter (ERS) imaging technique of the image sensor is also implemented to achieve a high frame update rate [10,11]. Single-pixel imaging [12–17] has been proposed to compress the data amount and track the moving target from successive reconstructed images. However, the frame rate of this method is limited by the modulation frequency of the spatial light modulator and the number of coding masks used. Meanwhile, though the amount of data produced by single-pixel imaging is small, the computation used to reconstruct targets still limits the spatial and temporal resolution.

Tracking moving targets does not necessarily require capturing successive images; it can instead be based on extracted features of targets such as corner points, line segments, texture, centroid, and so on. Feature-based tracking methods require much less data storage and computation than image-based methods, making them more suitable for real-time tracking of fast-moving targets. Some researchers have proposed image-free methods to obtain the trajectory of targets without reconstructing images [18–25]. Similar to single-pixel imaging, the frame rate of image-free methods is also limited by the modulation frequency of the spatial light modulator and the number of coding masks used, but fewer coding masks are needed and the frame rate is higher. Shi et al. used the Hadamard basis to modulate the spatial information of the target by transforming 2D images into 1D projection curves, resulting in an update rate of ∼177 Hz [18]. Zhang et al. used 6 Fourier basis coding masks to modulate the spatial information of the target and a single-pixel detector for tracking to achieve an update rate of 1666 Hz [19]. Further, Zha et al. proposed a method using only 3 coding masks to obtain the centroid of the target, so that the update rate of the system reaches 7400 Hz [25].

In this study, we present a dual-pixel image-free target tracking system based on window complementary modulation that achieves real-time tracking of the target centroid with an ultra-high update rate. A digital micromirror device (DMD) is used as the spatial light modulator (SLM). Due to the complementary nature of the DMD and the application of the alternating interpolation Kalman filter, the update rate of the proposed target tracking system reaches the highest modulation frequency of the DMD, which is 22.2 kHz. We experimentally demonstrate that the tracking accuracy is enhanced by the proposed tracking approach based on window complementary modulation. The influences of the gray level of the chosen window coding mask and of the ratio of the target to the window area on the centroiding accuracy are analyzed. To evaluate the performance of the system under different conditions, we conducted two further experiments by varying the update rate of the system and the maximal velocity of the target, respectively. Through these experiments, the proposed target tracking system is shown to track fast-moving targets well with high temporal and spatial resolution.

2. Principles and methods

2.1 DMD-based dual-pixel system design for target tracking

The basic schematic diagram of the proposed target tracking system is illustrated in Fig. 1(a). The target scene is imaged on a DMD by the camera lens. The DMD is used as an efficient and fast SLM with a flip frequency of 22.2 kHz. It is a micro-electro-mechanical system (MEMS) consisting of hundreds of thousands of tiny switchable micromirrors that have two flip states (+12° and −12°). By driving the flip states of the micromirrors, the target scene imaged on the DMD is modulated in two directions. Part of the light is reflected into direction 1 when a micromirror works at +12°, while the remaining light is reflected into direction 2 when a micromirror works at −12°. The reflected light is then collected by two collection lenses and detected by two identical single-pixel detectors, respectively. The flip states of all micromirrors determine the coding masks, and the coding masks of the two directions are complementary due to the bistability of the micromirrors. Based on the above scheme, the target tracking system is built in the laboratory. It consists of one imaging lens (35-105mm F/3.5-4.5, Nikon), two collection lenses (CL 1 and CL 2, focal length 50 mm), two mirror reflectors (MR 1 and MR 2), two photomultiplier tubes (PMT 1 and PMT 2, PMT1001/M, Thorlabs), a custom-built dual total internal reflection (DTIR) prism, and a 1024 × 768 pixels DMD (DLP7000, Texas Instruments), as shown in Fig. 1(b).

Fig. 1. Scheme of the DMD-based dual-pixel target tracking system (a). 3D model of the DMD-based dual-pixel target tracking system (b). The principle of complementary dual-field TIR prism (c).

The DTIR prism is designed to avoid optical system structure conflicts and stray light interference. As the micromirrors work at small angles (+12° and −12°), the incident light and the reflected light in the two directions are very close to each other in front of the DMD, and the camera lens, CL 1, and CL 2 in Fig. 1(a) would conflict with each other in the system structure. Meanwhile, the overlap of light beams would introduce stray light and reduce the accuracy of the proposed system. To avoid these problems in complementary measurements, the DTIR prism is designed by improving the total reflection interface structure of the conventional TIR prism. As shown in Fig. 1(c), the DTIR prism consists of three single prisms spliced and laminated together, with an air gap of 0.01 mm between the laminated surfaces of the three prisms. By designing two different total reflection light paths for the reflected light of the ±12° micromirrors, the DTIR prism separates the incident light and the reflected light obtained from the complementary modulation, which overcomes the limitations of the optical apertures of the imaging and collection lenses.

2.2 Centroid localization method based on window complementary modulation

When the target scene is imaged on the discrete micromirror array of the DMD, the coordinate (x,y) of the two-dimensional continuous scene light field f(x,y) is discretized as (m,n). (m,n) is the coordinate of the micromirror on the DMD, and I(m,n) denotes the discretized scene light field. The pair of complementary two-dimensional coding masks generated by the DMD in the two reflection directions is denoted as P(m,n) and P′(m,n). The ranges of m and n are kept consistent with the length M and width N of the micromirror array. After the scene light field I(m,n) is modulated by the complementary coding masks P(m,n) and P′(m,n), it is received and measured by two identical single-pixel detectors. The modulated light intensity responses R and R′ can then be expressed as follows,

$$R = k\sum\limits_\Omega I (m,n) \cdot P(m,n) + {R_b}, $$
$$R^{\prime} = k\sum\limits_\Omega I (m,n) \cdot P^{\prime}(m,n) + {R_b}, $$
where k is the detection sensitivity factor of the single-pixel detector; Rb is the background noise; Ω is the modulation window region on the DMD, determined from the minimum boundary window of the target obtained by fast detection of Fourier slices [17]; ∑Ω denotes the summation over the window region Ω.

In the case of target motion, the motion model is usually constrained to rotation and translation, while the geometric shape of the target remains approximately unchanged, so it is possible to disregard the details of the target and focus only on the centroid position. Here, we consider using compressive measurement to extract the centroid. A set of non-orthogonal basis coding masks is constructed according to the characteristics of the geometric moment [26] to modulate the intensity distribution of the target scene, and the centroid is extracted from the modulated intensity. The non-orthogonal basis coding masks are non-orthogonal moments derived from the polynomials m^p n^q. The geometric moment µpq of scene I(m,n) in 2D space is defined as

$${\mu _{pq}} = \sum\limits_{m = 1}^M {\sum\limits_{n = 1}^N {{m^p}} } {n^q}I(m,n), $$
where p and q are non-negative integers and p + q is called the order of the geometric moment.

As shown in Fig. 2(a), complementary coding masks P1 and P1′, P2 and P2′ are alternately generated in the two directions to obtain the centroid of the target in the x and y directions. P1 and P2 are transposed matrices of each other, as are P1′ and P2′. In the proposed target tracking system, one pixel is used as the grayscale step width, so the gray level for an M × N window region is M or N. In this situation, the complementary coding masks are specified with

$${P_1} = {\left[ {\begin{array}{cccc} {\frac{1}{M}}&{\frac{2}{M}}& \cdots &1\\ {\frac{1}{M}}&{\frac{2}{M}}& \cdots &1\\ \vdots & \vdots & \vdots & \vdots \\ {\frac{1}{M}}&{\frac{2}{M}}& \cdots &1 \end{array}} \right]_{M \times N}}, P_1^{\prime} = {\left[ {\begin{array}{cccc} 1&{\frac{{M - 1}}{M}}& \cdots &{\frac{1}{M}}\\ 1&{\frac{{M - 1}}{M}}& \cdots &{\frac{1}{M}}\\ \vdots & \vdots & \vdots & \vdots \\ 1&{\frac{{M - 1}}{M}}& \cdots &{\frac{1}{M}} \end{array}} \right]_{M \times N}}, $$
$${P_2} = {\left[ {\begin{array}{cccc} {\frac{1}{N}}&{\frac{1}{N}}& \cdots &{\frac{1}{N}}\\ {\frac{2}{N}}&{\frac{2}{N}}& \cdots &{\frac{2}{N}}\\ \vdots & \vdots & \vdots & \vdots \\ 1&1& \cdots &1 \end{array}} \right]_{M \times N}}, P_2^{\prime} = {\left[ {\begin{array}{cccc} 1&1& \cdots &1\\ {\frac{{N - 1}}{N}}&{\frac{{N - 1}}{N}}& \cdots &{\frac{{N - 1}}{N}}\\ \vdots & \vdots & \vdots & \vdots \\ {\frac{1}{N}}&{\frac{1}{N}}& \cdots &{\frac{1}{N}} \end{array}} \right]_{M \times N}}, $$
$${P_3} = \frac{M}{{M + 1}}({{P_1} + P_1^{\prime}} )= \frac{N}{{N + 1}}({{P_2} + P_2^{\prime}} )= {\left[ {\begin{array}{cccc} 1&1& \cdots &1\\ 1&1& \cdots &1\\ \vdots & \vdots & \vdots & \vdots \\ 1&1& \cdots &1 \end{array}} \right]_{M \times N}}. $$
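As a numerical sanity check, the complementary ramp masks and the uniform-mask identity for P3 can be reproduced in a few lines (a minimal sketch; the orientation convention, with the x-ramp along columns and the y-ramp along rows, and the window size are our illustrative choices):

```python
import numpy as np

def ramp_masks(M, N):
    """Complementary grayscale ramp masks over an M-column x N-row window.
    The orientation (x along columns, y along rows) is an illustrative choice."""
    gx = np.arange(1, M + 1) / M              # 1/M, 2/M, ..., 1
    gy = np.arange(1, N + 1) / N
    P1 = np.tile(gx, (N, 1))                  # x-ramp: identical rows
    P1c = np.tile(gx[::-1], (N, 1))           # complement: 1, (M-1)/M, ..., 1/M
    P2 = np.tile(gy[:, None], (1, M))         # y-ramp: identical columns
    P2c = np.tile(gy[::-1][:, None], (1, M))
    return P1, P1c, P2, P2c

M, N = 8, 8
P1, P1c, P2, P2c = ramp_masks(M, N)

# Each complementary pair rescales to the uniform all-ones mask P3
assert np.allclose(M / (M + 1) * (P1 + P1c), np.ones((N, M)))
assert np.allclose(N / (N + 1) * (P2 + P2c), np.ones((N, M)))
# For a square window, the x- and y-masks are transposes of each other
assert np.allclose(P1.T, P2) and np.allclose(P1c.T, P2c)
```

The rescaling factor M/(M + 1) appears because each complementary column pair sums to (M + 1)/M rather than 1.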

Fig. 2. Light field of the target modulated by complementary grayscale masks (a). Light field intensity response of a single-pixel detector (b).

Using the coding masks P1, P1′, P2, and P2′ to modulate the light field, the modulated intensity is received and measured by two identical single-pixel detectors to obtain R1, R1′, R2, and R2′, as shown in Fig. 2(b). R1 and R2 correspond to the first-order moment values µ10 and µ01 of I(m,n), respectively. (R1 + R1′) or (R2 + R2′) corresponds to the zero-order moment value µ00 of I(m,n). Since the scene can be regarded as a superposition of the target and the background, the effect of the background response Rb in the modulated light intensity response needs to be eliminated. After the zero-order and first-order moment values are determined, the centroid coordinate (xc,yc) of the target can be obtained,

$${x_c} = \frac{{{\mu _{10}}}}{{{\mu _{00}}}} = ({M + 1} )\cdot \frac{{{R_1} - {R_b}}}{{{R_1} + R_1^{\prime} - 2{R_b}}}, $$
$${y_\textrm{c}} = \frac{{{\mu _{01}}}}{{{\mu _{00}}}} = ({N + 1} )\cdot \frac{{{R_2} - {R_b}}}{{{R_2} + R_2^{\prime} - 2{R_b}}}. $$
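The measurement-and-recovery chain above can be exercised end to end on a synthetic target: modulate with the ramp masks, add a constant background term, and recover the centroid from the four responses (a sketch; the window size, target placement, gain k, and background Rb are illustrative assumptions):

```python
import numpy as np

N, M = 64, 64                                  # window: N rows (y), M columns (x)
I = np.zeros((N, M))
I[20:30, 35:45] = 1.0                          # bright square target in the window

gx = np.arange(1, M + 1) / M
gy = np.arange(1, N + 1) / N
P1, P1c = np.tile(gx, (N, 1)), np.tile(gx[::-1], (N, 1))
P2, P2c = np.tile(gy[:, None], (1, M)), np.tile(gy[::-1][:, None], (1, M))

k, Rb = 1.0, 5.0                               # detector gain and background (assumed)
R1, R1c = k * (I * P1).sum() + Rb, k * (I * P1c).sum() + Rb
R2, R2c = k * (I * P2).sum() + Rb, k * (I * P2c).sum() + Rb

# Centroid from the dual-pixel responses, following the xc/yc formulas
xc = (M + 1) * (R1 - Rb) / (R1 + R1c - 2 * Rb)
yc = (N + 1) * (R2 - Rb) / (R2 + R2c - 2 * Rb)

# Direct first-moment centroid for comparison (1-based pixel coordinates)
m, n = np.arange(1, M + 1), np.arange(1, N + 1)
xc_ref = (I * m[None, :]).sum() / I.sum()
yc_ref = (I * n[:, None]).sum() / I.sum()
assert abs(xc - xc_ref) < 1e-6 and abs(yc - yc_ref) < 1e-6
```

Note the gain k cancels in the ratios, and the 2Rb subtraction removes the identical background term seen by both detectors.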

2.3 Complementary coding mask generation by S-shaped Sierra Lite dithering

In order to keep the plane containing the incident and reflected light parallel to the system plane, which facilitates the subsequent construction of the optical path, the DMD is installed rotated by 45° around the center o of the micromirror array, as shown in Fig. 3(a). As the micromirrors on the DMD have only +12° and −12° flip states, the DMD can only generate binary (black-and-white) coding masks instead of grayscale coding masks. To solve this problem, a spatial dithering modulation method based on S-shaped Sierra Lite error diffusion dithering is used to achieve the grayscale in the window [27,28]. The grayscale coding mask is first quantized to 0 and 1 according to a threshold to obtain a binary dithering coding mask. Then the quantization error of the current pixel is diffused to several adjacent unprocessed pixels based on the kernel function h(m,n). During the diffusion, when measuring the centroid on the x-axis, odd rows are scanned from left to right, and even rows are scanned from right to left, as shown in Fig. 3(b). The effect of binarization spatial dithering for different grayscale coding masks is shown in Fig. 3(c). The kernel functions h(m,n) and h′(m,n) of the error diffusion in odd rows and even rows can be expressed as

$$h(m,n) = \frac{1}{4}\left[ {\begin{array}{lll} - &x&2\\ 1&1&0 \end{array}} \right],\textrm{ }{h^\prime }(m,n) = \frac{1}{4}\left[ {\begin{array}{lll} 2&x& - \\ 0&1&1 \end{array}} \right], $$
where x denotes the current pixel being quantized and − denotes a pixel that has already been processed.
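A direct implementation of this snake-scan Sierra Lite diffusion might look as follows (a sketch under our assumptions: a fixed threshold of 0.5 and silently dropping error that would diffuse outside the mask):

```python
import numpy as np

def sierra_lite_serpentine(gray):
    """Binarize a grayscale mask in [0, 1] by Sierra Lite error diffusion,
    scanning alternate rows in opposite directions (snake scan)."""
    g = gray.astype(float).copy()
    H, W = g.shape
    out = np.zeros_like(g)
    for i in range(H):
        step = 1 if i % 2 == 0 else -1         # scan direction of this row
        cols = range(W) if step == 1 else range(W - 1, -1, -1)
        for j in cols:
            out[i, j] = 1.0 if g[i, j] >= 0.5 else 0.0
            err = g[i, j] - out[i, j]
            if 0 <= j + step < W:
                g[i, j + step] += err * 2 / 4  # pixel ahead in scan direction
            if i + 1 < H:
                if 0 <= j - step < W:
                    g[i + 1, j - step] += err * 1 / 4  # next row, behind
                g[i + 1, j] += err * 1 / 4     # next row, directly below
    return out

# A horizontal ramp dithers to a binary mask with nearly the same mean gray value
ramp = np.tile(np.linspace(0.0, 1.0, 128), (128, 1))
b = sierra_lite_serpentine(ramp)
assert abs(b.mean() - ramp.mean()) < 0.02
```

The three weights (2/4 ahead, 1/4 below-behind, 1/4 below) sum to one, so the quantization error is fully diffused except at the mask borders.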

2.4 Target tracking algorithm based on the alternating interpolation Kalman filter

The Kalman filter is a recursive filtering method based on statistical estimation theory, and the Kalman filter or its improved algorithms have been used in the literature to improve measurement accuracy [29–31]. In the proposed target tracking system, only the centroid position in one direction can be obtained in each measurement, so the measured centroids of the target in the x and y directions are asynchronous. To solve this problem and improve the asynchronous measurement accuracy, the alternating interpolation Kalman filter is applied in our study to predict the centroid at the next moment, update the centroid according to the measurement, and interpolate the asynchronous centroid to obtain the real-time centroid.

Fig. 3. Schematic of bistable micromirror on DMD mounted by rotating 45° (a). Snake-scanning error diffusion of grayscale coding mask in the horizontal direction (b). Comparison of the binarization spatial dithering effect of different grayscale coding masks (c).

The flowchart of the alternating interpolation Kalman filter is shown in Fig. 4. The initial target motion state is first obtained using the global moment. Second, the state at time t + 2Δt on the x-axis is predicted from the previous target motion state at time t and applied to determine the window position. The time interval is 2Δt instead of Δt because the centroids on the x-axis and y-axis are measured alternately. Then the measured state based on the complementary window moment and the predicted state are fed to the alternating interpolation Kalman filter to obtain the optimal estimated state at time t + 2Δt. Finally, the optimal estimated state is used as the output and serves as the previous state for the next round of the loop until tracking ends. The state at time t + Δt is obtained through interpolation between the states at time t and t + 2Δt. Similarly, the target motion state at time t + Δt on the y-axis is predicted from the previous state at time t − Δt, and the state at time t on the y-axis is then obtained by interpolation between the optimal estimated states at time t − Δt and t + Δt.

Fig. 4. Target tracking algorithm based on alternating interpolation Kalman filter

Take the target motion state at time t on the x-axis as an example, it can be written as $\hat{{\boldsymbol x}}(t) = {\left[ {\begin{array}{ccc} {x(t)}&{\dot{x}(t)}&{\ddot{x}(t)} \end{array}} \right]^T}$, where $x(t)$, $\dot{x}(t)$, and $\ddot{x}(t)$ are the position, velocity, and acceleration of the target at time t.

The predicted target motion state at time t + 2Δt, $\hat{{\boldsymbol x}}(t + 2\Delta t|t)$, and the covariance matrix ${P_{t + 2\Delta t|t}}$ of the predicted state is expressed as

$$\hat{{\boldsymbol x}}(t + 2\Delta t|t) = A\hat{{\boldsymbol x}}(t), $$
$${P_{t + 2\Delta t|t}} = A{P_t}{A^T} + Q(t), $$
where ${P_t}$ is the covariance matrix at time t. $Q(t)$ is the covariance of the noise. A is the prediction matrix and it is described as
$$A = \left[ {\begin{array}{ccc} 1&{2\Delta t}&{\frac{{2\alpha \Delta t - 1 + {\textrm{e}^{ - 2\alpha \Delta t}}}}{{{\alpha^2}}}}\\ 0&1&{\frac{{1 - {\textrm{e}^{ - 2\alpha \Delta t}}}}{\alpha }}\\ 0&0&{{\textrm{e}^{ - 2\alpha \Delta t}}} \end{array}} \right], $$
where α is the maneuver frequency.

In the update stage, the measured state ${\boldsymbol z}(t)$, and the covariance of the centroiding error at the current state of the target is indicated by R. Thus, the Kalman gain ${K_{t + 2\Delta t}}$ is expressed as

$${K_{t + 2\Delta t}} = {P_{t + 2\Delta t|t}}{H^T}{[{H{P_{t + 2\Delta t|t}}{H^T} + R} ]^{ - 1}}, $$
where H is expressed as $H = [\begin{array}{ccc} 1&0&0 \end{array}]$.

Then the Kalman gain ${K_{t + 2\Delta t}}$ is used to obtain the optimal estimated state $\hat{{\boldsymbol x}}(t + 2\Delta t)$ at time t + 2Δt,

$$\hat{{\boldsymbol x}}(t + 2\Delta t) = \hat{{\boldsymbol x}}(t + 2\Delta t|t) + {K_{t + 2\Delta t}}[{{\boldsymbol z}(t + 2\Delta t) - H\hat{{\boldsymbol x}}(t + 2\Delta t|t)} ]. $$
The covariance matrix of the optimal estimated state is expressed as
$${P_{t + 2\Delta t}} = [{{I_{3 \times 3}} - {K_{t + 2\Delta t}}H} ]{P_{t + 2\Delta t|t}}. $$
Meanwhile, $\hat{{\boldsymbol x}}(t + 2\Delta t)$ and ${P_{t + 2\Delta t}}$ will be used at time t + 4Δt.

The state at time t + Δt on the x-axis is obtained by quintic polynomial interpolation based on the optimal estimated states at time t and t + 2Δt. Using quintic polynomial interpolation, the continuously varying state of the target, including position, velocity, and acceleration, is obtained. The prediction, update, and interpolation of the target motion state on the y-axis are similar to those on the x-axis, but with a time offset of Δt.
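The predict/update equations above can be sketched for a single axis as follows (the prediction matrix follows the Singer-model expression for A with step 2Δt; the values of α, Q, R, and the initial covariance are illustrative placeholders, not the authors' settings):

```python
import numpy as np

def singer_A(dt, alpha):
    """Prediction matrix of the Singer maneuvering-target model over step dt."""
    e = np.exp(-alpha * dt)
    return np.array([
        [1.0, dt, (alpha * dt - 1.0 + e) / alpha**2],
        [0.0, 1.0, (1.0 - e) / alpha],
        [0.0, 0.0, e],
    ])

def kalman_step(x, P, z, A, Q, R):
    """One predict/update cycle with H = [1 0 0]: only position is measured."""
    H = np.array([[1.0, 0.0, 0.0]])
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (np.array([z]) - H @ x_pred)
    P_new = (np.eye(3) - K @ H) @ P_pred
    return x_new, P_new

# With a large initial covariance, a single exact position measurement
# essentially snaps the position estimate onto the measurement.
dt = 2 * 45e-6                     # 2*Delta_t for the alternating x-axis updates
A = singer_A(dt, alpha=1.0)
x, P = np.zeros(3), 1e6 * np.eye(3)
x, P = kalman_step(x, P, z=10.0, A=A, Q=1e-6 * np.eye(3), R=np.array([[0.01]]))
assert abs(x[0] - 10.0) < 1e-3
```

For small αΔt the Singer matrix reduces to the familiar constant-acceleration transition [[1, Δt, Δt²/2], [0, 1, Δt], [0, 0, 1]], which is a quick way to check the implementation.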

3. Results and analysis

3.1 Simulation

The proposed centroid localization method is validated by numerical simulation. As shown in Fig. 5(a), the scene size in the simulation experiment is 512 × 512 pixels, and the scene contains a background and a target. The background consists of a simulated moon, stars, and mountains, and the target is a square of size 64 × 64 pixels. Shot noise and thermal noise mainly influence the fidelity and stability of the PMT output signal. Shot noise is associated with the particle nature of light and can be modeled by a Poisson process. Thermal noise represents the noise generated by the thermal excitation of charge carriers, which satisfies a Gaussian distribution. Therefore, Poisson noise and Gaussian noise are added to the simulation scene to simulate the measurement noise. The constructed binarized coding mask based on spatial dithering is shown in Fig. 5(b) and Fig. 5(c). The mask size is consistent with the simulation scene at 512 × 512 pixels, and the size of the grayscale window on the coding mask is 128 × 128 pixels. The flip states of the micromirrors outside the window satisfy an equal-probability binary random distribution, and the effect of the background on the measurement results is eliminated by complementary modulation measurements.
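The two noise models can be added to a synthetic scene as follows (a minimal sketch; the intensity scale and the thermal-noise standard deviation are arbitrary choices, not the simulation settings of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
scene = np.zeros((512, 512))
scene[224:288, 224:288] = 120.0    # 64 x 64 target, mean photon count per pixel (a.u.)

# Shot noise follows Poisson statistics on the photon signal;
# thermal noise is additive zero-mean Gaussian.
noisy = rng.poisson(scene).astype(float) + rng.normal(0.0, 2.0, scene.shape)
assert noisy.shape == scene.shape
```

Because the Poisson variance scales with the signal while the Gaussian variance is fixed, the two noise sources dominate in bright and dark regions, respectively.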

Fig. 5. Simulation scene with target and background (a). Binarized coding mask (b). Target modulated by binarized window grayscale mask (c).

The signal-to-noise ratio (SNR) is used as a measure of the quality of the target light intensity signal and can be expressed as [32],

$$\textrm{SNR} = 20 \cdot {\log _{10}}\frac{{\bar{R} - {{\bar{R}}_n}}}{\sigma }, $$
where R is the response of the single pixel detector to the target light intensity; $\bar{R}$ is the mean value of R; $\sigma$ is the standard deviation of R; ${\bar{R}_n}$ is the mean value of the measurement noise Rn.
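For a constant target response, this definition reduces to the ratio of the mean signal to the noise standard deviation; a quick numerical check (with arbitrary levels) illustrates where the 45 dB level discussed below sits:

```python
import numpy as np

rng = np.random.default_rng(1)
signal = 100.0                            # constant target response (a.u.)
noise = rng.normal(0.0, 0.5, 100_000)     # zero-mean measurement noise R_n
R = signal + noise

# SNR per the definition: (mean response - mean noise) over the response std
snr_db = 20 * np.log10((R.mean() - noise.mean()) / R.std())
# 20*log10(100 / 0.5) is about 46 dB, slightly above the 45 dB level
assert 45.0 < snr_db < 47.0
```

In other words, reaching 45 dB requires the noise standard deviation to be below roughly 0.6% of the mean target signal.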

The SNR of the target is varied by adding different Gaussian white noise and Poisson noise to the simulation scene. The RMSE of the centroid localization based on window complementary grayscale modulation at different SNR is shown in Fig. 6. The red and blue lines in the figure are the RMSEs of the centroid in the x-axis and y-axis, respectively. By comparing the changes of the RMSE at different SNR, it can be seen that as the SNR increases, the RMSE of the centroid gradually decreases and the localization accuracy of the target also improves. When the SNR is better than 45 dB, the RMSE of the centroid is within 0.1 pixels.

Fig. 6. Relationship between RMSE of centroid positioning and SNR of the target

To analyze the effect of the gray level of the window on the target localization accuracy, the 128 × 128 pixel window used above is spatially dithered with different gray levels d. The binarized window coding masks with d = 16, 32, 64, and 128 gray levels are shown in Fig. 7(a), respectively. The proportion of micromirrors operating in the ±12° states within each grayscale stripe of the binarized window mask determines the equivalent gray level. The RMSE variation of the measured centroid for different window gray levels is compared. As shown in Fig. 7(b), as the window gray level d increases, the RMSE of the centroid gradually decreases and the localization accuracy keeps improving. When the window gray level d is greater than 80, the RMSE of the centroid stabilizes within 0.1 pixels. When the gray level reaches the maximum value of 128, each gray level is composed of one column of micromirrors, so the window size determines the maximum gray level. The gray level of the window is set to the maximum value, i.e., the width or height of the window, in all subsequent experiments.

Fig. 7. Window mask modulation targets with different gray levels d (a). Relationship between RMSE of centroid positioning and window gray level d (b).

To analyze the effect of the relative size between the target and the window on the accuracy of the centroid, a binarized grayscale window of fixed size is used to modulate targets of different sizes. The size of the window is 128 × 128 pixels, and the window gray level is 128. Three targets with different sizes of 10 × 10 pixels, 60 × 60 pixels, and 110 × 110 pixels are shown in Fig. 8(a); the target regions account for 0.6%, 22.0%, and 73.9% of the whole window, respectively. The SNRs of the targets of different sizes are kept constant. The RMSE variation of the measured centroid for the targets of different sizes is compared. As shown in Fig. 8(b), as the target size increases from 10 × 10 pixels to 110 × 110 pixels, the RMSE of the centroid first decreases rapidly and then decreases slowly, from about 0.1 pixels starting at 50 × 50 pixels. When the target size is smaller than 50 × 50 pixels, more background information in the fixed-size window is modulated into the measurement results, which causes drift in target localization. Conversely, the closer the target size is to the window size, the more likely the target is to move out of the window during tracking, which seriously affects the tracking accuracy. Therefore, the relative size of the target to the window is kept between 16% and 36%, which effectively improves the target tracking robustness and accuracy.

Fig. 8. Different size targets are modulated by a grayscale window mask of 128 × 128 pixel size (a). Effect of relative size between the target and the window mask on RMSE of centroid positioning (b).

3.2 Laboratory experiments

As shown in Fig. 9(a), a fast-moving object with irregular motion is simulated by a high-precision dynamic target simulator, which refreshes every 45 µs. The configuration of this experiment is shown in Fig. 9(b). To test the performance of the proposed target tracking system, the modulation time of the DMD is set to 45, 90, and 180 µs, corresponding to system update rates of 22.2, 11.1, and 5.55 kHz. The complementary modulated light intensity in the two reflection directions of the DMD micromirrors is received by two identical PMTs. Both detectors are calibrated before the experiment to ensure accurate and consistent measurements.

Fig. 9. The irregular trajectory of motion target simulated by dynamic target simulator (a). Experimental target tracking system based on dynamic target simulator (b).

The real trajectory of the target and the tracking results at different update rates are shown in Fig. 10(a). The solid red line indicates the real trajectory, and the blue, green, and purple scatters indicate the tracking results at update rates of 22.2, 11.1, and 5.55 kHz, respectively. The real trajectory is obtained by measuring the motion target sequence generated by the simulator frame by frame. Figures 10(b) and 10(c) show the x-axis and y-axis centroid errors at different update rates, respectively, and the comparison of the experimental RMSE for different update rates is shown in Table 1. By comparing the results in Fig. 10(b), 10(c), and Table 1, it can be seen that the measured trajectory at the 22.2 kHz update rate matches the real trajectory best, with RMSEs of 0.1235 and 0.1304 pixels on the x-axis and y-axis, respectively. Tracking the motion target at a higher update rate yields results that are closer to the real trajectory. Especially in the case of rapid changes of the target trajectory, the tracking results obtained at 22.2 kHz more accurately reflect the details of the target trajectory.

Fig. 10. Real trajectory of the target and measured trajectory at different update rates (a). x-axis centroid error at different update rates (b). y-axis centroid error at different update rates (c).

Table 1. Comparison of the experimental RMSE for different update rates

Table 2. Comparison of the experimental RMSE for different periods of sinusoidal motion

The proposed target tracking system is proved to perform well at an update rate of 22.2 kHz; however, in practice, velocity is another important factor that affects the tracking accuracy for a fast-moving object. To test the accuracy of the proposed system when tracking objects with different velocities at an update rate of 22.2 kHz, we simulate a sinusoidally moving target using a combination of a 1 cm × 1 cm LED and an electrodynamic shaker, as shown in Fig. 11(a). This device is placed 55 cm away from the proposed system. The measured centroids in the x and y directions for the target with variation periods of 10, 20, and 40 ms are shown in Fig. 11(b); the corresponding vibration frequencies are 100, 50, and 25 Hz, respectively. For clarity of the graph, only the target tracking results within 0.04 s are shown. As shown in Fig. 11(b), the purple, green, and blue solid lines indicate the average trajectories of sinusoidal motion with periods of 10, 20, and 40 ms, respectively, and the purple, green, and blue scatter points are the measured trajectories of the targets with the corresponding periods of sinusoidal motion.

Fig. 11. Experimental target tracking system based on an electrodynamic shaker (a). Average trajectory of ten measurements and corresponding measurement trajectory for different periods of sinusoidal motion (b).

Figure 12 shows the velocities and centroid errors on the x-axis and y-axis for sinusoidal motion targets with periods of 10, 20, and 40 ms. The RMSE of the centroid of the target for the different periods of sinusoidal motion is shown in Table 2. It can be seen that the centroid tracking error increases as the target motion velocity increases. When the sinusoidal motion period is 10 ms, the relationship between the centroid tracking error and the velocity is most obvious: the larger the speed of the moving target, the larger the measured centroid error. When the maximum velocity of the moving target on the x-axis and y-axis reaches about 2 × 10⁴ pixel/s, the RMSE of the centroid on both axes does not exceed 0.3 pixels. The experiment demonstrates the accuracy and stability of the proposed method when tracking targets with different motion speeds.

Fig. 12. x-axis speed for different periods of sinusoidal motion (a). y-axis speed for different periods of sinusoidal motion (b). x-axis centroid error for different periods of sinusoidal motion (c). y-axis centroid error for different periods of sinusoidal motion (d).

4. Conclusion

In this work, a dual-pixel image-free target tracking system based on window complementary modulation is proposed, which achieves fast-moving target tracking without acquiring complete image information. Due to the complementary nature of the DMD, the proposed centroid localization method only requires modulating the light field of the target with horizontal and vertical window grayscale masks. The target centroid position is calculated from the zero- and first-order geometric moments. The influences of the gray level of the chosen window coding mask and of the ratio of the target to the window area on the centroiding accuracy are analyzed by numerical simulation. The alternating interpolation Kalman filter is used to track the moving target and improve the accuracy of target localization and tracking. Finally, target tracking experiments are conducted with the built experimental system based on the dynamic target simulator and a real scene, respectively. The experimental results show that the accuracy of the proposed system at 22.2 kHz reaches 0.1 pixels over a range of 1024 × 768 pixels, and when the maximum velocity of the moving target is up to about 2 × 10⁴ pixel/s, the RMSE of the target centroid on the x-axis and y-axis does not exceed 0.3 pixels. This demonstrates that the proposed method achieves high-accuracy tracking of high-speed moving targets in real time.

Funding

National Natural Science Foundation of China (No. 51827806); Tencent Foundation through the Xplorer Prize

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. N. Ogawa, H. Oku, K. Hashimoto, and M. Ishikawa, “Microrobotic visual control of motile cells using high-speed tracking system,” IEEE Trans. Robot. 21(4), 704–712 (2005). [CrossRef]  

2. K. Goda, K. K. Tsia, and B. Jalali, “Serial time-encoded amplified imaging for real-time observation of fast dynamic phenomena,” Nature 458(7242), 1145–1149 (2009). [CrossRef]  

3. S. Ota, R. Horisaki, Y. Kawamura, M. Ugawa, I. Sato, K. Hashimoto, R. Kamesawa, K. Setoyama, S. Yamaguchi, K. Fujiu, K. Waki, and H. Noji, “Ghost cytometry,” Science 360(6394), 1246–1251 (2018). [CrossRef]  

4. J. Shao, B. Du, C. Wu, and L. Zhang, “Tracking Objects From Satellite Videos: A Velocity Feature Based Correlation Filter,” IEEE Trans. Geosci. Remote Sensing 57(10), 7860–7871 (2019). [CrossRef]  

5. X. Deng, Y. Wang, D. He, G. Han, T. Xue, Y. Hao, X. Zhuang, J. Liu, C. Zhang, and S. Wang, “A Compact Mid-Wave Infrared Imager System With Real-Time Target Detection and Tracking,” IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing 15, 6069–6085 (2022). [CrossRef]  

6. A. Kumar, “Computer-Vision-Based Fabric Defect Detection: A Survey,” IEEE Trans. Ind. Electron. 55(1), 348–363 (2008). [CrossRef]  

7. F. Viani, P. Rocca, G. Oliveri, D. Trinchero, and A. Massa, “Localization, tracking, and imaging of targets in wireless sensor networks: An invited review,” Radio Sci. 46(5), 1 (2011). [CrossRef]  

8. C. C. Liebe, L. Alkalai, G. Domingo, B. Hancock, D. Hunter, J. Mellstrom, I. Ruiz, C. Sepulveda, and B. Pain, “Micro APS based star tracker,” in Proceedings, IEEE Aerospace Conference, (2002), 5.

9. A. Teman, S. Fisher, L. Sudakov, A. Fish, and O. Yadid-Pecht, “Autonomous CMOS image sensor for real time target detection and tracking,” in 2008 IEEE International Symposium on Circuits and Systems (ISCAS), (2008), 2138–2141.

10. M. Wei, F. Xing, and Z. You, “An implementation method based on ERS imaging mode for sun sensor with 1 kHz update rate and 1″ precision level,” Opt. Express 21(26), 32524–32533 (2013). [CrossRef]  

11. M.-S. Wei, F. Xing, and Z. You, “A real-time detection and positioning method for small and weak targets using a 1D morphology-based approach in 2D images,” Light: Sci. Appl. 7(5), 18006 (2018). [CrossRef]  

12. O. S. Magaña-Loaiza, G. A. Howland, M. Malik, J. C. Howell, and R. W. Boyd, “Compressive object tracking using entangled photons,” Appl. Phys. Lett. 102(23), 231104 (2013). [CrossRef]  

13. E. Li, Z. Bo, M. Chen, W. Gong, and S. Han, “Ghost imaging of a moving target with an unknown constant speed,” Appl. Phys. Lett. 104(25), 251120 (2014). [CrossRef]  

14. S. Sun, H. Lin, Y. Xu, J. Gu, and W. Liu, “Tracking and imaging of moving objects with temporal intensity difference correlation,” Opt. Express 27(20), 27851–27861 (2019). [CrossRef]  

15. J. Wu, L. Hu, and J. Wang, “Fast tracking and imaging of a moving object with single-pixel imaging,” Opt. Express 29(26), 42589–42598 (2021). [CrossRef]  

16. W. Jiang, X. Li, X. Peng, and B. Sun, “Imaging high-speed moving targets with a single-pixel detector,” Opt. Express 28(6), 7889–7897 (2020). [CrossRef]  

17. H. Jiang, H. Liu, X. Li, and H. Zhao, “Efficient regional single-pixel imaging for multiple objects based on projective reconstruction theorem,” Opt. Lasers Eng. 110, 33–40 (2018). [CrossRef]  

18. D. Shi, K. Yin, J. Huang, K. Yuan, W. Zhu, C. Xie, D. Liu, and Y. Wang, “Fast tracking of moving objects using single-pixel imaging,” Opt. Commun. 440, 155–162 (2019). [CrossRef]  

19. Z. Zhang, J. Ye, Q. Deng, and J. Zhong, “Image-free real-time detection and tracking of fast moving object using a single-pixel detector,” Opt. Express 27(24), 35394–35401 (2019). [CrossRef]  

20. Q. Deng, Z. Zhang, and J. Zhong, “Image-free real-time 3-D tracking of a fast-moving object using dual-pixel detection,” Opt. Lett. 45(17), 4734–4737 (2020). [CrossRef]  

21. H. Li, K. Lu, J. Xue, F. Dai, and Y. Zhang, “Dual Optical Path Based Adaptive Compressive Sensing Imaging System,” Sensors 21(18), 6200 (2021). [CrossRef]  

22. F. Zhou, X. Shi, J. Chen, T. Tang, and Y. Liu, “Non-imaging real-time detection and tracking of fast-moving objects,” arXiv e-prints, arXiv: 2108.06009 (2021).

23. Z.-H. Yang, X. Chen, Z.-H. Zhao, M.-Y. Song, Y. Liu, Z.-D. Zhao, H.-D. Lei, Y.-J. Yu, and L.-A. Wu, “Image-free real-time target tracking by single-pixel detection,” Opt. Express 30(2), 864–873 (2022). [CrossRef]  

24. Y. Yu, Z. Yang, W. Li, and H. Shao, “Image-Free Positioning Tracking Scheme via Single Pixel Detection,” in Advances in Guidance, Navigation and Control (Springer, 2022), pp. 1349–1357.

25. L. Zha, D. Shi, J. Huang, K. Yuan, W. Meng, W. Yang, R. Jiang, Y. Chen, and Y. Wang, “Single-pixel tracking of fast-moving object using geometric moment detection,” Opt. Express 29(19), 30327–30336 (2021). [CrossRef]  

26. D. Xu and H. Li, “Geometric moment invariants,” Pattern Recognition 41(1), 240–249 (2008). [CrossRef]  

27. Z. Zhang, X. Wang, G. Zheng, and J. Zhong, “Fast Fourier single-pixel imaging via binary illumination,” Sci. Rep. 7(1), 12029 (2017). [CrossRef]  

28. Z.-Y. Liang, Z.-D. Cheng, Y.-Y. Liu, K.-K. Yu, and Y.-D. Hu, “Fast Fourier single-pixel imaging based on Sierra–Lite dithering algorithm,” Chin. Phys. B 28(6), 064202 (2019). [CrossRef]  

29. G. Welch and G. Bishop, “An introduction to the Kalman filter,” Technical Report TR 95-041, University of North Carolina at Chapel Hill (1995).

30. C. Shen, Y. Zhang, J. Tang, H. Cao, and J. Liu, “Dual-optimization for a MEMS-INS/GPS system during GPS outages based on the cubature Kalman filter and neural networks,” Mechanical Systems and Signal Processing 133, 106222 (2019). [CrossRef]  

31. C. Shen, Y. Zhang, X. Guo, X. Chen, H. Cao, J. Tang, J. Li, and J. Liu, “Seamless GPS/Inertial Navigation System Based on Self-Learning Square-Root Cubature Kalman Filter,” IEEE Trans. Ind. Electron. 68(1), 499–508 (2021). [CrossRef]  

32. “Standard for Characterization of Image Sensors and Cameras, Release 3.1,” EMVA Standard 1288 (2016).



