Robust shape-based template matching algorithm for target localization under SEM

Abstract

A novel acceleration algorithm for geometric template matching, based on the Cauchy–Schwartz inequality (C-S inequality), is proposed in this paper. The proposed approach is validated under a scanning electron microscope (SEM), and its effectiveness is demonstrated. In this approach, the object shape features are represented as column vectors with complex elements. A threshold is then derived to exclude sliding windows that do not contain any matched object. Finally, surface fitting is employed to obtain the subpixel positions of the targets, and least squares adjustment is utilized to fine-tune the obtained results. The experimental results demonstrate that the proposed method reduces the matching time by 59%–96% compared with the traditional shape-based method. Furthermore, its robustness and accuracy are verified under different types of disturbances, confirming its suitability for accurate real-time target tracking on SEM.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

With the advancement of manufacturing technologies, electronic components are increasingly shrinking in size. As the microstructures of devices approach the nanoscale, accurately tracking their positions becomes crucial. Given the miniature size of the microstructures, conventional contact-based measurement techniques are impractical. Instead, microvision techniques are primarily utilized for measurement and analysis. These techniques commonly include optical microscopy, atomic force microscopy (AFM), and scanning electron microscopy (SEM) [1]. Although optical microscopes are low-cost and allow for swift measurements, their capabilities are limited due to the constraints of optical wavelengths, rendering measurements at the nanometer level difficult. While AFMs offer high resolution and are capable of measuring three-dimensional surfaces, they also face limitations such as small imaging scales and slow imaging speeds [2].

The utilization of scanning electron microscopes (SEM) presents numerous advantages over optical microscopes, including the capability of providing high magnification, a large field of view, and a large depth of focus. However, the electron escape phenomena inherent in SEM imaging can result in edge blooming effects and photometric fluctuations, thereby impacting the quality of the resulting images [1]. Moreover, SEM images are susceptible to noise, particularly at higher scanning speeds. Despite the current commercial availability of advanced SEM models capable of frame rates exceeding 20 Hz, the associated trade-offs include lower resolution and quality [3]. Hence, the development of a robust method for accurately tracing targets in low-quality SEM images is vital. Template matching, a non-contact full-field vision measurement technique, demonstrates significant promise in this regard.

Presently, template matching algorithms are predominantly based on gray levels and geometry. In gray level-based template matching, several methods such as the Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD) [4], and Normalized Cross-Correlation (NCC) [5] are utilized to determine the similarity between the gray-level values of the template and the source image for identifying the target’s location. While these methods have fast computational speed, they suffer from relatively poor robustness and lower precision when compared to geometry-based techniques.

Template matching algorithms that leverage geometric features are employed to analyze and identify objects based on their geometric properties, such as boundaries, arcs, circles, centers of gravity, and other characteristics. These algorithms are known for their robustness to common issues such as noise, occlusion, clutter, and nonlinear illumination. However, they typically require significant amounts of memory and extended computing time. To address this challenge, we introduce a new algorithm based on the Cauchy–Schwartz inequality that establishes a threshold for ignoring sliding windows that cannot contain any object. Our approach significantly reduces the matching time while maintaining the high robustness of geometric feature-based matching, and it also produces results in less time than the traditional gray level-based template matching method. For example, on our dataset, the average matching time for nonrotating targets is 3.02 ms, which is 77% faster than traditional shape-based matching methods. Additionally, the average matching times for scale-changed and rotated targets are 8.32 ms and 17.35 ms, respectively, which is faster than both traditional shape-based and gray level-based matching methods. Another kind of template matching utilizes the nearest neighbor field (NNF) [6]: the target is located by comparing the NNF generated from the template and the source image. The application of a similarity metric that incorporates global context enables the guidance of local diversity cues, thereby enhancing the accuracy of the match.

Template matching is a highly utilized technique for target tracking in scanning electron microscopy (SEM). Notably, Fatikow and Sievers [7] proposed an object tracking method using active contours to track microrobots under SEM. In the interest of improving the tracking process, the research team in [8] converted SEM images into binary images and applied contour-based subpixel template matching for accurate matching. The work in [8] also proposed a denoising and drift-compensation visual tracking algorithm for SEM, integrated with a nanomanipulation system. Similarly, Cui et al. [9] employed a hybrid visual tracking approach that leverages brightness information to measure the target’s position on the $x-y$ plane and defocus information to estimate the object’s depth, ultimately obtaining the 3D position of the object. These studies showcase the effectiveness of template matching in SEM target tracking and the significance of hybrid approaches in addressing the limitations of conventional methods.

Template matching is a commonly used technique for object detection [10,11], measurement [12], character recognition [13,14], surface defect detection [15], and 3D reconstruction [16,17]. Various approaches have been proposed in the literature to improve template matching methods for specific applications. For instance, Kong et al. [15] developed an improved template matching algorithm to detect surface defects in bolts. Their method matched two parts of the template separately and then applied rotation and scaling to correct the target image before performing accurate template matching. Another study by Lin and Zhen [18] proposed a fast template matching method for chip quality detection in microelectromechanical systems. Their approach used composite gray projection to project images in the horizontal and vertical directions, which could complete five matches within 200 ms, thus meeting real-time requirements. Gray-based template matching, however, is sensitive to geometric and illumination changes in the target. Tsai and Huang [19] utilized a global Fourier image reconstruction method to locate small defects in aperiodic pattern images. Their approach involved comparing the entire Fourier spectrum of templates and detecting images by retaining only the frequency components that were unrelated to the local space. Lastly, paper [20] cascaded template matching and a hash structure for 3D reconstruction. Their method used different measures and filtering strategies at each layer, thereby reducing noise sensitivity and accelerating image matching several-fold compared with brute-force matching. Overall, these studies illustrate the potential for improving template matching algorithms by tailoring them to specific applications.

Deep learning has recently gained significant attention in object detection. Wu et al. [21] applied a mask R-CNN model pre-trained on the COCO dataset, achieving a segmentation mean average precision (mAP) exceeding 95% for solder joint recognition. Solorzano and Tsai [22] proposed a reinforcement learning (RL) model based on an actor-critic proximal policy optimization (AC-PPO) algorithm that did not require a manually labeled training set, only a template from the reference image; all training images were generated automatically through neural networks. To combine template matching with deep learning, method [12] utilized a Siamese tracker to predict target regions and applied correlation-based template matching to correct the outputs. Algorithm [23] employed Sketch-a-Net to classify targets and then selected a corresponding template image from a multiclass template library to locate the target, enabling fast identification and tracking of railhead laser stripes. However, deep learning methods demand powerful hardware and long training times, which limits their industrial application. Furthermore, the positioning accuracy of deep learning still requires improvement.

The aforementioned techniques often face limitations in simultaneously achieving high speed and precision, as well as exhibiting robustness to various interferences. In this study, we propose an algorithm that significantly improves both speed and precision while demonstrating robustness. Moreover, our method is not angle-sensitive, enabling target localization at any angle. The remainder of this paper is organized as follows: Section 2 describes the methods used by our algorithm, with the derivation of the acceleration method presented in Subsection 2.3. The experimental results on SEM are shown in Subsections 3.1–3.3, where the robustness to noise, photometric fluctuations, and small rotations is tested. Subsection 3.4 evaluates the robustness of the proposed method comprehensively, and matching time results are given in Subsection 3.5. Finally, this study is concluded in Section 4.

2. Methodology

In this study, a gradient vector-based similarity function is employed to determine the precise pixel-level location of a target within an image pyramid. To expedite the matching process, the Cauchy-Schwarz inequality is utilized. Furthermore, quadratic surface fitting and least squares adjustment techniques are utilized to extract the subpixel position of the target for improving matching accuracy. Figure 1 depicts the flow diagram of our algorithm.

Fig. 1. Flow diagram of the proposed algorithm.

2.1 Similarity function

The present work employs a similarity function that is rooted in the gradient vectors of the edge points of objects [24]. This method harnesses geometric features to determine similarity between objects. The equation for this similarity function is expressed as follows:

$$s\left(r,c\right)=\frac{1}{n}\sum_{i=1}^{n}\frac{\langle \boldsymbol{d}_i^m, \boldsymbol{d}_{\left(r+r_i,c+c_i\right)}^s \rangle}{\| \boldsymbol{d}_i^m\| \cdot \| \boldsymbol{d}_{\left(r+r_i,c+c_i\right)}^s\|}$$
where $s(r,c)$ is the similarity measure, $n$ is the number of template object edge points, $\boldsymbol {d}_i^m$ is the gradient vector at the edge point $p_i = (r_i, c_i)$ in the template image, $(r, c)$ is the position of the sliding window in the source image, $\boldsymbol {d}_{(r+r_i,c+c_i)}^s$ is the gradient vector at the corresponding position $(r_i, c_i)$ in the sliding window, $\langle \cdot \rangle$ is the dot product, and $\|\cdot \|$ is the Euclidean norm.

If the length of $\boldsymbol {d}_{(r+r_i,c+c_i)}^s$ is smaller than a threshold $t$, which depends on the noise level in the image and on the preprocessing operation used to extract the gradient vectors, the factor $\frac {1}{\|\boldsymbol {d}_{(r+r_i,c+c_i)}^s\|}$ is set to 0 in Eq. (1).

In order to derive the similarity measure score, a thorough pixel-by-pixel scan of the entire source image is conducted. This entails computing the mean value of dot products between the normalized template gradient vectors and the corresponding normalized source image vectors, as depicted in Fig. 2. It should be noted that this process involves a considerable amount of computational time.
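
As an illustration, a minimal Python/NumPy sketch of evaluating Eq. (1) at a single window position is given below; the function name, the array layout, and the use of precomputed gradient images are assumptions made for the sketch rather than part of the published implementation.

import numpy as np

def similarity(template_pts, template_dirs, grad_x, grad_y, r, c, t=5.0):
    # Score of Eq. (1) at window position (r, c).
    # template_pts  : (n, 2) array of edge-point offsets (r_i, c_i)
    # template_dirs : (n, 2) array of normalized template gradient vectors
    # grad_x, grad_y: gradient images of the source (e.g., Sobel responses)
    # t             : minimum accepted gradient magnitude in the source image
    score = 0.0
    n = len(template_pts)
    for (ri, ci), dm in zip(template_pts, template_dirs):
        ds = np.array([grad_x[r + ri, c + ci], grad_y[r + ri, c + ci]])
        norm = np.linalg.norm(ds)
        if norm >= t:                       # weak gradients contribute zero
            score += np.dot(dm, ds) / norm  # dm is already unit length
    return score / n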

Fig. 2. Schematic of the similarity measure. Red and green vectors are template and source image gradient vectors, respectively.

2.2 Image pyramid

The time consumed in calculating the similarity for every subimage of the whole image under various angles is high, as the complexity is $O(whNn)$, where $w$ and $h$ are the width and height of the image, respectively, $N$ is the number of edge points in the template, and $n$ is the number of rotations applied to the template. To mitigate this high computational burden, the image pyramid decomposition method [25] can be utilized to accelerate the search process.

Firstly, the original image and the template image are progressively downsampled to generate images at different resolutions, forming an image pyramid. Next, template matching is performed on the top-level image $I_0$ to obtain the coordinates of the sliding windows with the several highest similarity scores. Subsequently, these coordinates are scaled according to the downsampling rate to obtain the corresponding coordinates in the next-level image $I_1$. Template matching is then performed only in the vicinity of these coordinates in $I_1$ to find the positions of the sliding windows with the several highest similarity scores. This process is repeated until the final matching result is obtained in the bottom-level image $I_n$. Since the approximate location of the target has already been established in the previous layer, the search in the current layer can be confined to a much smaller neighborhood around the previously mapped locations. Figure 3 depicts the template and image pyramids and the corresponding search process in each layer; a simplified coarse-to-fine search is sketched below. Overall, this method greatly reduces the matching time when locating targets in complex images with multiple potential objects of interest.
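
The following Python sketch illustrates the coarse-to-fine strategy. For brevity, OpenCV's cv2.matchTemplate stands in for the shape-based similarity of Eq. (1), and only the single best candidate is propagated between levels (the actual method keeps several); the function name and parameters are assumptions for illustration.

import cv2
import numpy as np

def pyramid_search(src, tmpl, levels=3, radius=8):
    # Build image pyramids for the source and the template.
    src_pyr, tmpl_pyr = [src], [tmpl]
    for _ in range(levels - 1):
        src_pyr.append(cv2.pyrDown(src_pyr[-1]))
        tmpl_pyr.append(cv2.pyrDown(tmpl_pyr[-1]))

    # Exhaustive match on the coarsest level.
    res = cv2.matchTemplate(src_pyr[-1], tmpl_pyr[-1], cv2.TM_CCOEFF_NORMED)
    _, _, _, (x, y) = cv2.minMaxLoc(res)

    # Refine on each finer level inside a small window around the mapped position.
    for lvl in range(levels - 2, -1, -1):
        x, y = 2 * x, 2 * y                      # map coordinates to the finer level
        th, tw = tmpl_pyr[lvl].shape[:2]
        H, W = src_pyr[lvl].shape[:2]
        x0, y0 = max(x - radius, 0), max(y - radius, 0)
        x1, y1 = min(x + radius + tw, W), min(y + radius + th, H)
        roi = src_pyr[lvl][y0:y1, x0:x1]
        res = cv2.matchTemplate(roi, tmpl_pyr[lvl], cv2.TM_CCOEFF_NORMED)
        _, _, _, (dx, dy) = cv2.minMaxLoc(res)
        x, y = x0 + dx, y0 + dy
    return x, y   # top-left corner of the match in the full-resolution image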

Fig. 3. Image pyramid for fast target searching.

2.3 Cauchy–Schwartz inequality based acceleration

2.3.1 Refactoring the Cauchy–Schwartz inequality

The most common form of Cauchy–Schwartz inequality is expressed as:

$$\left( \sum_{i = 1}^n a_i b_i \right)^2 \leq \left( \sum_{i=1}^{n} a_i^2 \right) \left(\sum_{i=1}^n b_i^2 \right)$$
where $a_i,b_i \in R, i = 1, 2,\ldots, n$. The necessary and sufficient condition for equality in formula (2) is $a_i = k b_i, i=1,2,\ldots,n$, for some constant $k$.

Since only real numbers can be compared in magnitude, two complex numbers are compared by their moduli. For an arbitrary complex number $z = x+iy$, the modulus is $| z | = \sqrt {x^2+y^2}$. In this case, that is, when the inequality elements are complex numbers, the Cauchy–Schwartz inequality can be modified as follows:

$$\left| \sum_{ i = 1}^n a_ib_i \right|^2 \leq \left( \sum_{i=1}^n |a_i|^2 \right) \left( \sum_{i=1}^n |b_i|^2 \right)$$
where $a_i,b_i\in C ,i = 1, 2, \ldots, n$. The necessary and sufficient conditions for equality are $a_i=k b_i,i=1,2,\ldots,n, k \in C$.

If the two variables of the Cauchy–Schwartz inequality are defined as two column vectors, formula (3) can be rewritten as Eq. (4):

$$\left| \boldsymbol{a}\cdot \boldsymbol{b} \right| \leq \| \boldsymbol{a} \|\cdot \| \boldsymbol{b} \|$$
where $\boldsymbol {a} = (a_1,a_2,\ldots,a_n), \boldsymbol {b}=(b_1,b_2,\ldots,b_n)$. Moreover, the length of the complex vector is defined as follows:
$$\| \boldsymbol{a} \| =\sqrt{ |a_1|^2+|a_2|^2+\cdots+ |a_n|^2}$$

The abovementioned expression is the $\mathcal {L}_2$ norm of a complex vector, and formula (4) is the vector form of the Cauchy–Schwartz inequality. If $\boldsymbol {a}$ and $\boldsymbol {b}$ lie in an n-dimensional complex space, their dot product $(\boldsymbol {a}\cdot \boldsymbol {b})$ is also a complex number, and the intersection angle is a complex angle. Vector $\boldsymbol {c}$ is defined as $\boldsymbol {c} = \boldsymbol {b}-\lambda \boldsymbol {a}$. If the vectors $\boldsymbol {a}$ and $\boldsymbol {b}$ are not parallel, $\lambda$ is a complex number, and the length of vector $\boldsymbol {c}$ is determined as Eq. (6):

$$\|\boldsymbol{c}\|^2 = \|\boldsymbol{a}\|^2\left|\lambda\right|^2-2\left|\boldsymbol{a}\cdot\boldsymbol{b}\cdot\lambda\right|+\|\boldsymbol{b}\|^2$$

If $\lambda$ equals $(\boldsymbol {a}\cdot \boldsymbol {b})/\|\boldsymbol {a}\|^2$, i.e., $\boldsymbol {c}$ is perpendicular to $\boldsymbol {a}$, Eq. (6) can be written as follows.

$$0\leq \|\boldsymbol{c}\|^2 = \|\boldsymbol{a}\|^2 \left| \frac{\boldsymbol{a}\cdot\boldsymbol{b}}{\|\boldsymbol{a}\|^2} \right|^2 - 2\left| \boldsymbol{a}\cdot \boldsymbol{b} \frac{\boldsymbol{a}\cdot\boldsymbol{b}}{\|\boldsymbol{a}\|^2} \right| + \|\boldsymbol{b}\|^2$$

Figure 4 shows the geometric relationship among $\boldsymbol {a}$, $\boldsymbol {b}$, and $\boldsymbol {c}$. $\lambda \boldsymbol {a}$ is the projection of vector $\boldsymbol {b}$ onto the direction of vector $\boldsymbol {a}$. Here, $\lambda$ is called a Lagrange multiplier. From formula (7), Eq. (8) follows.

$$\|\boldsymbol{a}\|^2 \|\boldsymbol{b}\|^2 - \left|\boldsymbol{a}\cdot \boldsymbol{b}\right|^2 = \|\boldsymbol{a}\|^2 \|\boldsymbol{c}\|^2$$
where $\|\boldsymbol {a}\|^2\|\boldsymbol {c}\|^2$ is the error term for the Cauchy–Schwartz inequality.

Fig. 4. The geometric relationship of three vectors.

2.3.2 Improved similarity function

The matching space used herein is the direction vector of every edge point. Its polar coordinate form is defined as:

$$\begin{array}{l} \boldsymbol{d}_i = (x_i, y_i) = (r_i\cos\theta_i, r_i\sin\theta_i) \end{array}$$
where $r$ is the vector length and $\theta$ is the angle between the point’s direction vector and the horizontal axis. The similarity function (1) can then be rewritten as Eq. (10):
$$\begin{aligned} s(r,c) &= \frac{1}{n}\sum_{i=1}^n \frac{\langle \boldsymbol{d}_i^m, \boldsymbol{d}_{(r+r_i,c+c_i)}^s \rangle}{\|\boldsymbol{d}_i^m \|\cdot \| \boldsymbol{d}_{(r+r_i,c+c_i)}^s \|}\\ &= \frac{1}{n}\sum_{i=1}^n \frac{x_i\, x_{(r+r_i,c+c_i)}+y_i\, y_{(r+r_i,c+c_i)}}{\sqrt{x_i^2+y_i^2}\,\sqrt{x_{(r+r_i,c+c_i)}^2+y_{(r+r_i,c+c_i)}^2}}\\ &= \frac{1}{n}\sum_{i=1}^n \frac{r_i r_{(r+r_i,c+c_i)}(\cos\theta_i\cos\theta_{(r+r_i,c+c_i)}+\sin\theta_i\sin\theta_{(r+r_i,c+c_i)})}{r_i r_{(r+r_i,c+c_i)}}\\ &= \frac{1}{n} \sum_{i=1}^n \cos(\theta_i-\theta_{(r+r_i,c+c_i)}) \end{aligned}$$

We define the direction vector as a complex number $\boldsymbol {d} = x + jy$, where $x = r\cos \theta$ and $y = r\sin \theta$ or $-r\sin \theta$. More specifically, when constructing complex vectors from the template edge vectors, we employ the convention $y = r\sin \theta$, whereas when constructing complex vectors from the gradient vectors of the source image, we use $y = -r\sin \theta$. Then, Eq. (10) can be modified as follows:

$$\begin{aligned} s(r,c) &= \frac{1}{n}\sum_{i = 1}^{n}\cos(\theta_i-\theta_{(r+r_i,c+c_i)})\\ &= \frac{1}{n}\sum_{i=1}^{n} \mathcal{R}\{ \cos(\theta_i-\theta_{(r+r_i,c+c_i)} )+ j\sin(\theta_i-\theta_{(r+r_i,c+c_i)})\}\\ &= \frac{1}{n}\sum_{i=1}^{n} \mathcal{R}\{ (\cos\theta_i + j\sin\theta_i)\cdot (\cos\theta_{(r+r_i,c+c_i)} - j\sin\theta_{(r+r_i,c+c_i)}) \}\\ &= \frac{1}{n}\sum_{i=1}^{n} \mathcal{R}\left\{ \frac{a_i b_{(r+r_i,c+c_i)}}{\left|a_i\right| \left|b_{(r+r_i,c+c_i)}\right|} \right\} \end{aligned}$$
where $\mathcal {R}\{x + jy\} = x$ is the real part for a complex number, and $a_i$ and $b_i$ are the direction vectors expressed in complex numbers as Eq. (11).
$$\left\{ \begin{aligned} a_i = & r_i\cos\theta_i+jr_i\sin\theta_i \\ b_i = & r_{(r+r_i,c+c_i)}\cos\theta_{(r+r_i,c+c_i)} -jr_{(r+r_i,c+c_i)}\sin\theta_{(r+r_i,c+c_i)} \end{aligned} \right.$$
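
A short numerical check of this complex-number formulation (in Python/NumPy) is given below: with unit-length vectors, the real part of $a_i b_{(r+r_i,c+c_i)}$ reproduces the cosine term of Eq. (10). The script is illustrative only; the array size and random angles are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
theta_m = rng.uniform(0, 2 * np.pi, 100)        # template edge-vector angles
theta_s = rng.uniform(0, 2 * np.pi, 100)        # source edge-vector angles

a = np.exp(1j * theta_m)                        # template convention: y = +r*sin(theta)
b = np.exp(-1j * theta_s)                       # source convention:   y = -r*sin(theta)

s_complex = np.mean(np.real(a * b))             # (1/n) * R(a . b)
s_cosine = np.mean(np.cos(theta_m - theta_s))   # cosine form of Eq. (10)
assert np.isclose(s_complex, s_cosine)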

Then, the lengths of all vectors are normalized, i.e., $r_i = 1$ and $r_{(r+r_i,c+c_i)} = 1$. If the lengths of certain vectors in the source image are smaller than the threshold $t$, their $r_{(r+r_i,c+c_i)}$ is set to 0, where the threshold $t$ has been discussed in Subsection 2.1. Thus, we obtain:

$$ s(r,c) = \frac{1}{n}\sum_{i=1}^{n} \mathcal{R}( a_i b_{(r+r_i,c+c_i)}) = \frac{1}{n}\mathcal{R}(\boldsymbol{a} \cdot \boldsymbol{b})$$
where $\boldsymbol {a}=(a_1,a_2,\ldots,a_n)$ and $\boldsymbol {b}=(b_1,b_2,\ldots,b_n)$ represent the template edge point set and the source object edge point set, respectively, and $\mathcal {R}(\boldsymbol {a}\cdot \boldsymbol {b}) = ns(r,c)$. Equation (8) can be modified as:
$$\begin{aligned} \|\boldsymbol{a}\|^2\|\boldsymbol{b}\|^2-\left|\boldsymbol{a}\cdot\boldsymbol{b}\right|^2 &= \|\boldsymbol{a}\|^2\|\boldsymbol{b}\|^2-(\mathcal{R}^2(\boldsymbol{a}\cdot\boldsymbol{b})+\mathcal{I}^2(\boldsymbol{a}\cdot\boldsymbol{b}))\\ &= \|\boldsymbol{a}\|^2\|\boldsymbol{c}\|^2 \end{aligned}$$
where $\mathcal {I}$ is the imaginary part of the complex number $(\boldsymbol {a}\cdot \boldsymbol {b})$. Let $\mathcal {I}^2 = r\mathcal {R}^2$ and $\|\boldsymbol {c}\|^2 = l\|\boldsymbol {b}\|^2$. Then, Eq. (12) can be expressed as:
$$(1-l)\|\boldsymbol{a}\|^2\|\boldsymbol{b}\|^2 = (\boldsymbol{a}\cdot\boldsymbol{b})^2 = (1+r)\mathcal{R}^2 = (1 + r)n^2s^2(r,c)$$
then, we have:
$$s^2(r,c) = \frac{(1-l)\|\boldsymbol{a}\|^2\|\boldsymbol{b}\|^2}{(1+r)n^2}$$
where $\|\boldsymbol {a}\|^2$ is a constant value equal to the number of template edge points, because every element in vector $\boldsymbol {a}$ is normalized, and $\|\boldsymbol {b}\|^2$ is the number of edge vectors whose lengths exceed the threshold $t$ at the corresponding positions in the sliding window. Hence, $\|\boldsymbol {a}\|^2 = n$ and $\|\boldsymbol {b}\|^2 \leq n$. Equation (14) can be modified as follows:
$$s^2(r,c) = \frac{\|\boldsymbol{a}\|^2\cdot \|\boldsymbol{b}\|^2 (1-l)}{n^2(1+r)} = \frac{\|\boldsymbol{b}\|^2(1-l)}{n(1+r)} \geq s_{\min}^2$$
where $s_{\min }\in [-1,1]$ is the similarity threshold. If $s(r,c)$ is smaller than the threshold, the sliding window is assumed to contain no target. Then, we can obtain
$$\|\boldsymbol{b}\|^2 \geq \frac{n(1+r)}{1-l}s_{\min}^2$$

The next step entails estimating the values of $r$ and $l$. Clearly, $r$ is non-negative. From Fig. 4, the length of vector $\boldsymbol {c}$ does not exceed the length of vector $\boldsymbol {b}$, so $l$ ranges from 0 to 1. Thus, we can obtain:

$$\|\boldsymbol{b}\|^2 \geq ns_{\min}^2$$

Finally, a criterion can be summarized: if $\|\boldsymbol {b}\|^2<ns_{\min }^2$ at a position, that position cannot be a candidate result, and the similarity measure is not calculated there. Since the derivation of Eq. (17) does not involve the object angle, the criterion is insensitive to the rotation transform.
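
The criterion can be evaluated for all window positions at negligible cost with a summed-area table. The Python sketch below assumes a precomputed gradient-magnitude image and illustrates only the pruning step; Eq. (1) would then be evaluated only where the returned mask is true. The function name and parameters are placeholders.

import numpy as np

def candidate_mask(grad_mag, tmpl_h, tmpl_w, n_edge_pts, s_min, t=5.0):
    # ||b||^2 for a window = number of gradient vectors longer than t inside it.
    # Keep only positions satisfying ||b||^2 >= n * s_min**2 (Eq. (17)).
    strong = (grad_mag > t).astype(np.int64)
    # Summed-area table: window sums of arbitrary size in O(1) per position.
    sat = np.zeros((strong.shape[0] + 1, strong.shape[1] + 1), dtype=np.int64)
    sat[1:, 1:] = strong.cumsum(0).cumsum(1)
    H = strong.shape[0] - tmpl_h + 1
    W = strong.shape[1] - tmpl_w + 1
    counts = (sat[tmpl_h:tmpl_h + H, tmpl_w:tmpl_w + W]
              - sat[:H, tmpl_w:tmpl_w + W]
              - sat[tmpl_h:tmpl_h + H, :W]
              + sat[:H, :W])
    return counts >= n_edge_pts * s_min ** 2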

2.4 Sub-pixel edge detection

Using the abovementioned methods, the pixel-level matching result can be obtained; however, this is insufficient in industrial scenes. Several subpixel edge extraction algorithms exist [25–28]. Herein, we employ a simple surface-fitting method to detect the subpixel edge points for efficiency.

2.4.1 Surface fitting

The second-order polynomial is used for the surface fitting of the gray image as Eq. (18).

$$G(x, y) = m_0+m_1x+m_2y+m_3xy+m_4x^2+m_5y^2$$

The difference between the actual and fitted gray values is computed and minimized, from which the coefficients $m_i (i = 0,1,\ldots, 5)$ in Eq. (18) are obtained. This procedure can be formalized as Eq. (19).

$$E=\sum_{i=1}^{n}[G(x_i,y_i)-g_i]^2\rightarrow \min$$
where $g_i$ is the gray value of the pixel $(x_i,y_i)$. In practice, the gradient magnitude can be used in place of the pixel value for robustness. By differentiating Eq. (19) and setting the derivatives to zero, the coefficients $m_i$ are obtained with the least squares method:
$$\boldsymbol{L} = \boldsymbol{P}^{+}\boldsymbol{g}$$
where $\boldsymbol {P}^{+} = (\boldsymbol {P}^T\boldsymbol {P})^{-1}\boldsymbol {P}^T$, $\boldsymbol {L} = (m_0,m_1,\ldots, m_5)^T$, and $\boldsymbol {g} = (\sum _{i=1}^{9}g_i, \sum _{i=1}^{9}x_ig_i, \sum _{i=1}^{9}y_ig_i, \sum _{i=1}^{9}x_iy_ig_i, \\ \sum _{i=1}^{9} x_i^2g_i, \sum _{i=1}^{9}y_i^2g_i )^T$ [29]. Moreover,
$$\boldsymbol{P} = \left( \begin{array}{cccccc} \sum 1 & \sum x_i & \sum y_i & \sum x_iy_i & \sum x_i^2 & \sum y_i^2\\ \sum x_i & \sum x_i^2 & \sum x_iy_i & \sum x_i^2y_i & \sum x_i^3 & \sum x_iy_i^2\\ \sum y_i & \sum x_iy_i & \sum y_i^2 & \sum x_iy_i^2 & \sum x_i^2y_i & \sum y_i^3\\ \sum x_iy_i & \sum x_i^2y_i & \sum x_iy_i^2 & \sum x_i^2y_i^2 & \sum x_i^3y_i & \sum x_iy_i^3\\ \sum x_i^2 & \sum x_i^3 & \sum x_i^2y_i & \sum x_i^3y_i & \sum x_i^4 & \sum x_i^2y_i^2\\ \sum y_i^2 & \sum x_iy_i^2 & \sum y_i^3 & \sum x_iy_i^3 & \sum x_i^2y_i^2 & \sum y_i^4\\ \end{array} \right)$$
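
A compact Python/NumPy sketch of this fit for a single 3x3 neighborhood is given below; it solves the least squares problem through numpy.linalg.lstsq rather than forming $\boldsymbol{P}$ explicitly, which is mathematically equivalent. The function name and the choice of a 3x3 window centered at the origin follow the description above but are otherwise assumptions of the sketch.

import numpy as np

def fit_quadric(patch3x3):
    # Least squares fit of G(x,y) = m0 + m1*x + m2*y + m3*x*y + m4*x^2 + m5*y^2
    # to a 3x3 neighborhood centered at (0, 0); returns (m0, ..., m5).
    ys, xs = np.mgrid[-1:2, -1:2]                 # pixel offsets around the center
    x, y = xs.ravel().astype(float), ys.ravel().astype(float)
    g = patch3x3.ravel().astype(float)            # gray values (or gradient magnitudes)
    A = np.column_stack([np.ones(9), x, y, x * y, x ** 2, y ** 2])
    m, *_ = np.linalg.lstsq(A, g, rcond=None)     # same solution as L = P+ g
    return m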

2.4.2 Extract subpixel edge

From the fitted surface $G(x,y)$, the direction vector of a boundary point on the image is written as:

$$\left\{ \begin{aligned} G_x = \frac{\partial G(x,y)}{\partial x} & =m_1+m_3y+2m_4x\\ G_y = \frac{\partial G(x,y)}{\partial y} & =m_2+m_3x+2m_5y \end{aligned} \right.$$

Figure 5 shows a $3\times 3$ subimage centered at $P_4(0,0)$. Plane $\alpha$, which passes through the point $P_4(0,0)$, is parallel to the gradient direction of $G(x,y)$ at $P_4(0,0)$, and is perpendicular to the pixel plane, is defined as $y-y_4 = k(x-x_4)$, with the slope $k = G_{y4}/G_{x4}$. The extreme point of the curve along which $\alpha$ and $G(x,y)$ intersect is the subpixel coordinate corresponding to $P_4(0,0)$. To calculate it, $\alpha$ is combined with Eq. (18), and eliminating the variable $y$ yields:

$$\begin{aligned} G(x) = & m_0+m_1x+m_2(kx-kx_4+y_4)+m_3x(kx-kx_4+\\ & y_4)+m_4x^2+m_5(kx-kx_4+y_4)^2 \end{aligned}$$

Fig. 5. A subimage with $3 \times 3$ centered at $P_4(0,0)$.

Let the first derivative of Eq. (22) be zero to obtain (23).

$$G'(x) = m_1+km_2+2km_3x+2m_4x+2km_5(kx-kx_4+y_4) = 0$$

Then, we have

$$x = \frac{m_1+km_2+2k^2m_5+2km_5y_4}{2km_3+2m_4+2km_5}.$$

Then, Eq. (24) is substituted into $y-y_4=k(x-x_4)$, and we obtain $y$. Finally, the subpixel position of the edge point is $P(x,y)$.
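
The following Python sketch implements this extraction for a patch centered at $P_4(0,0)$, i.e., with $x_4 = y_4 = 0$; under that simplification the extremum along $y = kx$ is derived directly from Eq. (22) rather than by reusing Eq. (24) verbatim. The function name, the small-value guard, and the handling of near-vertical gradients are our own choices for the sketch.

import numpy as np

def subpixel_edge(m, eps=1e-9):
    # m = (m0, ..., m5) from the surface fit; the patch is centered at P4 = (0, 0).
    # The extremum of G restricted to the line y = k*x (k = Gy/Gx at the center,
    # Eq. (21)) is found by setting dG/dx to zero.
    m0, m1, m2, m3, m4, m5 = m
    gx, gy = m1, m2                               # gradient of G at the center
    if abs(gx) < eps:                             # near-vertical gradient: solve along x = k*y
        k = gx / gy
        y = -(m2 + k * m1) / (2.0 * (k * m3 + m5 + k * k * m4) + eps)
        return k * y, y
    k = gy / gx
    x = -(m1 + k * m2) / (2.0 * (k * m3 + m4 + k * k * m5) + eps)
    return x, k * x                               # sub-pixel offset (x, y) from the pixel center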

Using the abovementioned method, the subpixel edge points can be extracted, as shown in Fig. 6. The top figure is the template, and a small magnified portion shows the location and gradient direction of the subpixel edge point $d^m$. The bottom figure is the source image, showing the tangent line $v(a,b,c)$ of the gradient direction at the pixel point $d^s$ that is paired with the subpixel point $d^m$ in the template image.

Fig. 6. Schematic diagram of Sub-pixel results.

2.5 Least squares adjustment

To obtain higher subpixel accuracy, the least squares adjustment method [30] was used to optimize the accuracy of the matching position and rotation angle.

The edge point in the template is denoted as $\boldsymbol {p}_i = (x_i,y_i)$, its homogeneous representation is $\boldsymbol {p}_i = (x_i,y_i,1)$, and its normal direction is denoted as $n_i$. We then need to find the corresponding edge point in the source image. With $(r,c)$ being the coordinates of the top-left corner of the sliding window in the source image, we take $(r+x_i,c+y_i)$ as the base point and search along the normal direction $n_i$ in the source image for the nearest edge point $(x, y)$, which is regarded as the corresponding edge point. The feature line, which is the tangent at the corresponding edge point in the source image, is expressed as $a_ix+b_iy = c_i, (a_i^2+b_i^2 = 1)$, and $\boldsymbol {v_i} = (a_i, b_i, -c_i)$ represents this feature line. The sum of squared distances between all edge points and their corresponding feature lines is

$$E(\hat{x},\hat{y},\theta,\gamma)=\sum_{i = 1}^n\|v_i^T\cdot [T(\hat{x}, \hat{y}, \theta, \gamma)\boldsymbol{p}_i] \|^2$$
where $T(\hat {x}, \hat {y}, \theta, \gamma )$ is a transform matrix represented as:
$$T(\hat{x}, \hat{y}, \theta, \gamma) = \left[ \begin{array}{ccc} \gamma\cos\theta & -\gamma\sin\theta & \hat{x}\\ \gamma\sin\theta & \gamma\cos\theta & \hat{y}\\ 0 & 0 & 1 \end{array} \right]$$
$(\hat {x},\hat {y})$ is the translation vector of the edge point, the rotation angle is $\theta$, and the scale factor is $\gamma =\gamma _0+\Delta \gamma$ ($\gamma _0$ is an initial value). Equation (25) can then be rewritten as (27) using a Taylor expansion.
$$E(\hat{x},\hat{y},\theta,\gamma) = \sum_{i=1}^n (a_i\hat{x}+b_i\hat{y}+\alpha_i\theta s+\beta_i \gamma-c_i)^2$$
The values of $\hat {x}, \hat {y}, \theta$, and $\gamma$ that minimize Eq. (27), obtained via the least squares method, constitute the adjustment of the target position; the least squares solution takes the same form as Eq. (20).
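
As an illustration, the Python sketch below performs one such linearized least squares step. The explicit forms of $\alpha_i$ and $\beta_i$ are not given in Eq. (27), so here they are taken from a first-order expansion of the point-to-line residual around $(\hat{x},\hat{y},\theta,\Delta\gamma) = (0,0,0,0)$ with $\gamma = \gamma_0$; this assumed linearization and the function name are our own choices.

import numpy as np

def least_squares_adjust(pts, lines, gamma0=1.0):
    # pts   : (n, 2) template edge points (x_i, y_i)
    # lines : (n, 3) feature lines (a_i, b_i, c_i) with a_i*x + b_i*y = c_i, a_i^2 + b_i^2 = 1
    # Returns the corrections [x_hat, y_hat, theta, delta_gamma] minimizing Eq. (27).
    x, y = pts[:, 0], pts[:, 1]
    a, b, c = lines[:, 0], lines[:, 1], lines[:, 2]

    # Jacobian of the residual with respect to (x_hat, y_hat, theta, delta_gamma).
    J = np.column_stack([a,
                         b,
                         gamma0 * (b * x - a * y),    # assumed alpha_i
                         a * x + b * y])              # assumed beta_i
    r0 = gamma0 * (a * x + b * y) - c                 # residual at the expansion point
    delta, *_ = np.linalg.lstsq(J, -r0, rcond=None)
    return delta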

3. Experiments

In this section, we collect three sets of images from the SEM to verify the performance of the proposed algorithm under noise, illumination fluctuations, and small rotations, respectively, which are commonly encountered during the imaging process. To provide a comprehensive comparison, we evaluate Lu’s method [31], which is also a shape-based template matching approach designed for SEM images. The classic NCC algorithm is also implemented as a comparison method.

To verify the potential applications of our algorithm in other scenarios, we also collected a synthetic dataset subjected to various types of disturbances to test the robustness of the proposed algorithm. This dataset is also used to test the matching time under various interferences.

In the implementations, all algorithms are programmed in Visual Studio 2022 and executed on a computer with an AMD R7 5800X processor locked at 4.6 GHz.

3.1 Anti-noise test

This part of the experiment verifies the anti-noise ability of our algorithm. First, we collected a group of noise-free images on the SEM. Then, Gaussian noise with different variances $\sigma ^2 = (0.01, 0.1, 0.2, 0.5, 0.8)$ was added to this group of pictures to simulate situations that may be encountered in practice. Some sample images and test results of the proposed algorithm are shown in Fig. 7 (a). Even under the highest noise variance $\sigma ^2 = 0.8$, our algorithm can still accurately find the target position, with $X_{error} = 1.02$ pixels and $Y_{error} = 1.15$ pixels.
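
A minimal Python sketch of this noise injection is shown below; the assumption that the listed variances refer to images normalized to [0, 1] is ours, as is the function name.

import numpy as np

def add_gaussian_noise(img, var, rng=np.random.default_rng()):
    # Add zero-mean Gaussian noise of variance var to an 8-bit image,
    # treating the listed variances as referring to the [0, 1] intensity range.
    norm = img.astype(np.float64) / 255.0
    noisy = norm + rng.normal(0.0, np.sqrt(var), img.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255.0).astype(np.uint8)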

Fig. 7. Some images collected from SEM and the results of our algorithm, the red cross is the detected position.

To assess the effect of noise on matching accuracy, we use the mean and standard deviation (STD) of the Euclidean distances between detected positions and true positions as metrics to compare our method with Lu’s method and NCC; the results are shown in Fig. 8. It can be observed that our algorithm exhibits consistent and superior performance in terms of mean error and STD compared with the other methods as the noise level increases. Notably, when $\sigma ^2$ reaches 0.8, Lu’s method shows a significant increase in error STD, while the NCC method fails to match. Furthermore, our algorithm demonstrates exceptional matching accuracy in the absence of noise, achieving an average error of 0.03 pixels. This is expected because we use two strategies, second-order surface fitting and least squares adjustment, to refine the results.

Fig. 8. Anti-noise test results. NCC fails to match when the variance $\sigma ^2 = 0.8$.

3.2 Photometric fluctuation test

We utilize a pointolite model to simulate the photometric fluctuations that may be encountered during SEM imaging. The pointolite model can be described as

$$I = k(1 - \frac{d^2}{r^2}),\qquad d\leq r$$
where $r$ represents the influence range of the pointolite, $d$ denotes the distance from a position to the center of the pointolite, and $k$ denotes the light intensity factor. In our experiment, $r$ is set to 400, and $k$ is chosen from $(0.05, 0.1, 0.2, 0.5)$. Some pictures and detection results are shown in Fig. 7 (b). It can be observed that our algorithm obtains the target position accurately under every light intensity, demonstrating the robustness of the proposed method to photometric fluctuation.
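
A Python sketch of applying Eq. (28) to a test image is given below; the multiplicative application of the intensity term to the gray levels, the source position, and the function name are assumptions made for illustration.

import numpy as np

def add_pointolite(img, center, r=400.0, k=0.2):
    # Pointolite disturbance of Eq. (28): I = k * (1 - d^2 / r^2) for d <= r.
    rows, cols = np.indices(img.shape[:2])
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    light = np.where(d2 <= r ** 2, k * (1.0 - d2 / r ** 2), 0.0)
    out = img.astype(np.float64) * (1.0 + light)   # assumed multiplicative application
    return np.clip(out, 0, 255).astype(np.uint8)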

Furthermore, Fig. 9 shows the mean errors and STDs of the proposed algorithm, Lu’s method, and NCC. The mean errors of our algorithm are smaller than 0.2 pixels and outperform the other two methods when $k < 0.8$. When $k$ increases to 0.8, NCC loses the ability to capture targets; although our method achieves only suboptimal results in terms of mean error there, it shows the minimal STD, which indicates the stability of our method.

Fig. 9. Photometric fluctuation test results. NCC fails to match when $k = 0.8$.

3.3 Small rotation invariability test

In the SEM imaging process, small angle errors may occur between the source images and the template. Therefore, it is necessary to evaluate the small rotation invariance of the proposed algorithm. To this end, we conducted an experiment in which an image of size 1024 $\times$ 768 was collected on the SEM and a template was cropped from the image at the coordinates (400, 465). The image is then rotated by -0.5$^{\circ }$, 0.5$^{\circ }$, and 1$^{\circ }$ around its center, and the theoretical true positions $P_t$ of the targets after rotation are calculated. Finally, our proposed algorithm, NCC, and Lu’s method are applied to these rotated images, and the resulting matching positions $P_{our}$ and $P_{lu}$ are compared in Table 1. All algorithms can capture the targets, but our method has the smallest errors in both the $x$ and $y$ axes, indicating that our algorithm achieves optimal results under all rotation angles.
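
For reference, the theoretical position after an in-plane rotation about the image center can be computed as sketched below (Python); the sign convention of the angle depends on the rotation routine used to generate the images, so it is stated here only as an assumption.

import numpy as np

def rotated_position(pt, center, angle_deg):
    # Theoretical target position after rotating the image by angle_deg about center.
    # Image coordinates with y pointing down are assumed; flip the sign of the angle
    # if the rotation routine uses the opposite convention.
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    c = np.asarray(center, float)
    return c + R @ (np.asarray(pt, float) - c)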

Table 1. The matching results under small rotation

3.4 Robustness test

To verify the potential applications of our algorithm in other scenarios and to comprehensively evaluate its performance under various disturbances, we collected datasets under five kinds of disturbances: rotation, scale change, occlusion, blur, and nonlinear illumination. The rotation part is generated by sequentially rotating the same image from 0$^{\circ }$ to 360$^{\circ }$ in steps of 5$^{\circ }$. Similarly, the scale change part is generated with scale factors from 0.8x to 1.2x in steps of 0.02x. The nonlinear illumination dataset is collected under different illumination conditions. All images are $640 \times 480$ with 8-bit color depth, and certain test images are shown in Fig. 10. Furthermore, we selected NCC, global-aware diversity (GAD) based template matching [6], and traditional shape-based (S-B) template matching [23] as the comparison methods.

Fig. 10. Examples of the dataset. The images in the first row are templates, the second and third rows are part of the images under different interferences.

The success rate (SR) and mean intersection-over-union (MIoU), which are widely used for evaluating template matching algorithms, were adopted as indices of matching precision under various interferences. Both indices are defined through the IoU, whose calculation is shown in Eq. (29), where $GT_i$ is the ground truth box of target $i$ and $D_i$ is the detection box computed by the matching algorithms. The sizes of the ground truth and detection boxes are the same as the template image size multiplied by the detected scaling coefficient $\gamma$. SR and MIoU are then defined as shown in Eq. (30) and Eq. (31), where $n$ is the number of targets and $\mathbb {I}(\cdot )$ is an indicator function. $\mathbb{I}(IoU \geq 0.7)$ signifies that if the IoU is greater than or equal to 0.7, the function outputs the original IoU value; otherwise, it outputs 0. This function serves to exclude low-quality detection results and emphasizes high-quality detection outcomes.

$$IoU(i) = \frac{D_i \bigcap GT_i}{D_i \bigcup GT_i}$$
$$SR = \frac{1}{n} \sum_{i}^{n}IoU(i)$$
$$MIoU = \frac{1}{n} \sum_{i}^{n}\mathbb{I}(IoU \geq 0.7)$$
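
The Python sketch below computes these indices exactly as defined in Eqs. (29)-(31); the box format (x0, y0, x1, y1) and the function names are arbitrary choices for the illustration.

import numpy as np

def iou(box_d, box_gt):
    # Intersection over union of two axis-aligned boxes given as (x0, y0, x1, y1).
    ix0, iy0 = max(box_d[0], box_gt[0]), max(box_d[1], box_gt[1])
    ix1, iy1 = min(box_d[2], box_gt[2]), min(box_d[3], box_gt[3])
    inter = max(ix1 - ix0, 0) * max(iy1 - iy0, 0)
    area_d = (box_d[2] - box_d[0]) * (box_d[3] - box_d[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    return inter / (area_d + area_gt - inter)

def sr_miou(detections, ground_truths, thr=0.7):
    # SR averages the raw IoU values (Eq. (30)); MIoU zeroes out detections
    # whose IoU falls below thr before averaging (Eq. (31)).
    ious = np.array([iou(d, g) for d, g in zip(detections, ground_truths)])
    return ious.mean(), np.where(ious >= thr, ious, 0.0).mean()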

The matching precision is presented in Table 2. Our method obtained the highest scores under four of the five types of disturbances. Although NCC obtains the highest score under blur, our algorithm obtains a comparable score with only a slight disadvantage. For rotation and scale change, our method obtained the highest scores. GAD cannot detect the target angle, so it yields a low score when the target angle changes. Note that our algorithm makes significant progress in dealing with occlusion and nonlinear illumination, obtaining scores far higher than NCC and GAD. This outcome is because NCC calculates the similarity function pixel by pixel, whereas occlusion and nonlinear illumination change the gray levels of local pixels over a large range. The GAD algorithm can only obtain a rough target position; therefore, it achieves only low scores under most disturbance types, although it has the shortest matching time.

Table 2. Matching precision under blur, occlusion, nonlinear illumination, rotation, and scale change.

We also test our algorithm’s angle detection ability when there are large angle deviations between the template and source images. In this test, a chip image is rotated from $0^{\circ }$ to $355^{\circ }$ in steps of $5^{\circ }$ to obtain 72 rotated images. Some of the images are shown in Fig. 10 (d), from which it can be seen that the target is tiny, so the number of available features is small in this case. Because GAD cannot detect the target angle, it does not participate in this comparison. Partial detection results, selected every 35$^{\circ }$, are shown in Table 3; our method achieved the lowest angle error on all images except the 0$^{\circ }$ image. The average angle error of our algorithm over all 72 images is 0.055$^{\circ }$, whereas those of NCC and traditional shape-based matching are 1.295$^{\circ }$ and 1.680$^{\circ }$, respectively. Our algorithm therefore makes considerable progress in angle accuracy.

Table 3. Partial results of angle detection

3.5 Matching time

The proposed algorithm’s effectiveness in dealing with various disturbances is demonstrated in Fig. 11, where it shows a relatively short matching time of approximately 3 ms under noise, blur, occlusion, and nonlinear illumination. For disturbances such as scale change and rotation, the average matching time increases to 8.3 ms and 17.3 ms, respectively, mainly because of the multiple matching passes: to tackle the low similarity scores caused by scale change and rotation, we generate templates at different scales or rotation angles and perform sequential matching on the source images. Of all the algorithms tested, GAD has the shortest matching time of 2 ms, but it takes considerable time to build the NNF from the source and template images before matching; as a result, its real running time amounts to several seconds. In conclusion, compared with the other methods, our algorithm’s matching time is significantly lower, demonstrating its exceptional capacity for saving matching time.

Fig. 11. Average matching time under different disturbances. GAD* includes the matching and NNF building times. Non-ill and Sc-ch are the abbreviations of nonlinear illumination and scale change.

4. Conclusion

We propose an improved shape-based template matching method for quickly and accurately detecting targets in scanning electron microscope (SEM) images. Our method is derived from the Cauchy-Schwartz inequality and employs a simple threshold on the number of gradient vectors to exclude sliding windows that cannot contain targets. This approach greatly reduces the matching time. Additionally, surface fitting and least squares adjustment are used to enhance the matching accuracy.

Our algorithm has been tested on SEM images to evaluate noise immunity, photometric fluctuation, and small rotation invariability. Results demonstrate the effectiveness of our proposed algorithm. We further conducted a comprehensive comparison with other template matching algorithms under various interferences to confirm the robustness of our method. To assess time efficiency, we compared our algorithm with three classical algorithms. Our results demonstrate that our proposed algorithm excels in matching time efficiency. Overall, these findings provide strong evidence of the practicality and efficiency of our proposed algorithm for template matching tasks on SEM.

Funding

National Natural Science Foundation of China (52175459).

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (52175459), in part by the Basic Research Key Project of the Shenzhen Science and Technology Plan (JCYJ20200109113416531), and in part by the Key Technical Projects of the Shenzhen Science and Technology Plan (JSGG20200701095007013).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper may be obtained from the authors upon request.

References

1. S. Pang, X. Zhang, H. Li, and Y. Lu, “Edge determination improvement of scanning electron microscope images by inpainting and anisotropic diffusion for measurement and analysis of microstructures,” Measurement 176, 109217 (2021). [CrossRef]  

2. F. J. Giessibl, “Advances in atomic force microscopy,” Rev. Mod. Phys. 75(3), 949–983 (2003). [CrossRef]  

3. S. Yao, H. Li, S. Pang, B. Zhu, X. Zhang, and S. Fatikow, “A review of computer microvision-based precision motion measurement: Principles, characteristics, and applications,” IEEE Trans. Instrum. Meas. 70, 1–28 (2021). [CrossRef]  

4. P. Swaroop and N. Sharma, “An overview of various template matching methodologies in image processing,” International Journal of Computer Applications 153(10), 8–14 (2016). [CrossRef]  

5. A. Crispin and V. Rankov, “Automated inspection of pcb components using a genetic algorithm template-matching approach,” The International Journal of Advanced Manufacturing Technology 35(3-4), 293–300 (2007). [CrossRef]  

6. Y. Lan, X. Wu, and Y. Li, “Gad: A global-aware diversity-based template matching method,” IEEE Trans. Instrum. Meas. 71, 1–13 (2022). [CrossRef]  

7. S. Fatikow and T. Sievers, “Real-time object tracking for the robot-based nanohandling in a scanning electron microscope,” Journal of Micromechatronics 3(3), 267–284 (2006). [CrossRef]  

8. Z. Gong, B. K. Chen, J. Liu, and Y. Sun, “Robotic probing of nanostructures inside scanning electron microscopy,” IEEE Trans. Robot. 30(3), 758–765 (2014). [CrossRef]  

9. L. Cui, E. Marchand, S. Haliyo, and S. Régnier, “Three-dimensional visual tracking and pose estimation in scanning electron microscopes,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (IEEE, 2016), pp. 5210–5215.

10. J. Coughlan, A. Yuille, C. English, and D. Snow, “Efficient deformable template detection and localization without user initialization,” Computer Vision and Image Understanding 78(3), 303–319 (2000). [CrossRef]  

11. P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained part-based models,” IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010). [CrossRef]  

12. X. Wang, “An accurate and distraction-free vision-based structural displacement measurement method integrating siamese network based tracker and correlation-based template matching,” Measurement 179, 109506 (2021). [CrossRef]  

13. R. Boia, C. Florea, L. Florea, and R. Dogaru, “Logo localization and recognition in natural images using homographic class graphs,” Machine Vision and Applications 27(2), 287–301 (2016). [CrossRef]  

14. M. Ryan and N. Hanafiah, “An examination of character recognition on id card using template matching approach,” Procedia Computer Science 59, 520–529 (2015). [CrossRef]  

15. Q. Kong, Z. Wu, and Y. Song, “Online detection of external thread surface defects based on an improved template matching algorithm,” Measurement 195, 111087 (2022). [CrossRef]  

16. A. Malti, R. Hartley, A. Bartoli, and J.-H. Kim, “Monocular template-based 3d reconstruction of extensible surfaces with local linear elasticity,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2013), pp. 1522–1529.

17. S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski, “A comparison and evaluation of multi-view stereo reconstruction algorithms,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 1, (IEEE, 2006), pp. 519–528.

18. L. Lin and L. Zhen, “Mems chip quality detection by fast image matching method,” in 2014 9th International Conference on Computer Science & Education, (IEEE, 2014), pp. 635–639.

19. D.-M. Tsai and C.-K. Huang, “Defect detection in electronic surfaces using template-based fourier image reconstruction,” IEEE Trans. Compon., Packag. Manufact. Technol. 9(1), 163–172 (2019). [CrossRef]  

20. J. Cheng, C. Leng, J. Wu, H. Cui, and H. Lu, “Fast and accurate image matching with cascade hashing for 3d reconstruction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2014), pp. 1–8.

21. H. Wu, W. Gao, and X. Xu, “Solder joint recognition using mask r-cnn method,” IEEE Trans. Compon., Packag. Manufact. Technol. 10(3), 525–530 (2020). [CrossRef]  

22. C. Solorzano and D.-M. Tsai, “Environment-adaptable printed-circuit board positioning using deep reinforcement learning,” IEEE Trans. Compon., Packag. Manufact. Technol. 12(2), 382–390 (2022). [CrossRef]  

23. S. Wang, H. Wang, Y. Zhou, J. Liu, P. Dai, X. Du, and M. A. Wahab, “Automatic laser profile recognition and fast tracking for structured light measurement using deep learning and template matching,” Measurement 169, 108362 (2021). [CrossRef]  

24. C. Steger, “Occlusion, clutter, and illumination invariant object recognition,” International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences 34, 345–350 (2002).

25. S. L. Tanimoto, “Template matching in pyramids,” Computer Graphics and Image Processing 16(4), 356–369 (1981). [CrossRef]  

26. D. G. Bailey, “Sub-pixel profiling,” in 2005 5th International Conference on Information Communications & Signal Processing, (IEEE, 2005), pp. 1311–1315.

27. A. Fabijańska, “A survey of subpixel edge detection methods for images of heat-emitting metal specimens,” International Journal of Applied Mathematics and Computer Science 22(3), 695–710 (2012). [CrossRef]  

28. E. P. Lyvers, O. R. Mitchell, M. L. Akey, and A. P. Reeves, “Subpixel measurements using a moment-based edge operator,” IEEE Transactions on Pattern Analysis and Machine Intelligence 11(12), 1293–1309 (1989). [CrossRef]  

29. F. Da and H. Zhang, “Sub-pixel edge detection based on an improved moment,” Image and Vision Computing 28(12), 1645–1658 (2010). [CrossRef]  

30. G. Zou, “Fast template matching based on geometric feature,” (2011).

31. M. Sester, “Generalization based on least squares adjustment,” International Archives of Photogrammetry and Remote Sensing 33, 931–938 (2000).
