CN110647925A - Rigid object identification method and device based on improved LINE-MOD template matching


Info

Publication number
CN110647925A
CN110647925A
Authority
CN
China
Prior art keywords
image
rigid object
gradient direction
line
template matching
Prior art date
Legal status
Pending
Application number
CN201910842182.XA
Other languages
Chinese (zh)
Inventor
王月
范志鹏
罗志勇
夏文彬
帅昊
唐文平
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN201910842182.XA priority Critical patent/CN110647925A/en
Publication of CN110647925A publication Critical patent/CN110647925A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses a rigid object identification method and device based on improved LINE-MOD template matching, relating to the recognition of objects with little or no texture. Current feature-descriptor methods are easily disturbed by complex backgrounds when computing descriptors for the target object, and deep-learning methods suffer from complex network structures and training procedures. The LINE-MOD template matching method, one of the most advanced template matching methods of recent years, is an effective way to recognize low-texture rigid objects and offers good real-time performance. Its main drawback, however, is that it can only identify low-texture rigid objects at a fixed scale, and extending the algorithm to multi-scale recognition remains an open problem. The invention therefore provides a rigid object identification method and device based on improved LINE-MOD template matching, so that the target object can be identified at different scales.

Description

Rigid object identification method and device based on improved LINE-MOD template matching
Technical Field
The invention belongs to the field of computer vision processing, and particularly relates to a rigid object identification method and device based on improved LINE-MOD template matching.
Background
In template matching, a model of the target object is first trained; the template image is then slid over the acquired image as a window, and the best match is located by a similarity measure. The approach is well suited to recognizing low-texture rigid objects: it needs neither a huge training data set nor a time-consuming training stage, and it performs well on rigid objects with little or no texture. Early template matching methods and their extensions used the Chamfer distance to measure the difference between template and test-image contours. Gavrila, for example, proposed a coarse-to-fine Chamfer distance metric on binary edge images over shape and parameter space. Chamfer matching minimizes the generalized distance between two sets of edge points, but it is extremely sensitive to outliers caused by occlusion and the like. Another distance metric for binary images is the Hausdorff distance, which takes the maximum, over all edge points in the image, of the distance to the nearest neighbor in the template; it, too, is susceptible to occlusion and complex backgrounds. Huttenlocher et al. tried to overcome this drawback by introducing a generalized Hausdorff distance, which mitigates occlusion and background clutter to some extent, but it requires a prior estimate of the background clutter and becomes computationally expensive when there are many templates. The binary images used by these methods are mostly obtained by Canny-like edge extraction, which makes them extremely sensitive to illumination changes, noise and blur. To avoid these shortcomings, Hinterstoisser et al. proposed the LINE-MOD method, which uses image gradients rather than image contours as the matching feature: gradient directions are encoded in a binary representation that skillfully exploits the caches of modern processors for parallel processing. The method can detect multiple classes of low-texture rigid objects in real time in a complex background environment, but it can only identify rigid objects at a fixed scale.
On the basis of in-depth study of methods for identifying target objects with little or no texture, and considering that the LINE-MOD template matching method can only identify low-texture rigid objects at a fixed scale, the invention provides a rigid object identification method and device based on improved LINE-MOD template matching that extend the algorithm to multi-scale recognition, so that the target object can be identified at different scales.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A rigid object identification method and device based on improved LINE-MOD template matching are provided. The technical scheme of the invention is as follows:
a rigid object identification method and device based on improved LINE-MOD template matching comprises the following steps:
step 1, image preprocessing: acquiring a rigid object image on line, and carrying out smooth denoising and sharpening on the rigid object image by using Gaussian filtering and a Laplace operator in sequence to remove noise of the rigid object image and keep edge information of the image;
step 2, rigid object feature offline extraction: in the off-line stage, a rigid object model is trained in a CAD environment, image acquisition under multiple viewing angles is carried out on the rigid object model and is used as a reference image, and gradient direction descriptors are extracted on the obtained reference image and are stored in an XML file; providing an image template for the online identification of the rigid object in the step 3;
step 3, rigid object online identification: acquiring video frames of the rigid object in a real scene in the online stage, preprocessing each frame as in step 1 and extracting its image gradient direction descriptors, and matching the descriptors extracted online with the gradient direction descriptors extracted in the offline stage by the improved LINE-MOD template matching method.
Further, step 1 performs smoothing denoising and then sharpening on the rigid object image using Gaussian filtering and the Laplacian operator in turn, specifically:
assuming that the input image is f(x, y) and the image obtained after Gaussian filtering is g(x, y), the transformation between the two is

$$g(x, y) = f(x, y) \otimes h(x, y)$$

where $\otimes$ denotes image convolution, x and y are the x and y coordinates in the image coordinate system, and h(x, y) is a Gaussian template that can be written as

$$h(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

where σ is the standard deviation of the Gaussian function and describes the spread of the data: a larger σ gives a stronger smoothing effect on the image, while a smaller σ better preserves the image edge information. The Gaussian-filtered image g(x, y) has some noise removed but appears blurred relative to the input image f(x, y), the degree of blurring depending on σ. After obtaining g(x, y), it is further processed with the Laplace operator:

$$\nabla^{2} g(x, y) = \nabla^{2}\left[f(x, y) \otimes h(x, y)\right] = f(x, y) \otimes \nabla^{2} h(x, y)$$

where $\nabla^{2} g(x, y)$ is the image after Laplacian processing, h(x, y) is the Gaussian template, f(x, y) is the input image, and $\nabla^{2} h(x, y)$ is the second derivative of the Gaussian operator.
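To make this step concrete, here is a minimal Python sketch of the preprocessing using OpenCV; the kernel size and σ are assumed values (the patent fixes neither), and subtracting the Laplacian response from the smoothed image is one common way to realize the sharpening described above.

```python
# A sketch of the step-1 preprocessing: Gaussian smoothing, then Laplacian
# sharpening. Kernel size and sigma are illustrative assumptions.
import cv2
import numpy as np

def preprocess(image: np.ndarray, ksize: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Smooth with a Gaussian filter, then sharpen with the Laplacian."""
    # g(x, y) = f(x, y) (*) h(x, y): suppress noise before edge enhancement.
    smoothed = cv2.GaussianBlur(image, (ksize, ksize), sigma)
    # Laplacian of the smoothed image, i.e. filtering with the second
    # derivative of the Gaussian (the Laplacian-of-Gaussian response).
    laplacian = cv2.Laplacian(smoothed, cv2.CV_64F, ksize=3)
    # Subtract the Laplacian to sharpen edges while keeping the denoised base.
    sharpened = np.clip(smoothed.astype(np.float64) - laplacian, 0, 255)
    return sharpened.astype(np.uint8)
```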
Further, the specific process of acquiring the multi-view rigid object image in the step 2 is as follows:
a virtual camera in the CAD environment acquires images of the CAD model of the rigid object, capturing projection images of the model from different viewing angles, each containing a different shape of the rigid object's contour; to distribute the sampling viewpoints uniformly and avoid sampling poles on the hemisphere surface, sampling is performed on the surface of a regular icosahedron, each face of which is divided evenly into four parts and iteratively subdivided; each face of the regular icosahedron is iterated only 2 times, forming 16 equilateral triangles per face; in the CAD environment, the CAD model of the rigid object is placed at the center of the regular icosahedron, with the optical axis of the virtual camera always passing through the center; shape feature descriptors of the rigid object contour are extracted in all views, and the shape feature descriptors of all rigid object contours together with the corresponding pose parameters are stored in an XML file.
Further, the extraction of gradient direction descriptors from the acquired reference images in step 2 specifically includes:

for each sampled image, the gradient direction is computed on the three color channels {R, G, B} at each position of the rigid object contour, and the direction of the channel with the largest gradient magnitude is taken as the gradient direction at that point; for an input image I, the gradient direction at position x is

$$I_g(x) = \operatorname{ori}\big(\hat{C}(x)\big), \qquad \hat{C}(x) = \operatorname*{arg\,max}_{C \in \{R, G, B\}} \left\| \frac{\partial C}{\partial x} \right\|$$

where C ∈ {R, G, B} ranges over the red, green and blue channels of view I at position x, Ĉ(x) is the channel with the largest gradient magnitude at x, I_g(x) is the gradient direction at position x in view I, and ori() denotes the image gradient direction.
Further, in step 3, the gradient direction descriptors of the image extracted online are matched with the gradient direction descriptors extracted in the offline stage by the improved LINE-MOD template matching method; the similarity measure is

$$\varepsilon(I, \mathcal{T}, c) = \sum_{r \in \mathcal{P}} \left( \max_{t \in \mathcal{R}(c + r)} \big| \cos\big(\operatorname{ori}(O, r) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c is the common position in the input image and the template image, ori(O, r) is the gradient direction in radians at position r of the reference image O, R(c + r) is the neighborhood of size τ centered at c + r, ori(I, t) is the gradient direction in radians at point t of the input image I, and P is the list of positions r.
Further, in order to complete the identification of the target object at any scale, depth information is introduced into the similarity measure to give it scale invariance:

$$\varepsilon(I, \mathcal{T}, c'_I) = \sum_{r' \in \mathcal{P}} \left( \max_{t \in \mathcal{R}\left(c'_I + r' \frac{D'_o}{D(c'_I)}\right)} \big| \cos\big(\operatorname{ori}\big(O, \mathcal{S}_o(c'_o, r')\big) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c'_I is the common position in the input image and the template image, O is the reference image, S_o(c'_o, r') maps the template offset r' to the corresponding position of the reference image O, ori(O, S_o(c'_o, r')) is the gradient direction at that position of the reference image O, ori(I, t) is the gradient direction at point t of the input image I, P is the list of positions r', D(c'_I) is the depth value at point c'_I, D'_o is the distance from the camera of reference image O to the center of the regular polyhedron in offline training, and R(c'_I + r' D'_o/D(c'_I)) is the neighborhood of size τ centered at c'_I + r' D'_o/D(c'_I).
A rigid object identification device based on improved LINE-MOD template matching comprises the following modules:
an image preprocessing module: acquiring a rigid object image on line, and carrying out smooth denoising and sharpening on the rigid object image by using Gaussian filtering and a Laplace operator in sequence to remove noise of the rigid object image and keep edge information of the image;
rigid object feature offline extraction module: in the off-line stage, a rigid object model is trained in a CAD environment, image acquisition under multiple viewing angles is carried out on the rigid object model and is used as a reference image, and gradient direction descriptors are extracted on the obtained reference image and are stored in an XML file; providing an image template for online identification of the rigid object in the rigid object online identification module;
rigid object online identification module: acquires video frames of the rigid object in a real scene in the online stage, performs image preprocessing and image gradient direction descriptor extraction on each frame, and matches the descriptors extracted online with those extracted in the offline stage by the improved LINE-MOD template matching method; the main improvement is that depth information is introduced into the similarity measure, so that rigid object identification can be completed at any scale.
The invention has the following advantages and beneficial effects:
the innovation point of the invention is to provide an improved LINE-MOD template matching rigid object recognition algorithm, which adopts a binary system mode of gradient direction characteristics to express and skillfully uses the cache of a modern computer to carry out parallel processing. The method can identify multiple types of non-texture rigid objects in real time in a complex background environment. However, the method can only identify the rigid object at a fixed scale, so that the method can identify the target object at different scales in order to improve the multi-scale identification of the algorithm.
The method comprises the following specific steps:
(1) Image preprocessing: Gaussian filtering and the Laplacian operator are applied in turn to smooth, denoise and sharpen the image, effectively removing noise while preserving the edge information of the image.
(2) Rigid object feature offline extraction: in the offline stage, a rigid object model is trained in a CAD environment; images of the model are acquired from multiple viewing angles as reference images, and gradient direction descriptors are extracted from them and stored in an XML file.
(3) Rigid object online identification: video frames of the rigid object in a real scene are acquired in the online stage; each frame undergoes image preprocessing and image gradient direction descriptor extraction, and the descriptors extracted online are matched with those extracted in the offline stage by the improved LINE-MOD template matching method. If the image used to generate the template in the offline stage was acquired far from the camera while the image acquired online is close to the camera, the similarity measure of step 3 still reports a mismatch even when the object in the online image and the object in the template have the same gradient directions. To recognize the object to be assembled at any scale, depth information is therefore introduced into the similarity measure to give it scale invariance.
(4) To complete the identification of the target object at any scale, the improved LINE-MOD template-matching rigid object identification algorithm introduces depth information into the similarity measure, giving it scale invariance.
The advantages of the method are a simple processing pipeline and a strong ability to handle different objects: it requires neither a large training set nor a time-consuming training stage, and it handles both textured and texture-less rigid objects.
Drawings
FIG. 1 is a flow chart of a rigid object recognition algorithm based on improved LINE-MOD template matching according to a preferred embodiment of the present invention;
fig. 2 is a flowchart of the off-line extraction of the rigid object features according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
the invention provides an efficient domain ontology semantic similarity calculation algorithm, and on the basis of in-depth research on a method for identifying a few-texture or non-texture target object, the LINE-MOD template matching method is considered to be capable of identifying a few-texture rigid object only at a fixed scale, and how to improve multi-scale identification of the algorithm is mainly solved through the step 3, so that the method can identify the target object at different scales. The invention will be described in further detail below with reference to the accompanying drawings and examples.
As shown in fig. 1, the flow chart of the rigid object recognition algorithm based on the improved LINE-MOD template matching of the present invention is specifically implemented as follows:
1. Image preprocessing: Gaussian filtering and the Laplacian operator are applied in turn to smooth, denoise and sharpen the image, effectively removing noise while preserving the edge information of the image.
2. Rigid object feature offline extraction: in the offline stage, a rigid object model is trained in a CAD environment; images of the model are acquired from multiple viewing angles as reference images, and gradient direction descriptors are extracted from them and stored in an XML file.
3. Rigid object online identification: video frames of the rigid object in a real scene are acquired in the online stage; each frame undergoes image preprocessing and image gradient direction descriptor extraction, and the descriptors extracted online are matched with the gradient direction descriptors extracted in the offline stage by the improved LINE-MOD template matching method. The similarity measure is

$$\varepsilon(I, \mathcal{T}, c) = \sum_{r \in \mathcal{P}} \left( \max_{t \in \mathcal{R}(c + r)} \big| \cos\big(\operatorname{ori}(O, r) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c is the common position in the input image and the template image, ori(O, r) is the gradient direction in radians at position r of the reference image O, R(c + r) is the neighborhood of size τ centered at c + r, ori(I, t) is the gradient direction in radians at point t of the input image I, and P is the list of positions r.
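A minimal Python sketch of this similarity measure follows; it assumes orientations are stored as per-pixel angles in radians and that a template is a list of (offset, orientation) pairs extracted offline. This data layout is an illustrative assumption, not part of the patent.

```python
# Sketch of the LINE-MOD similarity measure: for each template feature,
# take the best |cos| orientation agreement in a tau x tau window.
import numpy as np

def similarity(input_ori: np.ndarray, template, c, tau: int = 3) -> float:
    """Sum over template features of max_t |cos(ori(O, r) - ori(I, t))|
    over the tau x tau neighborhood centered at c + r."""
    h, w = input_ori.shape
    half, score = tau // 2, 0.0
    for (ry, rx), t_ori in template:        # r = (ry, rx), ori(O, r) = t_ori
        cy, cx = c[0] + ry, c[1] + rx       # window center c + r
        ys, ye = max(cy - half, 0), min(cy + half + 1, h)
        xs, xe = max(cx - half, 0), min(cx + half + 1, w)
        window = input_ori[ys:ye, xs:xe]
        if window.size:
            score += np.max(np.abs(np.cos(t_ori - window)))
    return score
```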
4. In order to complete the identification of a target object at any scale, the improved LINE-MOD template-matching rigid object identification algorithm introduces depth information into the similarity measure to give it scale invariance; the specific form is

$$\varepsilon(I, \mathcal{T}, c'_I) = \sum_{r' \in \mathcal{P}} \left( \max_{t \in \mathcal{R}\left(c'_I + r' \frac{D'_o}{D(c'_I)}\right)} \big| \cos\big(\operatorname{ori}\big(O, \mathcal{S}_o(c'_o, r')\big) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c'_I is the common position in the input image and the template image, O is the reference image, S_o(c'_o, r') maps the template offset r' to the corresponding position of the reference image O, ori(O, S_o(c'_o, r')) is the gradient direction at that position of the reference image O, ori(I, t) is the gradient direction at point t of the input image I, P is the list of positions r', D(c'_I) is the depth value at point c'_I, D'_o is the distance from the camera of reference image O to the center of the regular polyhedron in offline training, and R(c'_I + r' D'_o/D(c'_I)) is the neighborhood of size τ centered at c'_I + r' D'_o/D(c'_I).
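The depth-scaled variant can be sketched in the same style. Rescaling the template offsets by the ratio D'_o / D(c') is our reading of the formula above; treat the exact scaling map as an assumption of this sketch.

```python
# Sketch of the depth-scaled similarity: template offsets are rescaled by
# D'_o / D(c') so a template trained at distance D'_o matches at other depths.
# The scaling map is an assumption based on the glossary above.
import numpy as np

def similarity_depth(input_ori, input_depth, template, c, D_o: float, tau: int = 3) -> float:
    """Like `similarity`, but offsets r' are scaled by the depth ratio."""
    h, w = input_ori.shape
    d = float(input_depth[c[0], c[1]])
    if d <= 0:                              # no valid depth measurement here
        return 0.0
    scale = D_o / d                         # template grows as object nears
    half, score = tau // 2, 0.0
    for (ry, rx), t_ori in template:
        cy = c[0] + int(round(ry * scale))  # scaled window center
        cx = c[1] + int(round(rx * scale))
        ys, ye = max(cy - half, 0), min(cy + half + 1, h)
        xs, xe = max(cx - half, 0), min(cx + half + 1, w)
        window = input_ori[ys:ye, xs:xe]
        if window.size:
            score += np.max(np.abs(np.cos(t_ori - window)))
    return score
```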
Fig. 2 is a flow chart of the offline extraction of rigid object features according to the present invention; it describes the offline part of the rigid object recognition algorithm based on improved LINE-MOD template matching of Fig. 1 in more detail. It is implemented as follows:
1. Multi-view rigid object image acquisition

A virtual camera in the CAD environment acquires images of the CAD model of the rigid object; projection images of the model must be captured from different viewing angles, and each image contains a different shape of the rigid object's contour. To distribute the sampling viewpoints uniformly and avoid the image redundancy caused by an excessive sampling density at the poles of the hemisphere, sampling is performed on the surface of a regular icosahedron, each face of which is divided evenly into four parts and iteratively subdivided. To balance speed and accuracy during image acquisition, only 2 iterations are applied to each face, forming 16 equilateral triangles per face. In the CAD environment, the CAD model of the rigid object is placed at the center of the regular icosahedron, with the virtual camera's optical axis always passing through the center. Shape feature descriptors of the rigid object contour are extracted in all views, and the shape feature descriptors together with the corresponding pose parameters are saved in an XML file. A sketch of this viewpoint sampling is given below.
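For illustration, the following Python sketch generates the viewpoint set exactly this way: the faces of a regular icosahedron are subdivided twice (each triangle split into four), the new vertices are projected onto the enclosing sphere, and each vertex, scaled by the desired camera distance, serves as a virtual-camera position whose optical axis passes through the center. The CAD rendering itself is outside this sketch.

```python
# Viewpoint sampling by icosahedron subdivision: 2 iterations give
# 16 triangles per original face, with vertices on the unit sphere.
import numpy as np

def icosahedron():
    phi = (1.0 + 5 ** 0.5) / 2.0
    verts = np.array([
        [-1, phi, 0], [1, phi, 0], [-1, -phi, 0], [1, -phi, 0],
        [0, -1, phi], [0, 1, phi], [0, -1, -phi], [0, 1, -phi],
        [phi, 0, -1], [phi, 0, 1], [-phi, 0, -1], [-phi, 0, 1]], float)
    verts /= np.linalg.norm(verts, axis=1, keepdims=True)
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return list(verts), faces

def subdivide(verts, faces, iterations=2):
    for _ in range(iterations):
        new_faces, midpoint = [], {}
        def mid(i, j):
            key = (min(i, j), max(i, j))
            if key not in midpoint:
                m = (verts[i] + verts[j]) / 2.0
                verts.append(m / np.linalg.norm(m))   # project onto sphere
                midpoint[key] = len(verts) - 1
            return midpoint[key]
        for a, b, c in faces:
            ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces
    return np.array(verts), faces

viewpoints, _ = subdivide(*icosahedron())   # each row: one camera direction
```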
2. Gradient direction extraction
For each sampled image, the gradient direction is computed on the three color channels {R, G, B} at each position of the rigid object contour, and the direction of the channel with the largest gradient magnitude is taken as the gradient direction at that point. For an input image I, the gradient direction at position x is

$$I_g(x) = \operatorname{ori}\big(\hat{C}(x)\big), \qquad \hat{C}(x) = \operatorname*{arg\,max}_{C \in \{R, G, B\}} \left\| \frac{\partial C}{\partial x} \right\|$$

where C ∈ {R, G, B} ranges over the red, green and blue channels of view I at position x, Ĉ(x) is the channel with the largest gradient magnitude at x, and ori() denotes the image gradient direction.
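The sketch below illustrates this multi-channel gradient computation; the use of Sobel derivatives is an assumption (the patent does not name a gradient estimator), and OpenCV's BGR channel order is used.

```python
# Per pixel, keep the gradient orientation of the color channel with the
# largest gradient magnitude (Sobel derivatives as an assumed estimator).
import cv2
import numpy as np

def gradient_orientation(bgr: np.ndarray):
    img = bgr.astype(np.float32)
    mags, oris = [], []
    for ch in range(3):                     # B, G, R channels
        gx = cv2.Sobel(img[:, :, ch], cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(img[:, :, ch], cv2.CV_32F, 0, 1, ksize=3)
        mags.append(np.hypot(gx, gy))
        oris.append(np.arctan2(gy, gx))
    mags, oris = np.stack(mags), np.stack(oris)
    best = np.argmax(mags, axis=0)          # channel of max gradient norm
    rows, cols = np.indices(best.shape)
    return oris[best, rows, cols], mags[best, rows, cols]
```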
3. Gradient direction spreading

Because the input image may undergo small translations or rotations, the gradient directions of the rigid object extracted during online matching may fail to match the gradient directions extracted offline, which degrades the identification of the rigid object. To improve robustness to small movements or rotations of the input image during online matching, the contour gradients of the rigid object extracted in the offline stage are spread in each gradient direction: the gradient direction at a position x on the object surface is propagated to the 3 × 3 pixel neighborhood centered at that position, so that every cell of the neighborhood also represents that direction.
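A minimal sketch of this spreading step, assuming the gradient directions have already been quantized into a small number of bins and encoded as a per-pixel bitmask (as in the original LINE-MOD), ORs each pixel's orientation bit into its 3 × 3 neighborhood:

```python
# Spread quantized orientations into a (2*radius+1)^2 neighborhood by OR-ing
# shifted copies of the bitmask image; radius=1 gives the 3x3 case above.
import numpy as np

def spread_orientations(quantized: np.ndarray, radius: int = 1) -> np.ndarray:
    h, w = quantized.shape
    spread = np.zeros_like(quantized)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.zeros_like(quantized)
            ys, ye = max(dy, 0), h + min(dy, 0)
            xs, xe = max(dx, 0), w + min(dx, 0)
            shifted[ys:ye, xs:xe] = quantized[ys - dy:ye - dy, xs - dx:xe - dx]
            spread |= shifted                # accumulate neighbor orientations
    return spread
```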
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (7)

1. A rigid object identification method and device based on improved LINE-MOD template matching are characterized by comprising the following steps:
step 1, image preprocessing: acquiring a rigid object image on line, and carrying out smooth denoising and sharpening on the rigid object image by using Gaussian filtering and a Laplace operator in sequence to remove noise of the rigid object image and keep edge information of the image;
step 2, rigid object feature offline extraction: in the off-line stage, a rigid object model is trained in a CAD environment, image acquisition under multiple viewing angles is carried out on the rigid object model and is used as a reference image, and gradient direction descriptors are extracted on the obtained reference image and are stored in an XML file; providing an image template for the online identification of the rigid object in the step 3;
step 3, rigid object online identification: acquiring video frames of the rigid object in a real scene in the online stage, preprocessing each frame as in step 1 and extracting its image gradient direction descriptors, and matching the descriptors extracted online with the gradient direction descriptors extracted in the offline stage by the improved LINE-MOD template matching method.
2. The rigid object identification method and device based on improved LINE-MOD template matching according to claim 1, wherein step 1 performs smoothing denoising and then sharpening on the rigid object image using Gaussian filtering and the Laplacian operator in turn, specifically:

assuming that the input image is f(x, y) and the image obtained after Gaussian filtering is g(x, y), the transformation between the two is

$$g(x, y) = f(x, y) \otimes h(x, y)$$

where $\otimes$ denotes image convolution, x and y are the x and y coordinates in the image coordinate system, and h(x, y) is a Gaussian template that can be written as

$$h(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

where σ is the standard deviation of the Gaussian function and describes the spread of the data: a larger σ gives a stronger smoothing effect on the image, while a smaller σ better preserves the image edge information. The Gaussian-filtered image g(x, y) has some noise removed but appears blurred relative to the input image f(x, y), the degree of blurring depending on σ. After obtaining g(x, y), it is further processed with the Laplace operator:

$$\nabla^{2} g(x, y) = \nabla^{2}\left[f(x, y) \otimes h(x, y)\right] = f(x, y) \otimes \nabla^{2} h(x, y)$$

where $\nabla^{2} g(x, y)$ is the image after Laplacian processing, h(x, y) is the Gaussian template, f(x, y) is the input image, and $\nabla^{2} h(x, y)$ is the second derivative of the Gaussian operator.
3. The rigid object identification method and device based on improved LINE-MOD template matching according to claim 1, wherein the specific process of multi-view rigid object image acquisition in the step 2 is as follows:
a virtual camera in the CAD environment acquires images of the CAD model of the rigid object, capturing projection images of the model from different viewing angles, each containing a different shape of the rigid object's contour; to distribute the sampling viewpoints uniformly and avoid sampling poles on the hemisphere surface, sampling is performed on the surface of a regular icosahedron, each face of which is divided evenly into four parts and iteratively subdivided; each face of the regular icosahedron is iterated only 2 times, forming 16 equilateral triangles per face; in the CAD environment, the CAD model of the rigid object is placed at the center of the regular icosahedron, with the optical axis of the virtual camera always passing through the center; shape feature descriptors of the rigid object contour are extracted in all views, and the shape feature descriptors of all rigid object contours together with the corresponding pose parameters are stored in an XML file.
4. The rigid object identification method and device based on improved LINE-MOD template matching according to claim 3, wherein the extraction of gradient direction descriptors from the acquired reference images in step 2 specifically comprises:

for each sampled image, computing the gradient direction on the three color channels {R, G, B} at each position of the rigid object contour, and taking the direction of the channel with the largest gradient magnitude as the gradient direction at that point; for an input image I, the gradient direction at position x is

$$I_g(x) = \operatorname{ori}\big(\hat{C}(x)\big), \qquad \hat{C}(x) = \operatorname*{arg\,max}_{C \in \{R, G, B\}} \left\| \frac{\partial C}{\partial x} \right\|$$

where C ∈ {R, G, B} ranges over the red, green and blue channels of view I at position x, Ĉ(x) is the channel with the largest gradient magnitude at x, I_g(x) is the gradient direction at position x in view I, and ori() denotes the image gradient direction.
5. The rigid object identification method and device based on improved LINE-MOD template matching according to claim 4, wherein step 3 matches the gradient direction descriptors of the image extracted online with the gradient direction descriptors extracted in the offline stage by the improved LINE-MOD template matching method; the similarity measure is

$$\varepsilon(I, \mathcal{T}, c) = \sum_{r \in \mathcal{P}} \left( \max_{t \in \mathcal{R}(c + r)} \big| \cos\big(\operatorname{ori}(O, r) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c is the common position in the input image and the template image, ori(O, r) is the gradient direction in radians at position r of the reference image O, R(c + r) is the neighborhood of size τ centered at c + r, ori(I, t) is the gradient direction in radians at point t of the input image I, and P is the list of positions r.
6. The rigid object identification method and device based on improved LINE-MOD template matching according to claim 5, wherein, in order to complete the identification of the target object at any scale, depth information is introduced into the similarity measure to give it scale invariance:

$$\varepsilon(I, \mathcal{T}, c'_I) = \sum_{r' \in \mathcal{P}} \left( \max_{t \in \mathcal{R}\left(c'_I + r' \frac{D'_o}{D(c'_I)}\right)} \big| \cos\big(\operatorname{ori}\big(O, \mathcal{S}_o(c'_o, r')\big) - \operatorname{ori}(I, t)\big) \big| \right)$$

where ε is the similarity between the target object to be recognized and the target object training template obtained in the offline training stage, I is the input image, T = (O, P) is the template of the target object, c'_I is the common position in the input image and the template image, O is the reference image, S_o(c'_o, r') maps the template offset r' to the corresponding position of the reference image O, ori(O, S_o(c'_o, r')) is the gradient direction at that position of the reference image O, ori(I, t) is the gradient direction at point t of the input image I, P is the list of positions r', D(c'_I) is the depth value at point c'_I, D'_o is the distance from the camera of reference image O to the center of the regular polyhedron in offline training, and R(c'_I + r' D'_o/D(c'_I)) is the neighborhood of size τ centered at c'_I + r' D'_o/D(c'_I).
7. A rigid object identification device based on improved LINE-MOD template matching, characterized by comprising the following modules:
an image preprocessing module: acquiring a rigid object image on line, and carrying out smooth denoising and sharpening on the rigid object image by using Gaussian filtering and a Laplace operator in sequence to remove noise of the rigid object image and keep edge information of the image;
rigid object feature offline extraction module: in the off-line stage, a rigid object model is trained in a CAD environment, image acquisition under multiple viewing angles is carried out on the rigid object model and is used as a reference image, and gradient direction descriptors are extracted on the obtained reference image and are stored in an XML file; providing an image template for online identification of the rigid object in the rigid object online identification module;
rigid object online identification module: acquires video frames of the rigid object in a real scene in the online stage, performs image preprocessing and image gradient direction descriptor extraction on each frame, and matches the descriptors extracted online with those extracted in the offline stage by the improved LINE-MOD template matching method; the main improvement is that depth information is introduced into the similarity measure, so that rigid object identification can be completed at any scale.
CN201910842182.XA 2019-09-06 2019-09-06 Rigid object identification method and device based on improved LINE-MOD template matching Pending CN110647925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910842182.XA CN110647925A (en) 2019-09-06 2019-09-06 Rigid object identification method and device based on improved LINE-MOD template matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910842182.XA CN110647925A (en) 2019-09-06 2019-09-06 Rigid object identification method and device based on improved LINE-MOD template matching

Publications (1)

Publication Number Publication Date
CN110647925A (en) 2020-01-03

Family

ID=68991657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910842182.XA Pending CN110647925A (en) 2019-09-06 2019-09-06 Rigid object identification method and device based on improved LINE-MOD template matching

Country Status (1)

Country Link
CN (1) CN110647925A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6606406B1 (en) * 2000-05-04 2003-08-12 Microsoft Corporation System and method for progressive stereo matching of digital images
US8411080B1 (en) * 2008-06-26 2013-04-02 Disney Enterprises, Inc. Apparatus and method for editing three dimensional objects
CN107942949A (en) * 2017-03-31 2018-04-20 沈机(上海)智能系统研发设计有限公司 A kind of lathe vision positioning method and system, lathe
CN109636854A (en) * 2018-12-18 2019-04-16 重庆邮电大学 A kind of augmented reality three-dimensional Tracing Registration method based on LINE-MOD template matching
CN109887030A (en) * 2019-01-23 2019-06-14 浙江大学 Texture-free metal parts image position and posture detection method based on the sparse template of CAD

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang Yue et al., "Model-based markerless 3D registration and tracking method for augmented reality," Journal of Shanghai Jiao Tong University *
Hao Ming, "Application of machine vision in robotic sorting of cluttered workpieces," China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology Series *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738320A (en) * 2020-03-04 2020-10-02 沈阳工业大学 Shielded workpiece identification method based on template matching
CN111738320B (en) * 2020-03-04 2022-12-06 沈阳工业大学 Shielded workpiece identification method based on template matching
CN111814580A (en) * 2020-06-15 2020-10-23 南京澳讯人工智能研究院有限公司 Target effective identification device and method

Similar Documents

Publication Publication Date Title
CN109903313B (en) Real-time pose tracking method based on target three-dimensional model
JP6216508B2 (en) Method for recognition and pose determination of 3D objects in 3D scenes
CN110689573B (en) Edge model-based augmented reality label-free tracking registration method and device
CN111445459B (en) Image defect detection method and system based on depth twin network
CN110837768A (en) Rare animal protection oriented online detection and identification method
CN109272577B (en) Kinect-based visual SLAM method
CN110021029B (en) Real-time dynamic registration method and storage medium suitable for RGBD-SLAM
CN110222661B (en) Feature extraction method for moving target identification and tracking
CN114743259A (en) Pose estimation method, pose estimation system, terminal, storage medium and application
Chiverton et al. Automatic bootstrapping and tracking of object contours
Yu et al. Improvement of face recognition algorithm based on neural network
CN105678778A (en) Image matching method and device
CN110647925A (en) Rigid object identification method and device based on improved LINE-MOD template matching
Huang et al. Tracking-by-detection of 3d human shapes: from surfaces to volumes
CN110472651B (en) Target matching and positioning method based on edge point local characteristic value
CN105139013A (en) Object recognition method integrating shape features and interest points
CN114267061A (en) Head gesture recognition method, device, equipment and computer storage medium
CN109872343B (en) Weak texture object posture tracking method, system and device
CN110516638B (en) Sign language recognition method based on track and random forest
CN116416305B (en) Multi-instance pose estimation method based on optimized sampling five-dimensional point pair characteristics
CN117132630A (en) Point cloud registration method based on second-order spatial compatibility measurement
CN107330436B (en) Scale criterion-based panoramic image SIFT optimization method
Dai et al. An Improved ORB Feature Extraction Algorithm Based on Enhanced Image and Truncated Adaptive Threshold
He et al. Recent advance on mean shift tracking: A survey
Mahmood et al. Nose tip detection using shape index and energy effective for 3d face recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200103)