CN108154066B - Three-dimensional target identification method based on curvature characteristic recurrent neural network - Google Patents
- Publication number
- CN108154066B CN108154066B CN201611096314.1A CN201611096314A CN108154066B CN 108154066 B CN108154066 B CN 108154066B CN 201611096314 A CN201611096314 A CN 201611096314A CN 108154066 B CN108154066 B CN 108154066B
- Authority
- CN
- China
- Prior art keywords
- dimensional
- curvature
- sequence
- target
- brnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/647—Three-dimensional objects by matching two-dimensional images to three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention relates to image recognition technology and provides a three-dimensional target recognition method based on a curvature-feature recurrent neural network, in order to characterise a three-dimensional target effectively under different viewing angles and to address the problem of image noise during three-dimensional target recognition. First, the method obtains the joint curvature of the target three-dimensional model by calculating its local mean Gaussian curvature and mean curvature, forms a curvature sketch of the three-dimensional model by extracting the local maxima of the joint curvature, and generates a 360° two-dimensional image sequence by perspective projection transformation as the input for training the recurrent neural network. Second, a bidirectional recurrent neural network (BRNN) is used to learn features of the multi-view sequence of the three-dimensional model, and the recognition category with the maximum probability is obtained with the softmax function at the softmax layer. The method automatically extracts the common features of the three-dimensional target and its two-dimensional images, and maintains good robustness and a high target recognition rate in the presence of image noise.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a three-dimensional target recognition method based on a curvature characteristic recurrent neural network.
Background
Three-dimensional target recognition refers to the process of automatically detecting, locating and recognising a specified target pattern in any given two-dimensional image scene, and is one of the key problems of computer vision research. With the continuous development of computer vision technology, three-dimensional target recognition is applied ever more widely in fields such as industrial inspection, augmented reality and medical imaging. However, owing to factors such as illumination change, image noise and target occlusion, it is difficult to extract the common features of a three-dimensional target and its two-dimensional images under different viewing angles, which makes recognition an urgent problem to be solved.
The key to three-dimensional target recognition is to find a two-dimensional representation of the three-dimensional target model and to extract the common features of the three-dimensional target and the two-dimensional image. Existing three-dimensional target recognition methods mainly comprise methods based on manual marker points, methods based on geometric features, and methods based on deep learning. Methods based on manual marker points require the feature points in the two-dimensional image to be initialised by hand and, because of this manual interaction, are not repeatable. Methods based on geometric features achieve target recognition by extracting information such as the centre-line skeleton and contour shape of the target, but recognise poorly when the image contains noise. Methods based on deep learning use a deep neural network to fuse low-level image features into high-level features carrying semantic information, and can effectively handle the image-noise problem of the two-dimensional image during three-dimensional target recognition. It is therefore desirable to provide an automatic three-dimensional target recognition method that is robust to image noise in images taken from different viewing angles.
Disclosure of Invention
The invention aims to characterise the features of a three-dimensional target under different viewing angles more effectively, to reduce the sensitivity of the feature-extraction process to image noise, and to improve the recognition accuracy of three-dimensional targets.
The technical scheme adopted by the invention to achieve this purpose is as follows: a three-dimensional target recognition method based on a curvature-feature recurrent neural network, comprising the following steps:
step 1: calculating the joint curvature K of the target three-dimensional model; extracting the local maxima of K to form the curvature sketch R_Sketch of the three-dimensional model; then generating from the curvature sketch R_Sketch a 360° two-dimensional image sequence P_m by perspective projection transformation, wherein m = 1, 2, ..., 360;
step 2: inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles; obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum; the BRNN is a bidirectional recurrent neural network.
Calculating the joint curvature K of the target three-dimensional model comprises the following steps:
Let n be the normal vector at a given point (x, y, z) on the target three-dimensional model R. Let p = ∂z/∂x and q = ∂z/∂y; then p_x, p_y, q_x, q_y are defined as
p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y.
Calculate, in the 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model R, the average Gaussian curvature G̅_K and the average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix, trace(·) is the trace of a matrix, and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood.
The joint curvature K of the target three-dimensional model R is then defined as a combination of G̅_K and M̅_K.
the method comprises the following steps of inputting a 360-degree two-dimensional image into the BRNN, and utilizing multi-angle features to learn and calculate the sequence attributes of the image under multiple visual angles, wherein the method comprises the following steps:
one-dimensional characteristic sequence T for acquiring 360-degree two-dimensional imageSS 1,2, 360, then the signature sequence TSOutput at the i-th layer of BRNN is divided into forward outputAnd reverse outputAnd respectively output with a sequence on the BRNN of the local layer in the forward directionReverse output of BRNN next sequence at this layerAnd the forward output of the upper layer BRNNAnd reverse outputThe following relationships exist:
wherein the content of the first and second substances,b is a bias, and tanh is a neuron activation function;
then the characteristic sequence TSTotal output O at BRNNsI.e. input I of full connection level fcfcComprises the following steps:
wherein the content of the first and second substances,respectively is the connection weight of the forward output and the reverse output on the full connection layer;
thus, the signature sequence TSThe cumulative output at full connection level fc isI.e. the sequence property.
Obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum comprises the following steps:
Calculate at the softmax layer, with the softmax function, the probability p(C_k) that the recognition result is the k-th class:
p(C_k) = exp(A_k) / Σ_{c=1..C} exp(A_c),
where C is the total number of recognition categories and A_k is the cumulative output of the sequence attribute of the k-th three-dimensional target at the fully connected layer fc.
Then use maximum-likelihood estimation to minimise the loss function, i.e. to find the recognition category k with maximum probability p(C_k):
L = −Σ_{k=1..C} δ(k − r) · ln p(C_k), k* = argmax_k p(C_k),
where δ(·) is the Kronecker delta and r is the correct category of the feature sequence T_S.
The invention has the following beneficial effects and advantages:
1. The joint-curvature-sketch feature extraction method designed by the invention automatically extracts the common features of the three-dimensional model and the two-dimensional image, and the local mean Gaussian curvature and local mean curvature used in the joint curvature effectively mitigate the problem of image noise.
2. The invention designs a multi-angle feature-learning bidirectional recurrent neural network that considers the feature sequences of the three-dimensional model under multiple angles simultaneously, and can accurately recognise the three-dimensional target in a two-dimensional image taken at any angle.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of a multi-angle feature learning bi-directional recurrent neural network framework in the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
The invention is mainly divided into two parts. Fig. 1 shows the flow chart of the method; the specific implementation process is as follows.
Step 1: calculating the joint curvature of the target three-dimensional model, forming a curvature sketch of the three-dimensional model by extracting local maximum values of the joint curvature, and generating a 360-degree two-dimensional image by utilizing transmission projection transformation as input of a training recurrent neural network;
Step 1.1: Let n be the normal vector at a given point (x, y, z) on the three-dimensional model. Let p = ∂z/∂x and q = ∂z/∂y; then p_x, p_y, q_x, q_y are defined as p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y. The Gaussian curvature G_K of the three-dimensional model is
G_K = |C|,
where the curvature matrix C = [p_x, p_y; q_x, q_y]. The mean curvature M_K of the three-dimensional model is M_K = trace(C)/2, where trace(·) is the trace of a matrix. To eliminate the influence of noise, the invention calculates, in a 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model, the average Gaussian curvature G̅_K and average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood. Thus, the joint curvature K of the three-dimensional model is defined as a combination of G̅_K and M̅_K.
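The per-point computation of Step 1.1 can be sketched for a depth grid z(x, y) as follows. This is an illustrative sketch, not the patent's implementation: the derivatives are taken with central differences, the 3 × 3 averaging mirrors the noise-suppression step, and the final combination of the averaged Gaussian and mean curvatures (here the magnitude sqrt(G̅² + M̅²)) is an assumption, since the exact joint-curvature formula is not reproduced in this text.

```python
import math

def central_diff(f, axis):
    """Central differences of a 2-D grid f (list of lists) along axis 0 or 1,
    with border values clamped."""
    h, w = len(f), len(f[0])
    d = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if axis == 0:
                d[i][j] = (f[min(i + 1, h - 1)][j] - f[max(i - 1, 0)][j]) / 2.0
            else:
                d[i][j] = (f[i][min(j + 1, w - 1)] - f[i][max(j - 1, 0)]) / 2.0
    return d

def mean3x3(f):
    """Average of each entry over its 3x3 neighborhood (border-clamped)."""
    h, w = len(f), len(f[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [f[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = sum(vals) / 9.0
    return out

def joint_curvature(z):
    """Per-point joint curvature of a depth grid z, following the patent's
    curvature matrix C = [p_x, p_y; q_x, q_y]: G = |C|, M = trace(C)/2,
    with each second derivative averaged over a 3x3 neighborhood."""
    p = central_diff(z, 0)                       # p = dz/dx
    q = central_diff(z, 1)                       # q = dz/dy
    px, py = central_diff(p, 0), central_diff(p, 1)
    qx, qy = central_diff(q, 0), central_diff(q, 1)
    px, py, qx, qy = map(mean3x3, (px, py, qx, qy))
    h, w = len(z), len(z[0])
    K = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            G = px[i][j] * qy[i][j] - py[i][j] * qx[i][j]  # averaged Gaussian curvature
            M = (px[i][j] + qy[i][j]) / 2.0                # averaged mean curvature
            K[i][j] = math.sqrt(G * G + M * M)             # combination rule: our assumption
    return K
```

On a flat grid both averaged curvatures vanish, so the joint curvature is zero everywhere; on a bowl-shaped grid it is positive at the centre.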
Step 1.2: extracting combined curvaturesThe local maximum points form a curvature sketch R of the three-dimensional model RSketch. Generating a three-dimensional curvature sketch R through perspective projection transformationSketch360 deg. two-dimensional projection image PmM =1, 2.., 360, as an input to the BRNN.
Step 2: the invention adopts a Deep Recurrent Neural Network (DRNN) as a curvature characteristic identification method, and a DRNN frame is shown as figure 2. And (3) utilizing multi-angle feature learning BRNN to depict sequence attributes of the three-dimensional model under multiple visual angles, and utilizing a softmax function to obtain the identification category with the maximum correct probability in a softmax layer.
Step 2.1: in order to depict the characteristic sequence of the three-dimensional model under different visual angles, the one-dimensional characteristic sequence of the three-dimensional model under multiple visual angles is defined as TSS 1,2, 360, then the signature sequence TSOutput at the i-th layer of BRNN is divided into forward outputAnd reverse outputRespectively output with a sequence on the BRNN of the layerReverse output of BRNN next sequence at this layerAnd the forward output of the upper layer BRNNAnd reverse outputThe following relationships exist:
whereinB is bias, and tanh is neuron activation function; then the characteristic sequence TSTotal output O at BRNNsI.e. input I of full connection level fcfcIs composed of
Wherein the content of the first and second substances,the connection weights of the forward output and the reverse output at the full connection layer are respectively.
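The forward and reverse recurrences of Step 2.1 can be sketched as a single bidirectional layer over a feature sequence. This is a minimal pure-Python sketch under assumed shapes (one weight matrix W for the recurrent state, one matrix U for the layer input, and a bias b per direction); the patent's layer sizes and training procedure are not specified here.

```python
import math

def matvec(W, x):
    """Matrix-vector product for list-of-lists W and vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def vadd(*vs):
    """Element-wise sum of vectors."""
    return [sum(t) for t in zip(*vs)]

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def brnn_layer(inputs, Wf, Uf, bf, Wb, Ub, bb):
    """One bidirectional layer: the forward output at step s depends on the
    forward output at s-1 and the input at s; the reverse output at s depends
    on the reverse output at s+1 and the input at s. Returns the list of
    (h_f(s), h_b(s)) pairs, one per sequence element."""
    S, H = len(inputs), len(bf)
    hf = [[0.0] * H for _ in range(S)]
    hb = [[0.0] * H for _ in range(S)]
    prev = [0.0] * H
    for s in range(S):                  # forward sweep, s = 1..S
        prev = tanh_vec(vadd(matvec(Wf, prev), matvec(Uf, inputs[s]), bf))
        hf[s] = prev
    nxt = [0.0] * H
    for s in range(S - 1, -1, -1):      # reverse sweep, s = S..1
        nxt = tanh_vec(vadd(matvec(Wb, nxt), matvec(Ub, inputs[s]), bb))
        hb[s] = nxt
    return list(zip(hf, hb))
```

Because tanh is the activation, every output component stays inside (−1, 1); the forward output at the first step reduces to tanh of the first input's projection.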
Step 2.2: characteristic sequence TSThe cumulative output at full connection level fc isI.e. the sequence property. Calculating the correct probability p (C) of the recognition result being the kth class by utilizing a softmax function at a softmax layerk)
Wherein C is the total number of identification classes, AkAnd outputting the result of the sequence attribute of the kth three-dimensional target in the full connection layer fc. Then, the maximum likelihood estimation method is used to obtain the minimum value of the loss function, i.e. the correct probability p (C)k) Maximum recognition category k:
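The softmax probability and class decision of Step 2.2 can be sketched as follows. The max-subtraction for numerical stability is a standard implementation detail added here, not part of the patent text; with the Kronecker-delta loss, minimising L reduces to maximising the log-probability of the true class r.

```python
import math

def softmax(A):
    """p(C_k) = exp(A_k) / sum_c exp(A_c), with the max subtracted first
    for numerical stability (does not change the result)."""
    m = max(A)
    e = [math.exp(a - m) for a in A]
    Z = sum(e)
    return [x / Z for x in e]

def predict_and_loss(A, r):
    """Return the predicted class k* = argmax_k p(C_k) and the loss
    L = -ln p(C_r), i.e. the Kronecker-delta cross-entropy for true class r."""
    p = softmax(A)
    k_star = max(range(len(p)), key=lambda k: p[k])
    loss = -math.log(p[r])
    return k_star, loss
```

When the accumulated fc output for the true class dominates, the predicted class matches r and the loss is small but strictly positive.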
Claims (3)
1. A three-dimensional target recognition method based on a curvature-feature recurrent neural network, characterised by comprising the following steps:
step 1: calculating the joint curvature K of the target three-dimensional model; extracting the local maxima of K to form the curvature sketch R_Sketch of the three-dimensional model; then generating from the curvature sketch R_Sketch a 360° two-dimensional image sequence P_m by perspective projection transformation, wherein m = 1, 2, ..., 360;
step 2: inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles; obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum; the BRNN being a bidirectional recurrent neural network;
wherein calculating the joint curvature K of the target three-dimensional model comprises the following steps:
letting n be the normal vector at a given point (x, y, z) on the target three-dimensional model R, letting p = ∂z/∂x and q = ∂z/∂y, and defining p_x, p_y, q_x, q_y as
p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y;
calculating, in the 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model R, the average Gaussian curvature G̅_K and the average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix, trace(·) is the trace of a matrix, and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood;
and defining the joint curvature K of the target three-dimensional model R as a combination of G̅_K and M̅_K.
2. The three-dimensional target recognition method based on a curvature-feature recurrent neural network according to claim 1, wherein inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles comprises the following steps:
obtaining the one-dimensional feature sequence T_S, S = 1, 2, ..., 360, of the 360° two-dimensional images; dividing the output of the feature sequence T_S at the i-th layer of the BRNN into a forward output h_i^f(S) and a reverse output h_i^b(S), which are related, respectively, to the forward output h_i^f(S−1) of the previous element of the sequence at this layer, the reverse output h_i^b(S+1) of the next element at this layer, and the forward output h_{i−1}^f(S) and reverse output h_{i−1}^b(S) of the (i−1)-th layer:
h_i^f(S) = tanh(W_i^f · h_i^f(S−1) + U_i^f · [h_{i−1}^f(S), h_{i−1}^b(S)] + b_i^f),
h_i^b(S) = tanh(W_i^b · h_i^b(S+1) + U_i^b · [h_{i−1}^f(S), h_{i−1}^b(S)] + b_i^b),
where W_i^f, U_i^f, W_i^b, U_i^b are connection weights, b_i^f and b_i^b are biases, and tanh is the neuron activation function;
the total output O_S of the feature sequence T_S at the BRNN, i.e. the input I_fc of the fully connected layer fc, being
O_S = I_fc = W_o^f · h^f(S) + W_o^b · h^b(S),
where W_o^f and W_o^b are the connection weights of the forward and reverse outputs at the fully connected layer.
3. The three-dimensional target recognition method based on a curvature-feature recurrent neural network according to claim 1, wherein obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum comprises the following steps:
calculating at the softmax layer, with the softmax function, the probability p(C_k) that the recognition result is the k-th class:
p(C_k) = exp(A_k) / Σ_{c=1..C} exp(A_c),
where C is the total number of recognition categories and A_k is the cumulative output of the sequence attribute of the k-th three-dimensional target at the fully connected layer fc;
and using maximum-likelihood estimation to minimise the loss function, i.e. to find the recognition category k with maximum probability p(C_k):
L = −Σ_{k=1..C} δ(k − r) · ln p(C_k), k* = argmax_k p(C_k),
where δ(·) is the Kronecker delta and r is the correct category of the feature sequence T_S.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611096314.1A CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611096314.1A CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108154066A CN108154066A (en) | 2018-06-12 |
CN108154066B true CN108154066B (en) | 2021-04-27 |
Family
ID=62470160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611096314.1A Active CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108154066B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109166183B (en) * | 2018-07-16 | 2023-04-07 | 中南大学 | Anatomical landmark point identification method and identification equipment |
CN109496316B (en) * | 2018-07-28 | 2022-04-01 | 合刃科技(深圳)有限公司 | Image recognition system |
CN109242955B (en) * | 2018-08-17 | 2023-03-24 | 山东师范大学 | Workpiece manufacturing characteristic automatic identification method and device based on single image |
CN109493354B (en) * | 2018-10-10 | 2021-08-06 | 中国科学院上海技术物理研究所 | Target two-dimensional geometric shape reconstruction method based on multi-view images |
CN110287783A (en) * | 2019-05-18 | 2019-09-27 | 天嗣智能信息科技(上海)有限公司 | A kind of video monitoring image human figure identification method |
CN117315397B (en) * | 2023-10-11 | 2024-06-07 | 电子科技大学 | Classification method for noise data containing labels based on class curvature |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770566A (en) * | 2008-12-30 | 2010-07-07 | 复旦大学 | Quick three-dimensional human ear identification method |
CN104166842A (en) * | 2014-07-25 | 2014-11-26 | 同济大学 | Three-dimensional palm print identification method based on partitioning statistical characteristic and combined expression |
CN104463111A (en) * | 2014-11-21 | 2015-03-25 | 天津工业大学 | Three-dimensional face recognition method fused with multi-scale feature region curvatures |
CN105205478A (en) * | 2015-10-23 | 2015-12-30 | 天津工业大学 | 3-dimensional human face recognition method integrating anthropometry and curvelet transform |
KR101592294B1 (en) * | 2014-09-03 | 2016-02-05 | 배재대학교 산학협력단 | Decimation Method For Complex Three Dimensional Polygonal Mesh Data |
CN106097431A (en) * | 2016-05-09 | 2016-11-09 | 王红军 | A kind of object global recognition method based on 3 d grid map |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9984473B2 (en) * | 2014-07-09 | 2018-05-29 | Nant Holdings Ip, Llc | Feature trackability ranking, systems and methods |
US10019784B2 (en) * | 2015-03-18 | 2018-07-10 | Toshiba Medical Systems Corporation | Medical image processing apparatus and method |
- 2016-12-02: CN201611096314.1A filed; patent CN108154066B granted and active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770566A (en) * | 2008-12-30 | 2010-07-07 | 复旦大学 | Quick three-dimensional human ear identification method |
CN104166842A (en) * | 2014-07-25 | 2014-11-26 | 同济大学 | Three-dimensional palm print identification method based on partitioning statistical characteristic and combined expression |
KR101592294B1 (en) * | 2014-09-03 | 2016-02-05 | 배재대학교 산학협력단 | Decimation Method For Complex Three Dimensional Polygonal Mesh Data |
CN104463111A (en) * | 2014-11-21 | 2015-03-25 | 天津工业大学 | Three-dimensional face recognition method fused with multi-scale feature region curvatures |
CN105205478A (en) * | 2015-10-23 | 2015-12-30 | 天津工业大学 | 3-dimensional human face recognition method integrating anthropometry and curvelet transform |
CN106097431A (en) * | 2016-05-09 | 2016-11-09 | 王红军 | A kind of object global recognition method based on 3 d grid map |
Non-Patent Citations (2)
Title |
---|
"Study on novel Curvature Features for 3D fingerprint recognition";Feng Liu 等;《Neurocomputing》;20151130;第168卷;第599-608页 * |
"基于模型的任意视点下三维目标识别研究";许俊峰;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160115(第01期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN108154066A (en) | 2018-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108154066B (en) | Three-dimensional target identification method based on curvature characteristic recurrent neural network | |
CN110866953B (en) | Map construction method and device, and positioning method and device | |
CN108898063B (en) | Human body posture recognition device and method based on full convolution neural network | |
CN108537191B (en) | Three-dimensional face recognition method based on structured light camera | |
Wei et al. | Applications of structure from motion: a survey | |
CN107424161B (en) | Coarse-to-fine indoor scene image layout estimation method | |
Tau et al. | Dense correspondences across scenes and scales | |
CN108564120B (en) | Feature point extraction method based on deep neural network | |
CN104077760A (en) | Rapid splicing system for aerial photogrammetry and implementing method thereof | |
CN105740775A (en) | Three-dimensional face living body recognition method and device | |
CN113361542B (en) | Local feature extraction method based on deep learning | |
CN106919944A (en) | A kind of wide-angle image method for quickly identifying based on ORB algorithms | |
CN111832484A (en) | Loop detection method based on convolution perception hash algorithm | |
CN110751097B (en) | Semi-supervised three-dimensional point cloud gesture key point detection method | |
CN105513094A (en) | Stereo vision tracking method and stereo vision tracking system based on 3D Delaunay triangulation | |
CN114998566A (en) | Interpretable multi-scale infrared small and weak target detection network design method | |
CN110120013A (en) | A kind of cloud method and device | |
CN116385660A (en) | Indoor single view scene semantic reconstruction method and system | |
He et al. | A generative feature-to-image robotic vision framework for 6D pose measurement of metal parts | |
Feng | Mask RCNN-based single shot multibox detector for gesture recognition in physical education | |
CN107330363A (en) | A kind of quick Internet advertising board detection method | |
Li et al. | Sparse-to-local-dense matching for geometry-guided correspondence estimation | |
CN104361573B (en) | The SIFT feature matching algorithm of Fusion of Color information and global information | |
Konishi et al. | Detection of target persons using deep learning and training data generation for Tsukuba challenge | |
CN117351078A (en) | Target size and 6D gesture estimation method based on shape priori |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CB03 | Change of inventor or designer information |
Inventors after change: Liang Wei, Li Yang, Zheng Meng, Peng Shiwei
Inventors before change: Liang Wei, Li Yang, Zheng Meng, Tan Jindong, Peng Shiwei