CN108154066B - Three-dimensional target identification method based on curvature characteristic recurrent neural network - Google Patents
- Publication number
- CN108154066B CN108154066B CN201611096314.1A CN201611096314A CN108154066B CN 108154066 B CN108154066 B CN 108154066B CN 201611096314 A CN201611096314 A CN 201611096314A CN 108154066 B CN108154066 B CN 108154066B
- Authority
- CN
- China
- Prior art keywords
- dimensional
- curvature
- sequence
- target
- brnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/647—Three-dimensional objects by matching two-dimensional images to three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention relates to image recognition technology and provides a three-dimensional target recognition method based on a curvature-feature recurrent neural network, in order to characterise a three-dimensional target effectively under different viewing angles and to address the problem of image noise during three-dimensional target recognition. First, the method obtains the joint curvature of the target three-dimensional model by calculating its local mean Gaussian curvature and mean curvature, forms a curvature sketch of the three-dimensional model by extracting the local maxima of the joint curvature, and generates a 360° two-dimensional image sequence by perspective projection transformation as the input for training the recurrent neural network. Second, a bidirectional recurrent neural network (BRNN) is used to learn features of the multi-view sequence of the three-dimensional model, and the recognition category with the maximum probability is obtained with the softmax function at the softmax layer. The method automatically extracts the common features of the three-dimensional target and its two-dimensional images, and maintains good robustness and a high target recognition rate in the presence of image noise.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a three-dimensional target recognition method based on a curvature characteristic recurrent neural network.
Background
Three-dimensional target recognition refers to the process of automatically detecting, locating and recognising a specified target pattern in any given two-dimensional image scene, and is one of the key problems of computer vision research. With the continuous development of computer vision technology, three-dimensional target recognition is applied ever more widely in fields such as industrial inspection, augmented reality and medical imaging. However, owing to factors such as illumination change, image noise and target occlusion, it is difficult to extract the common features of a three-dimensional target and its two-dimensional images under different viewing angles, which makes recognition an urgent problem to be solved.
The key to three-dimensional target recognition is to find a two-dimensional representation of the three-dimensional target model and to extract the common features of the three-dimensional target and the two-dimensional image. Existing three-dimensional target recognition methods mainly comprise methods based on manual marker points, methods based on geometric features, and methods based on deep learning. Methods based on manual marker points require the feature points in the two-dimensional image to be initialised by hand and, because of this manual interaction, are not repeatable. Methods based on geometric features achieve target recognition by extracting information such as the centre-line skeleton and contour shape of the target, but recognise poorly when the image contains noise. Methods based on deep learning use a deep neural network to fuse low-level image features into high-level features carrying semantic information, and can effectively handle the image-noise problem of the two-dimensional image during three-dimensional target recognition. It is therefore desirable to provide an automatic three-dimensional target recognition method that is robust to image noise in images taken from different viewing angles.
Disclosure of Invention
The invention aims to characterise the features of a three-dimensional target under different viewing angles more effectively, to reduce the sensitivity of the feature-extraction process to image noise, and to improve the recognition accuracy of three-dimensional targets.
The technical scheme adopted by the invention to achieve this purpose is as follows: a three-dimensional target recognition method based on a curvature-feature recurrent neural network, comprising the following steps:
step 1: calculating the joint curvature K of the target three-dimensional model; extracting the local maxima of K to form the curvature sketch R_Sketch of the three-dimensional model; then generating from the curvature sketch R_Sketch a 360° two-dimensional image sequence P_m by perspective projection transformation, wherein m = 1, 2, ..., 360;
step 2: inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles; obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum; the BRNN is a bidirectional recurrent neural network.
Calculating the joint curvature K of the target three-dimensional model comprises the following steps:
Let n be the normal vector at a given point (x, y, z) on the target three-dimensional model R. Let p = ∂z/∂x and q = ∂z/∂y; then p_x, p_y, q_x, q_y are defined as
p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y.
Calculate, in the 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model R, the average Gaussian curvature G̅_K and the average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix, trace(·) is the trace of a matrix, and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood.
The joint curvature K of the target three-dimensional model R is then defined as a combination of G̅_K and M̅_K.
the method comprises the following steps of inputting a 360-degree two-dimensional image into the BRNN, and utilizing multi-angle features to learn and calculate the sequence attributes of the image under multiple visual angles, wherein the method comprises the following steps:
one-dimensional characteristic sequence T for acquiring 360-degree two-dimensional imageSS 1,2, 360, then the signature sequence TSOutput at the i-th layer of BRNN is divided into forward outputAnd reverse outputAnd respectively output with a sequence on the BRNN of the local layer in the forward directionReverse output of BRNN next sequence at this layerAnd the forward output of the upper layer BRNNAnd reverse outputThe following relationships exist:
wherein the content of the first and second substances,b is a bias, and tanh is a neuron activation function;
then the characteristic sequence TSTotal output O at BRNNsI.e. input I of full connection level fcfcComprises the following steps:
wherein the content of the first and second substances,respectively is the connection weight of the forward output and the reverse output on the full connection layer;
thus, the signature sequence TSThe cumulative output at full connection level fc isI.e. the sequence property.
Obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum comprises the following steps:
Calculate at the softmax layer, with the softmax function, the probability p(C_k) that the recognition result is the k-th class:
p(C_k) = exp(A_k) / Σ_{c=1..C} exp(A_c),
where C is the total number of recognition categories and A_k is the cumulative output of the sequence attribute of the k-th three-dimensional target at the fully connected layer fc.
Then use maximum-likelihood estimation to minimise the loss function, i.e. to find the recognition category k with maximum probability p(C_k):
L = −Σ_{k=1..C} δ(k − r) · ln p(C_k), k* = argmax_k p(C_k),
where δ(·) is the Kronecker delta and r is the correct category of the feature sequence T_S.
The invention has the following beneficial effects and advantages:
1. The joint-curvature-sketch feature extraction method designed by the invention automatically extracts the common features of the three-dimensional model and the two-dimensional image, and the local mean Gaussian curvature and local mean curvature used in the joint curvature effectively mitigate the problem of image noise.
2. The invention designs a multi-angle feature-learning bidirectional recurrent neural network that considers the feature sequences of the three-dimensional model under multiple angles simultaneously, and can accurately recognise the three-dimensional target in a two-dimensional image taken at any angle.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of a multi-angle feature learning bi-directional recurrent neural network framework in the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
The invention is mainly divided into two parts. Fig. 1 shows the flow chart of the method; the specific implementation process is as follows.
Step 1: calculating the joint curvature of the target three-dimensional model, forming a curvature sketch of the three-dimensional model by extracting local maximum values of the joint curvature, and generating a 360-degree two-dimensional image by utilizing transmission projection transformation as input of a training recurrent neural network;
Step 1.1: Let n be the normal vector at a given point (x, y, z) on the three-dimensional model. Let p = ∂z/∂x and q = ∂z/∂y; then p_x, p_y, q_x, q_y are defined as p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y. The Gaussian curvature G_K of the three-dimensional model is
G_K = |C|,
where the curvature matrix C = [p_x, p_y; q_x, q_y]. The mean curvature M_K of the three-dimensional model is M_K = trace(C)/2, where trace(·) is the trace of a matrix. To eliminate the influence of noise, the invention calculates, in a 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model, the average Gaussian curvature G̅_K and average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood. Thus, the joint curvature K of the three-dimensional model is defined as a combination of G̅_K and M̅_K.
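The per-point computation of Step 1.1 can be sketched for a depth grid z(x, y) as follows. This is an illustrative sketch, not the patent's implementation: the derivatives are taken with central differences, the 3 × 3 averaging mirrors the noise-suppression step, and the final combination of the averaged Gaussian and mean curvatures (here the magnitude sqrt(G̅² + M̅²)) is an assumption, since the exact joint-curvature formula is not reproduced in this text.

```python
import math

def central_diff(f, axis):
    """Central differences of a 2-D grid f (list of lists) along axis 0 or 1,
    with border values clamped."""
    h, w = len(f), len(f[0])
    d = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if axis == 0:
                d[i][j] = (f[min(i + 1, h - 1)][j] - f[max(i - 1, 0)][j]) / 2.0
            else:
                d[i][j] = (f[i][min(j + 1, w - 1)] - f[i][max(j - 1, 0)]) / 2.0
    return d

def mean3x3(f):
    """Average of each entry over its 3x3 neighborhood (border-clamped)."""
    h, w = len(f), len(f[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [f[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = sum(vals) / 9.0
    return out

def joint_curvature(z):
    """Per-point joint curvature of a depth grid z, following the patent's
    curvature matrix C = [p_x, p_y; q_x, q_y]: G = |C|, M = trace(C)/2,
    with each second derivative averaged over a 3x3 neighborhood."""
    p = central_diff(z, 0)                       # p = dz/dx
    q = central_diff(z, 1)                       # q = dz/dy
    px, py = central_diff(p, 0), central_diff(p, 1)
    qx, qy = central_diff(q, 0), central_diff(q, 1)
    px, py, qx, qy = map(mean3x3, (px, py, qx, qy))
    h, w = len(z), len(z[0])
    K = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            G = px[i][j] * qy[i][j] - py[i][j] * qx[i][j]  # averaged Gaussian curvature
            M = (px[i][j] + qy[i][j]) / 2.0                # averaged mean curvature
            K[i][j] = math.sqrt(G * G + M * M)             # combination rule: our assumption
    return K
```

On a flat grid both averaged curvatures vanish, so the joint curvature is zero everywhere; on a bowl-shaped grid it is positive at the centre.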
Step 1.2: extracting combined curvaturesThe local maximum points form a curvature sketch R of the three-dimensional model RSketch. Generating a three-dimensional curvature sketch R through perspective projection transformationSketch360 deg. two-dimensional projection image PmM =1, 2.., 360, as an input to the BRNN.
Step 2: the invention adopts a Deep Recurrent Neural Network (DRNN) as a curvature characteristic identification method, and a DRNN frame is shown as figure 2. And (3) utilizing multi-angle feature learning BRNN to depict sequence attributes of the three-dimensional model under multiple visual angles, and utilizing a softmax function to obtain the identification category with the maximum correct probability in a softmax layer.
Step 2.1: in order to depict the characteristic sequence of the three-dimensional model under different visual angles, the one-dimensional characteristic sequence of the three-dimensional model under multiple visual angles is defined as TSS 1,2, 360, then the signature sequence TSOutput at the i-th layer of BRNN is divided into forward outputAnd reverse outputRespectively output with a sequence on the BRNN of the layerReverse output of BRNN next sequence at this layerAnd the forward output of the upper layer BRNNAnd reverse outputThe following relationships exist:
whereinB is bias, and tanh is neuron activation function; then the characteristic sequence TSTotal output O at BRNNsI.e. input I of full connection level fcfcIs composed of
Wherein the content of the first and second substances,the connection weights of the forward output and the reverse output at the full connection layer are respectively.
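The forward and reverse recurrences of Step 2.1 can be sketched as a single bidirectional layer over a feature sequence. This is a minimal pure-Python sketch under assumed shapes (one weight matrix W for the recurrent state, one matrix U for the layer input, and a bias b per direction); the patent's layer sizes and training procedure are not specified here.

```python
import math

def matvec(W, x):
    """Matrix-vector product for list-of-lists W and vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def vadd(*vs):
    """Element-wise sum of vectors."""
    return [sum(t) for t in zip(*vs)]

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def brnn_layer(inputs, Wf, Uf, bf, Wb, Ub, bb):
    """One bidirectional layer: the forward output at step s depends on the
    forward output at s-1 and the input at s; the reverse output at s depends
    on the reverse output at s+1 and the input at s. Returns the list of
    (h_f(s), h_b(s)) pairs, one per sequence element."""
    S, H = len(inputs), len(bf)
    hf = [[0.0] * H for _ in range(S)]
    hb = [[0.0] * H for _ in range(S)]
    prev = [0.0] * H
    for s in range(S):                  # forward sweep, s = 1..S
        prev = tanh_vec(vadd(matvec(Wf, prev), matvec(Uf, inputs[s]), bf))
        hf[s] = prev
    nxt = [0.0] * H
    for s in range(S - 1, -1, -1):      # reverse sweep, s = S..1
        nxt = tanh_vec(vadd(matvec(Wb, nxt), matvec(Ub, inputs[s]), bb))
        hb[s] = nxt
    return list(zip(hf, hb))
```

Because tanh is the activation, every output component stays inside (−1, 1); the forward output at the first step reduces to tanh of the first input's projection.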
Step 2.2: characteristic sequence TSThe cumulative output at full connection level fc isI.e. the sequence property. Calculating the correct probability p (C) of the recognition result being the kth class by utilizing a softmax function at a softmax layerk)
Wherein C is the total number of identification classes, AkAnd outputting the result of the sequence attribute of the kth three-dimensional target in the full connection layer fc. Then, the maximum likelihood estimation method is used to obtain the minimum value of the loss function, i.e. the correct probability p (C)k) Maximum recognition category k:
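The softmax probability and class decision of Step 2.2 can be sketched as follows. The max-subtraction for numerical stability is a standard implementation detail added here, not part of the patent text; with the Kronecker-delta loss, minimising L reduces to maximising the log-probability of the true class r.

```python
import math

def softmax(A):
    """p(C_k) = exp(A_k) / sum_c exp(A_c), with the max subtracted first
    for numerical stability (does not change the result)."""
    m = max(A)
    e = [math.exp(a - m) for a in A]
    Z = sum(e)
    return [x / Z for x in e]

def predict_and_loss(A, r):
    """Return the predicted class k* = argmax_k p(C_k) and the loss
    L = -ln p(C_r), i.e. the Kronecker-delta cross-entropy for true class r."""
    p = softmax(A)
    k_star = max(range(len(p)), key=lambda k: p[k])
    loss = -math.log(p[r])
    return k_star, loss
```

When the accumulated fc output for the true class dominates, the predicted class matches r and the loss is small but strictly positive.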
Claims (3)
1. A three-dimensional target recognition method based on a curvature-feature recurrent neural network, characterised by comprising the following steps:
step 1: calculating the joint curvature K of the target three-dimensional model; extracting the local maxima of K to form the curvature sketch R_Sketch of the three-dimensional model; then generating from the curvature sketch R_Sketch a 360° two-dimensional image sequence P_m by perspective projection transformation, wherein m = 1, 2, ..., 360;
step 2: inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles; obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum; the BRNN being a bidirectional recurrent neural network;
wherein calculating the joint curvature K of the target three-dimensional model comprises the following steps:
letting n be the normal vector at a given point (x, y, z) on the target three-dimensional model R, letting p = ∂z/∂x and q = ∂z/∂y, and defining p_x, p_y, q_x, q_y as
p_x = ∂p/∂x, p_y = ∂p/∂y, q_x = ∂q/∂x, q_y = ∂q/∂y;
calculating, in the 3 × 3 neighborhood around the normal vector of each point on the three-dimensional model R, the average Gaussian curvature G̅_K and the average mean curvature M̅_K:
G̅_K = |C̅|, M̅_K = trace(C̅)/2,
where C̅ = [p̅_x, p̅_y; q̅_x, q̅_y] is the average curvature matrix, trace(·) is the trace of a matrix, and p̅, q̅, p̅_x, p̅_y, q̅_x, q̅_y are the averages of p, q, p_x, p_y, q_x, q_y over the 3 × 3 neighborhood;
and defining the joint curvature K of the target three-dimensional model R as a combination of G̅_K and M̅_K.
2. The three-dimensional target recognition method based on a curvature-feature recurrent neural network according to claim 1, wherein inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attribute under multiple viewing angles comprises the following steps:
obtaining the one-dimensional feature sequence T_S, S = 1, 2, ..., 360, of the 360° two-dimensional images; dividing the output of the feature sequence T_S at the i-th layer of the BRNN into a forward output h_i^f(S) and a reverse output h_i^b(S), which are related, respectively, to the forward output h_i^f(S−1) of the previous element of the sequence at this layer, the reverse output h_i^b(S+1) of the next element at this layer, and the forward output h_{i−1}^f(S) and reverse output h_{i−1}^b(S) of the (i−1)-th layer:
h_i^f(S) = tanh(W_i^f · h_i^f(S−1) + U_i^f · [h_{i−1}^f(S), h_{i−1}^b(S)] + b_i^f),
h_i^b(S) = tanh(W_i^b · h_i^b(S+1) + U_i^b · [h_{i−1}^f(S), h_{i−1}^b(S)] + b_i^b),
where W_i^f, U_i^f, W_i^b, U_i^b are connection weights, b_i^f and b_i^b are biases, and tanh is the neuron activation function;
the total output O_S of the feature sequence T_S at the BRNN, i.e. the input I_fc of the fully connected layer fc, being
O_S = I_fc = W_o^f · h^f(S) + W_o^b · h^b(S),
where W_o^f and W_o^b are the connection weights of the forward and reverse outputs at the fully connected layer.
3. The three-dimensional target recognition method based on a curvature-feature recurrent neural network according to claim 1, wherein obtaining, with the softmax function at the softmax layer, the recognition category for which the probability of the sequence attribute is maximum comprises the following steps:
calculating at the softmax layer, with the softmax function, the probability p(C_k) that the recognition result is the k-th class:
p(C_k) = exp(A_k) / Σ_{c=1..C} exp(A_c),
where C is the total number of recognition categories and A_k is the cumulative output of the sequence attribute of the k-th three-dimensional target at the fully connected layer fc;
and using maximum-likelihood estimation to minimise the loss function, i.e. to find the recognition category k with maximum probability p(C_k):
L = −Σ_{k=1..C} δ(k − r) · ln p(C_k), k* = argmax_k p(C_k),
where δ(·) is the Kronecker delta and r is the correct category of the feature sequence T_S.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611096314.1A CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611096314.1A CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108154066A CN108154066A (en) | 2018-06-12 |
CN108154066B true CN108154066B (en) | 2021-04-27 |
Family
ID=62470160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611096314.1A Active CN108154066B (en) | 2016-12-02 | 2016-12-02 | Three-dimensional target identification method based on curvature characteristic recurrent neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108154066B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109166183B (en) * | 2018-07-16 | 2023-04-07 | 中南大学 | Anatomical landmark point identification method and identification equipment |
CN109496316B (en) * | 2018-07-28 | 2022-04-01 | 合刃科技(深圳)有限公司 | Image recognition system |
CN109242955B (en) * | 2018-08-17 | 2023-03-24 | 山东师范大学 | Workpiece manufacturing characteristic automatic identification method and device based on single image |
CN109493354B (en) * | 2018-10-10 | 2021-08-06 | 中国科学院上海技术物理研究所 | Target two-dimensional geometric shape reconstruction method based on multi-view images |
CN110287783A (en) * | 2019-05-18 | 2019-09-27 | 天嗣智能信息科技(上海)有限公司 | A kind of video monitoring image human figure identification method |
CN117315397B (en) * | 2023-10-11 | 2024-06-07 | 电子科技大学 | Classification method for noise data containing labels based on class curvature |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770566A (en) * | 2008-12-30 | 2010-07-07 | 复旦大学 | Quick three-dimensional human ear identification method |
CN104166842A (en) * | 2014-07-25 | 2014-11-26 | 同济大学 | Three-dimensional palm print identification method based on partitioning statistical characteristic and combined expression |
CN104463111A (en) * | 2014-11-21 | 2015-03-25 | 天津工业大学 | Three-dimensional face recognition method fused with multi-scale feature region curvatures |
CN105205478A (en) * | 2015-10-23 | 2015-12-30 | 天津工业大学 | 3-dimensional human face recognition method integrating anthropometry and curvelet transform |
KR101592294B1 (en) * | 2014-09-03 | 2016-02-05 | 배재대학교 산학협력단 | Decimation Method For Complex Three Dimensional Polygonal Mesh Data |
CN106097431A (en) * | 2016-05-09 | 2016-11-09 | 王红军 | A kind of object global recognition method based on 3 d grid map |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9984473B2 (en) * | 2014-07-09 | 2018-05-29 | Nant Holdings Ip, Llc | Feature trackability ranking, systems and methods |
US10019784B2 (en) * | 2015-03-18 | 2018-07-10 | Toshiba Medical Systems Corporation | Medical image processing apparatus and method |
- 2016-12-02: CN201611096314.1A filed; patent CN108154066B granted and active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770566A (en) * | 2008-12-30 | 2010-07-07 | 复旦大学 | Quick three-dimensional human ear identification method |
CN104166842A (en) * | 2014-07-25 | 2014-11-26 | 同济大学 | Three-dimensional palm print identification method based on partitioning statistical characteristic and combined expression |
KR101592294B1 (en) * | 2014-09-03 | 2016-02-05 | 배재대학교 산학협력단 | Decimation Method For Complex Three Dimensional Polygonal Mesh Data |
CN104463111A (en) * | 2014-11-21 | 2015-03-25 | 天津工业大学 | Three-dimensional face recognition method fused with multi-scale feature region curvatures |
CN105205478A (en) * | 2015-10-23 | 2015-12-30 | 天津工业大学 | 3-dimensional human face recognition method integrating anthropometry and curvelet transform |
CN106097431A (en) * | 2016-05-09 | 2016-11-09 | 王红军 | A kind of object global recognition method based on 3 d grid map |
Non-Patent Citations (2)
Title |
---|
"Study on novel Curvature Features for 3D fingerprint recognition";Feng Liu 等;《Neurocomputing》;20151130;第168卷;第599-608页 * |
"基于模型的任意视点下三维目标识别研究";许俊峰;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160115(第01期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN108154066A (en) | 2018-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108154066B (en) | Three-dimensional target identification method based on curvature characteristic recurrent neural network | |
CN110866953B (en) | Map construction method and device, and positioning method and device | |
CN108898063B (en) | Human body posture recognition device and method based on full convolution neural network | |
CN108537191B (en) | Three-dimensional face recognition method based on structured light camera | |
Wei et al. | Applications of structure from motion: a survey | |
CN107424161B (en) | Coarse-to-fine indoor scene image layout estimation method | |
Tau et al. | Dense correspondences across scenes and scales | |
CN108564120B (en) | Feature point extraction method based on deep neural network | |
CN104077760A (en) | Rapid splicing system for aerial photogrammetry and implementing method thereof | |
CN105740775A (en) | Three-dimensional face living body recognition method and device | |
CN113361542B (en) | Local feature extraction method based on deep learning | |
CN106919944A (en) | A kind of wide-angle image method for quickly identifying based on ORB algorithms | |
CN111832484A (en) | Loop detection method based on convolution perception hash algorithm | |
CN110751097B (en) | Semi-supervised three-dimensional point cloud gesture key point detection method | |
CN105513094A (en) | Stereo vision tracking method and stereo vision tracking system based on 3D Delaunay triangulation | |
CN114998566A (en) | Interpretable multi-scale infrared small and weak target detection network design method | |
CN110120013A (en) | A kind of cloud method and device | |
CN116385660A (en) | Indoor single view scene semantic reconstruction method and system | |
He et al. | A generative feature-to-image robotic vision framework for 6D pose measurement of metal parts | |
Feng | Mask RCNN-based single shot multibox detector for gesture recognition in physical education | |
CN107330363A (en) | A kind of quick Internet advertising board detection method | |
Li et al. | Sparse-to-local-dense matching for geometry-guided correspondence estimation | |
CN104361573B (en) | The SIFT feature matching algorithm of Fusion of Color information and global information | |
Konishi et al. | Detection of target persons using deep learning and training data generation for Tsukuba challenge | |
CN117351078A (en) | Target size and 6D gesture estimation method based on shape priori |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CB03 | Change of inventor or designer information |
Inventors after change: Liang Wei, Li Yang, Zheng Meng, Peng Shiwei
Inventors before change: Liang Wei, Li Yang, Zheng Meng, Tan Jindong, Peng Shiwei