CN107944459A - An RGB-D object recognition method - Google Patents

An RGB-D object recognition method Download PDF

Info

Publication number
CN107944459A
CN107944459A (application CN201711315171.3A)
Authority
CN
China
Prior art keywords
image
feature
rgb
surface normal
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711315171.3A
Other languages
Chinese (zh)
Inventor
雷建军
倪敏
丛润民
侯春萍
陈越
牛力杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201711315171.3A priority Critical patent/CN107944459A/en
Publication of CN107944459A publication Critical patent/CN107944459A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an RGB-D object recognition method comprising the following steps: obtain a grayscale image generated from the color image and surface normals generated from the depth image, and take the color image, grayscale image, depth image, and surface normals together as multi-modal data; extract high-level features from the color image, grayscale image, and surface normals with a convolutional-recursive neural network; extract high-level features of the depth image with a convolutional-Fisher vector-recursive neural network; fuse the above high-level features to obtain the total feature of the object, and feed the total feature of the object into a feature classifier to perform object recognition. The invention fuses multiple data modalities to extract more accurate RGB-D object features and thereby improves the accuracy of object recognition.

Description

An RGB-D object recognition method
Technical field
The present invention relates to the technical fields of deep learning and stereoscopic vision, and in particular to an RGB-D object recognition method.
Background technology
Object recognition is one of the key technical problems in computer vision, with important research value and broad application prospects. With the further development and application of sensing technology, devices such as the Kinect camera, which can capture color and depth images simultaneously, are becoming the mainstream imaging equipment of a new generation. In general, a color image provides information such as the texture and color of a target, while a depth image provides effective depth and shape information; the two kinds of information complement each other and further enhance the performance of various visual tasks. How to fully exploit the depth information in RGB-D data, explore the relationship between depth and color data, and further improve object recognition rates is a focus and difficulty of current research. Therefore, research on object recognition techniques for RGB-D images has very important theoretical and application value.
From the perspective of feature generation, object recognition methods fall broadly into two classes: methods based on hand-crafted features and methods based on learned features. The main distinction between the two is how features are obtained: the former obtains features manually, while the latter extracts target features through learning. The obtained features are fed into a classifier (such as a support vector machine or random forest) for classification, thereby accomplishing the recognition task.
Among methods based on hand-crafted features, commonly used features include SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), and textons. These features can effectively describe the color and texture information of color images and the three-dimensional geometric information of depth images. However, the features extracted by such methods often have certain limitations: they capture only part of the available cues and do not extend easily to different datasets or modalities.
In contrast, learning-based methods can obtain features directly from raw data through learning, which is more flexible and reliable. Representative methods include hierarchical matching pursuit, convolutional k-means descriptors, hierarchical sparse coding, and local coordinate coding.
The above methods, however, usually process only the color and depth images, ignoring the useful contribution of other data modalities (such as grayscale images and surface normals) to recognition.
Summary of the invention
To address the low recognition accuracy and incomplete feature description of current RGB-D object recognition techniques, the present invention proposes an RGB-D object recognition method that fuses multiple data modalities to extract more accurate RGB-D object features and thereby improve object recognition accuracy, as described below:
An RGB-D object recognition method, the method comprising the following steps:
obtaining a grayscale image generated from the color image and surface normals generated from the depth image, and taking the color image, grayscale image, depth image, and surface normals together as multi-modal data;
extracting high-level features from the color image, grayscale image, and surface normals with a convolutional-recursive neural network;
extracting high-level features of the depth image with a convolutional-Fisher vector-recursive neural network;
fusing the above high-level features to obtain the total feature of the object, and feeding the total feature of the object into a feature classifier to perform object recognition.
The step of fusing the above high-level features is specifically:
F = [F_d; F_c; F_n; F_g]
where F is the total feature of the object, F_d is the depth-image feature, F_c is the color-image feature, F_n is the surface-normal feature, and F_g is the grayscale-image feature.
The feature classifier is specifically a softmax classifier.
The beneficial effects of the technical solution provided by the invention are:
1. The invention introduces a Fisher vector module on the basis of the convolutional-recursive neural network to obtain a denser, more complete depth feature representation;
2. The invention fuses multiple data modalities, effectively solving the problem of poor RGB-D object recognition caused by incomplete feature learning.
Brief description of the drawings
Fig. 1 is a flowchart of the RGB-D object recognition method;
Fig. 2 shows qualitative results.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below.
Embodiment 1
The embodiment combines multiple data modalities such as color images, depth images, grayscale images, and surface normals, and proposes an RGB-D object recognition method whose concrete steps are as follows:
101: obtain a grayscale image generated from the color image and surface normals generated from the depth image, and take the color image, grayscale image, depth image, and surface normals together as multi-modal data;
102: extract high-level features from the color image, grayscale image, and surface normals with a convolutional-recursive neural network;
103: extract high-level features of the depth image with a convolutional-Fisher vector-recursive neural network;
104: fuse the above high-level features to obtain the total feature of the object, and feed the total feature of the object into a feature classifier to perform object recognition.
In summary, through steps 101-104 the embodiment introduces a Fisher vector module on the basis of the convolutional-recursive neural network to obtain a denser, more complete depth feature representation, and fuses multiple data modalities, effectively solving the problem of poor RGB-D object recognition caused by incomplete feature learning.
Embodiment 2
The scheme of Embodiment 1 is described in further detail below with specific formulas and examples:
201: obtain multi-modal data;
To exploit the color and depth image information more fully, the embodiment adds two data modalities, namely a grayscale image generated from the color image and surface normals generated from the depth image, providing more useful information for object recognition. Specifically, the depth image and surface normals provide the geometric information of the object, while the color and grayscale images provide its texture information.
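The two added modalities can be derived with standard operations: the grayscale image as a luminance-weighted sum of the RGB channels, and the surface normal at each pixel from the local depth gradients. The following pure-Python sketch is illustrative only — the BT.601 luminance weights and the central-difference gradient estimate are common conventions, not choices specified by the patent:

```python
import math

def to_grayscale(rgb):
    # Grayscale from RGB via ITU-R BT.601 luminance weights (a common choice)
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def surface_normals(depth):
    # Normal at (y, x) from depth gradients: n ∝ (-dz/dx, -dz/dy, 1), normalized
    h, w = len(depth), len(depth[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dzdx = depth[y][min(x + 1, w - 1)] - depth[y][max(x - 1, 0)]
            dzdy = depth[min(y + 1, h - 1)][x] - depth[max(y - 1, 0)][x]
            n = (-dzdx, -dzdy, 1.0)
            norm = math.sqrt(sum(c * c for c in n))
            out[y][x] = tuple(c / norm for c in n)
    return out

gray = to_grayscale([[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]])
normals = surface_normals([[1.0, 1.0], [1.0, 1.0]])
print(gray[0][1])        # 0.0 — a black pixel maps to zero luminance
print(normals[0][0][2])  # 1.0 — a flat depth map yields upward normals
```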
202: extract high-level features of the color image, grayscale image, and surface normals with the convolutional-recursive neural network;
The convolutional-recursive neural network (CNN-RNN) model consists of two parts: a convolutional neural network and a recursive neural network. The raw image data is divided into blocks and fed into the convolutional neural network for feature extraction, yielding low-dimensional translation-invariant features.
To further improve recognition accuracy, the low-dimensional translation-invariant features output by the convolutional neural network are fed into the recursive neural network for further feature extraction. The recursive neural network effectively captures the hierarchical features of the input data and the structural information of the object, yielding a more accurate feature representation.
1) Convolutional neural network
A convolutional neural network is an effective feature extraction structure that mainly extracts features with translation-invariant characteristics by convolving the input image with filters.
The convolutional neural network used in the embodiment mainly comprises convolution and pooling operations. Convolution produces multiple feature maps that represent information about the object at different levels; pooling sub-samples the feature maps based on the principle of local correlation, retaining useful information while reducing the amount of data.
Let the input image be of size d_I × d_I and let K filters of size d_P × d_P be used for convolution; the convolution then yields K filter responses of size (d_I − d_P + 1) per side. Average pooling with a window of size d_l × d_l and stride s is then applied to the filter responses, producing pooled responses of size r × r, where r = (d_I − d_P − d_l + 1)/s + 1. Applying the convolutional neural network to an image therefore yields a three-dimensional matrix of size K × r × r.
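The size arithmetic above can be checked with a few lines of code; the concrete numbers below are illustrative, not taken from the patent:

```python
def cnn_output_sizes(d_i, d_p, d_l, s):
    # Valid convolution: each filter response is (d_i - d_p + 1) per side
    conv = d_i - d_p + 1
    # Average pooling with window d_l and stride s over the response
    r = (conv - d_l) // s + 1
    # Equivalent to the closed form in the text: r = (d_i - d_p - d_l + 1)/s + 1
    assert r == (d_i - d_p - d_l + 1) // s + 1
    return conv, r

# e.g. a 148x148 input, 9x9 filters, 10x10 pooling window, stride 5
conv, r = cnn_output_sizes(148, 9, 10, 5)
print(conv, r)  # 140 27 — with K filters this gives a K x 27 x 27 matrix
```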
2) Recursive neural network
The idea of the recursive neural network is to learn hierarchical features in a tree structure by recursively applying the same neural network, thereby obtaining more detailed structural information. In earlier research on recursive neural networks, the construction of the tree was more flexible, but this flexibility came at the cost of computation speed, making parallel search and large-scale parallel matrix operations difficult. The embodiment therefore adopts a recursive neural network with a fixed tree structure and extends the original network structure so that each layer merges adjacent vector blocks rather than only pairs of vectors. The recursive neural network structure used in the embodiment thereby obtains more neighborhood information and a more accurate object feature representation.
The recursive neural network takes the output of the convolutional neural network as input: every group of adjacent column vectors in the three-dimensional matrix output by the convolutional neural network is defined as a block, and multiple blocks are merged into a parent vector p. For convenience, the embodiment uses square blocks, whose size is denoted K × b × b; the parent vector can then be expressed as
p = f(W [m_1; m_2; …; m_{b²}])
where W is the coefficient matrix, f denotes a nonlinear operation (a common choice is tanh), and m_i (i = 1, 2, …, b²) are the child vectors.
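A minimal sketch of this merge step, assuming the tanh nonlinearity named above; the child vectors and the coefficient matrix W below are toy values, not learned parameters:

```python
import math

def parent_vector(children, W):
    # p = f(W [m_1; m_2; ...; m_{b^2}]) with f = tanh
    stacked = [v for m in children for v in m]  # concatenate the b^2 children
    return [math.tanh(sum(w * x for w, x in zip(row, stacked))) for row in W]

K, b = 2, 2  # K-dimensional child vectors, b x b block (b^2 = 4 children)
children = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
W = [[0.10] * (b * b * K),  # K x (b^2 * K) toy coefficient matrix
     [0.05] * (b * b * K)]
p = parent_vector(children, W)
print(len(p))  # 2 — the parent keeps the same dimension K as each child,
               # so the merge can be applied recursively up the fixed tree
```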
203: extract high-level features of the depth image with the convolutional-Fisher vector-recursive neural network;
The Fisher vector is an advanced feature coding method: a parametric model (such as a Gaussian mixture model) estimated from data samples is used to encode a set of local feature vectors into a higher-dimensional feature representation.
Let X = {x_k}, k = 1, …, K be the feature descriptors obtained from the convolutional neural network. A Gaussian mixture model with diagonal covariance matrices is trained:
p(x) = Σ_{i=1}^{N} ω_i N(x; μ_i, σ_i)
where K is the dimension of the feature mapping, {ω_i, μ_i, σ_i}, i = 1, 2, …, N denote the mixture weights, means, and diagonal covariances of the Gaussian mixture model, and N is the number of Gaussian components. γ_k(i) denotes the posterior (soft assignment) of descriptor x_k to the i-th Gaussian. The depth-image feature extracted via the Fisher vector is therefore composed of the gradient statistics with respect to the means and the covariances, i.e., F_d = [G_μ; G_σ].
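For a one-dimensional descriptor set, the Fisher vector can be sketched directly from the standard gradient formulas with respect to the means and standard deviations; the GMM parameters below are toy values, and the 1/(T·sqrt(ω_i)) normalization follows the common derivation rather than anything stated in the patent:

```python
import math

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fisher_vector(X, weights, mus, sigmas):
    # Gradient statistics of a diagonal GMM, accumulated with posteriors gamma_k(i)
    T, N = len(X), len(weights)
    G_mu, G_sigma = [0.0] * N, [0.0] * N
    for x in X:
        likes = [w * gaussian(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
        total = sum(likes)
        for i in range(N):
            gamma = likes[i] / total             # posterior gamma_k(i)
            u = (x - mus[i]) / sigmas[i]
            G_mu[i] += gamma * u                 # gradient w.r.t. the mean
            G_sigma[i] += gamma * (u * u - 1.0)  # gradient w.r.t. the std. dev.
    G_mu = [g / (T * math.sqrt(w)) for g, w in zip(G_mu, weights)]
    G_sigma = [g / (T * math.sqrt(2.0 * w)) for g, w in zip(G_sigma, weights)]
    return G_mu + G_sigma                        # F = [G_mu; G_sigma]

X = [0.1, -0.2, 1.9, 2.1]
fv = fisher_vector(X, weights=[0.5, 0.5], mus=[0.0, 2.0], sigmas=[1.0, 1.0])
print(len(fv))  # 4 — two Gaussians, two statistics each
```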
204: multi-feature fusion;
To exploit the color and depth image information more fully, the embodiment adds two data modalities, the grayscale image and the surface normals. As supplements to the color image and the depth image respectively, the grayscale image and surface normals provide more useful information for object recognition. After feature extraction with the convolutional-recursive neural network on the three modalities of color, grayscale, and surface normals, the results are fused with the depth feature extracted by the convolutional-Fisher vector-recursive neural network to obtain the final object feature, denoted:
F = [F_d; F_c; F_n; F_g]
where F is the total feature of the object, F_d is the depth-image feature, F_c is the color-image feature, F_n is the surface-normal feature, and F_g is the grayscale-image feature.
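The fusion is a plain concatenation of the four per-modality features; a trivial sketch with illustrative toy vectors (the dimensions are not from the patent):

```python
def fuse(F_d, F_c, F_n, F_g):
    # F = [F_d; F_c; F_n; F_g] — concatenation in the order given in the text
    return F_d + F_c + F_n + F_g

F_d = [0.1, 0.2, 0.3]  # depth feature (CNN-Fisher vector-RNN)
F_c = [0.4, 0.5]       # color feature (CNN-RNN)
F_n = [0.6]            # surface-normal feature (CNN-RNN)
F_g = [0.7, 0.8]       # grayscale feature (CNN-RNN)
F = fuse(F_d, F_c, F_n, F_g)
print(F)  # [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
```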
205: feed the object feature F into the feature classifier to perform object recognition.
Considering the computational simplicity of the softmax classifier, the embodiment uses a softmax classifier to perform object recognition. Softmax classifiers are well known to those skilled in the art and are not described further here.
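A softmax classifier over a linear scoring layer can be sketched as follows; the weights are toy values, not trained parameters:

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the maximum before exponentiating
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, weight_rows):
    # One linear score per class, then softmax to obtain class probabilities
    scores = [sum(w * x for w, x in zip(row, feature)) for row in weight_rows]
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__), probs

label, probs = classify([1.0, 2.0], [[0.5, 0.1], [0.2, 0.9]])
print(label)  # 1 — the class with the higher score (2.0 vs 0.7)
```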
In summary, through steps 201-205 the embodiment introduces a Fisher vector module on the basis of the convolutional-recursive neural network to obtain a denser, more complete depth feature representation, and fuses multiple data modalities, effectively solving the problem of poor RGB-D object recognition caused by incomplete feature learning.
Embodiment 3
The feasibility of the schemes in Embodiments 1 and 2 is verified below with reference to Fig. 2:
Fig. 2 shows the visualization of the method's recognition results as a confusion matrix. The horizontal axis of the confusion matrix represents the predicted object category (51 classes in total, such as apple, bowl, and cereal_box), and the vertical axis represents the true object category in the dataset.
The values of the diagonal elements of the confusion matrix represent the recognition accuracy of the method for each category, while the element in row a, column b gives the percentage of class-a objects misidentified as class b. As can be seen from Fig. 2, the method achieves good recognition results, with high recognition rates for most categories.
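Per-class accuracy is read off the diagonal of such a confusion matrix; a small sketch with a toy 3-class matrix (the 51-class matrix of Fig. 2 is not reproduced here):

```python
def per_class_accuracy(confusion):
    # Rows: true class, columns: predicted class, entries in percent.
    # The diagonal entry of each row is that class's recognition accuracy.
    return [row[i] for i, row in enumerate(confusion)]

cm = [[90.0, 5.0, 5.0],    # e.g. 5% of class 0 misidentified as class 1
      [10.0, 85.0, 5.0],
      [0.0, 12.0, 88.0]]
acc = per_class_accuracy(cm)
print(acc)                  # [90.0, 85.0, 88.0]
print(sum(acc) / len(acc))  # mean per-class accuracy
```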
The embodiment compares the method against baseline methods on the RGB-D database to obtain object recognition accuracy. The RGB-D database consists of video sequences collected with a Kinect camera in indoor environments such as kitchens and offices, covering 300 different objects in 51 classes. Each object instance was captured from three viewing angles (30°, 45°, and 60° above the horizontal). The experimental results show that the recognition accuracy of the method reaches 87.60%, while the Random Forest method achieves 79.60% and the Linear SVM method achieves 81.90%. The method thus obtains a 10.05% performance gain over Random Forest and a 5.70% gain over Linear SVM, demonstrating excellent algorithm performance.
The Random Forest and Linear SVM methods are algorithms well known to those skilled in the art and are not described further here.
Those skilled in the art will appreciate that the drawings are schematic diagrams of a preferred embodiment, and that the serial numbers of the embodiments are for description only and do not indicate the relative merits of the embodiments.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (3)

1. An RGB-D object recognition method, characterized in that the method comprises the following steps:
obtaining a grayscale image generated from the color image and surface normals generated from the depth image, and taking the color image, grayscale image, depth image, and surface normals together as multi-modal data;
extracting high-level features from the color image, grayscale image, and surface normals with a convolutional-recursive neural network;
extracting high-level features of the depth image with a convolutional-Fisher vector-recursive neural network;
fusing the above high-level features to obtain the total feature of the object, and feeding the total feature of the object into a feature classifier to perform object recognition.
2. The RGB-D object recognition method according to claim 1, characterized in that the step of fusing the above high-level features is specifically:
F = [F_d; F_c; F_n; F_g]
where F is the total feature of the object, F_d is the depth-image feature, F_c is the color-image feature, F_n is the surface-normal feature, and F_g is the grayscale-image feature.
3. The RGB-D object recognition method according to claim 1, characterized in that the feature classifier is specifically a softmax classifier.
CN201711315171.3A 2017-12-09 2017-12-09 An RGB-D object recognition method Pending CN107944459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711315171.3A CN107944459A (en) 2017-12-09 2017-12-09 An RGB-D object recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711315171.3A CN107944459A (en) 2017-12-09 2017-12-09 An RGB-D object recognition method

Publications (1)

Publication Number Publication Date
CN107944459A true CN107944459A (en) 2018-04-20

Family

ID=61943773

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711315171.3A Pending CN107944459A (en) 2017-12-09 2017-12-09 An RGB-D object recognition method

Country Status (1)

Country Link
CN (1) CN107944459A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389621A * 2018-09-11 2019-02-26 淮阴工学院 RGB-D target tracking method based on multi-modal depth feature fusion
CN109685842A * 2018-12-14 2019-04-26 电子科技大学 Sparse depth densification method based on multi-scale networks
CN109886102A * 2019-01-14 2019-06-14 华中科技大学 Fall behavior spatio-temporal detection method based on depth images
CN110119710A * 2019-05-13 2019-08-13 广州锟元方青医疗科技有限公司 Cell classification method, device, computer equipment, and storage medium
CN110807798A * 2018-08-03 2020-02-18 华为技术有限公司 Image recognition method, system, related device, and computer-readable storage medium
CN111222468A * 2020-01-08 2020-06-02 浙江光珀智能科技有限公司 People flow detection method and system based on deep learning
CN111476816A * 2019-09-29 2020-07-31 深圳市捷高电子科技有限公司 Intelligent and efficient simultaneous recognition method for multiple objects
CN113065521A * 2021-04-26 2021-07-02 北京航空航天大学杭州创新研究院 Object recognition method, device, apparatus, and medium
CN113240653A * 2021-05-19 2021-08-10 中国联合网络通信集团有限公司 Rice quality detection method, device, server, and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778810A * 2016-11-23 2017-05-31 北京联合大学 Original-image-level fusion method and system based on RGB features and depth features
CN106826815A * 2016-12-21 2017-06-13 江苏物联网研究发展中心 Target object recognition and localization method based on color images and depth images

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807798A * 2018-08-03 2020-02-18 华为技术有限公司 Image recognition method, system, related device, and computer-readable storage medium
CN110807798B * 2018-08-03 2022-04-12 华为技术有限公司 Image recognition method, system, related device, and computer-readable storage medium
CN109389621A * 2018-09-11 2019-02-26 淮阴工学院 RGB-D target tracking method based on multi-modal depth feature fusion
CN109389621B * 2018-09-11 2021-04-06 淮阴工学院 RGB-D target tracking method based on multi-modal depth feature fusion
CN109685842A * 2018-12-14 2019-04-26 电子科技大学 Sparse depth densification method based on multi-scale networks
CN109886102A * 2019-01-14 2019-06-14 华中科技大学 Fall behavior spatio-temporal detection method based on depth images
CN110119710A * 2019-05-13 2019-08-13 广州锟元方青医疗科技有限公司 Cell classification method, device, computer equipment, and storage medium
CN111476816A * 2019-09-29 2020-07-31 深圳市捷高电子科技有限公司 Intelligent and efficient simultaneous recognition method for multiple objects
CN111222468A * 2020-01-08 2020-06-02 浙江光珀智能科技有限公司 People flow detection method and system based on deep learning
CN113065521A * 2021-04-26 2021-07-02 北京航空航天大学杭州创新研究院 Object recognition method, device, apparatus, and medium
CN113065521B * 2021-04-26 2024-01-26 北京航空航天大学杭州创新研究院 Object recognition method, device, apparatus, and medium
CN113240653A * 2021-05-19 2021-08-10 中国联合网络通信集团有限公司 Rice quality detection method, device, server, and system

Similar Documents

Publication Publication Date Title
CN107944459A (en) An RGB-D object recognition method
CN109543606B (en) Face recognition method with attention mechanism
Wang et al. SaliencyGAN: Deep learning semisupervised salient object detection in the fog of IoT
Li et al. Building-a-nets: Robust building extraction from high-resolution remote sensing images with adversarial networks
CN107844795B (en) Convolutional neural network feature extraction method based on principal component analysis
CN104573731B (en) Fast target detection method based on convolutional neural networks
CN111243093B (en) Three-dimensional face mesh generation method, device, equipment, and storage medium
CN107808129A (en) Facial multi-feature-point localization method based on a single convolutional neural network
CN110738207A (en) Character detection method fusing character region edge information in character images
CN103810504B (en) Image processing method and device
CN106981080A (en) Scene depth estimation method for night unmanned vehicles based on infrared images and radar data
CN110246181B (en) Anchor-based pose estimation model training method, pose estimation method, and system
CN108334830A (en) Scene recognition method based on deep feature fusion of target semantics and appearance
Li et al. LPSNet: a novel log path signature feature based hand gesture recognition framework
CN106650630A (en) Target tracking method and electronic equipment
CN105205453B (en) Human eye detection and localization method based on a deep autoencoder
CN107103613A (en) Three-dimensional gesture pose estimation method
CN104318570A (en) Adaptive camouflage design method based on background
CN104298974A (en) Human behavior recognition method based on depth video sequences
CN109712127A (en) Power transmission line fault detection method for machine inspection video streams
CN115330940B (en) Three-dimensional reconstruction method, device, equipment, and medium
CN101794459A (en) Seamless integration method of stereoscopic vision images and three-dimensional virtual objects
CN113537180B (en) Tree obstacle recognition method and device, computer equipment, and storage medium
CN112329771B (en) Building material sample recognition method based on deep learning
CN107066979A (en) Human motion recognition method based on depth information and multi-dimensional convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180420