CN109784297A - Deep-learning-based three-dimensional target recognition and optimal grasping method - Google Patents

Deep-learning-based three-dimensional target recognition and optimal grasping method

Info

Publication number
CN109784297A
CN109784297A (application CN201910077632.0A)
Authority
CN
China
Prior art keywords
target object
image
convolutional layer
optimal grasp
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910077632.0A
Other languages
Chinese (zh)
Inventor
陈丹
林清泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201910077632.0A
Publication of CN109784297A
Pending legal status


Abstract

The present invention relates to a deep-learning-based method for three-dimensional target recognition and optimal grasping. The method first acquires an image with a Kinect camera. An improved Faster R-CNN model is then constructed in the first stage to recognize and localize the target object in the image; the region containing the target object is segmented out and rotated accordingly. Finally, a new Faster R-CNN model is constructed in the second stage to obtain the coordinates and rotation angle of the target object's optimal grasping position, realizing the optimal grasp. The method improves the model in the target-recognition stage, strengthening the recognition and localization of objects that appear small in the image, and uses a bisection-based method to determine the target object's pose, reducing run time and improving precision. By first detecting the whole target object and then searching for the object's optimal pose within the detected region, the method not only narrows the range over which features must be sought but also reduces the probability of recognition errors.

Description

Deep-learning-based three-dimensional target recognition and optimal grasping method
Technical field
The present invention relates to the technical field of robot vision, and in particular to a deep-learning-based three-dimensional target recognition and optimal grasping method.
Background technique
With the rapid development of robotics and the spread of machine vision applications, robotic grasping has changed considerably, moving from the simple grasping of the past to today's intelligent recognition, optimal grasping, and response to the external environment. These changes indicate that robots are gradually developing toward intelligence, so that their various actions increasingly resemble those of humans.
In the field of machine vision, target detection and optimal grasping have both been research hotspots in recent years. Optimal grasping requires a robot, like a human, not only to recognize the class of the target object but also to find the pose in which the object is easier to grasp. Traditional grasp-pose methods process image information with traditional feature-extraction methods, generally hand-designed by engineers for a particular problem. Because they are affected by the target object's shape, size, and angle changes and by external lighting, the extracted features generalize poorly and lack robustness, making it difficult to adapt to new objects. Compared with traditional feature extraction, the advantage of deep learning is that no particular features need to be specified manually; instead, a general learning procedure lets the model learn the target object's features from large-scale data. Deep learning is therefore applied to robotic target detection and optimal grasping.
At present, in the field of deep-learning-based robotic optimal grasping, many methods can realize the optimal grasp of a target object, but it remains difficult to grasp objects that appear small in the image, and processing times are long.
Summary of the invention
The purpose of the present invention is to provide a deep-learning-based three-dimensional target recognition and optimal grasping method that solves the problems of grasping objects that appear small in the image and of long processing times.
To achieve the above object, the technical scheme of the invention is a deep-learning-based three-dimensional target recognition and optimal grasping method comprising the following steps:
Step S1: acquire an image with a Kinect camera and perform image preprocessing;
Step S2: construct an improved Faster R-CNN model in the first stage, recognize and localize the target object in the image, segment out the region containing the target object, and rotate it accordingly;
Step S3: construct a new Faster R-CNN model in the second stage, obtain the coordinates and rotation angle of the target object's optimal grasping position, and realize the optimal grasp of the target object.
Further, in step S1, the image preprocessing includes extracting contours from the depth image and adding the pixel values of the contour image, the color image, and the depth image, each weighted by a predetermined ratio, to obtain a fused image.
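As a concrete illustration of this preprocessing step, the sketch below fuses a contour map, a color image, and a depth image by weighted pixel addition. The weights, image sizes, and the gradient-threshold contour stand-in are assumptions made for the example — the patent only states that the three images are added with predetermined weights, and a real pipeline would more likely extract contours with OpenCV:

```python
import numpy as np

def fuse_images(color, depth, w_contour=0.2, w_color=0.5, w_depth=0.3):
    """Weighted pixel-value fusion of contour, color, and depth images.
    The weights are illustrative placeholders, not values from the patent."""
    # Crude contour stand-in: mark strong depth gradients as edges.
    # A real pipeline would more likely use cv2.Canny / cv2.findContours.
    gy, gx = np.gradient(depth.astype(np.float64))
    contour = (np.hypot(gx, gy) > 10.0).astype(np.float64) * 255.0

    gray = color.astype(np.float64).mean(axis=2)   # collapse RGB for fusion
    fused = w_contour * contour + w_color * gray + w_depth * depth
    return np.clip(fused, 0, 255).astype(np.uint8)

# Toy example: a flat scene with a raised square "object" in the middle.
depth = np.zeros((64, 64))
depth[20:44, 20:44] = 120
color = np.full((64, 64, 3), 100, dtype=np.uint8)
fused = fuse_images(color, depth)
```

Inside the object the fused value combines the gray and depth terms (0.5·100 + 0.3·120 = 86), while the background stays at 50, so the object and its contour stand out in the fused image.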
Further, in step S2, the improved Faster R-CNN model includes, connected in sequence, convolutional layer 1, pooling layer 1, convolutional layer 2, convolutional layer 3, convolutional layer 4, pooling layer 2, convolutional layer 5, and a fusion layer, with convolutional layer 3 also connected to the fusion layer. To enable recognition of objects with small image footprints, the parameters of convolutional layers 2 and 4 are adjusted: convolutional layer 2 uses a 3*3 convolution kernel with edge padding 1 and stride 1, and convolutional layer 4 uses a 5*5 convolution kernel with edge padding 2 and stride 2.
Further, in step S2, the specific process of constructing an improved Faster R-CNN model in the first stage to recognize and localize the target object in the image is: the fused image obtained from the image preprocessing of step S1 is input into the improved Faster R-CNN model, features are extracted and fused through the five convolutional layers, and the result is compared with the trained class features to obtain the class and position of the target object.
Further, in step S2, the specific process of segmenting out the region containing the target object and rotating it accordingly is:
Step S21: find the contour of the target object based on the recognition and localization produced by the improved Faster R-CNN model, and segment the target object from the image;
Step S22: enclose the contour of the target object with a minimum bounding rectangle and determine the rotation range;
Step S23: rotate the image multiple times using the bisection method.
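Step S22 can be sketched as follows. The patent does not specify how the minimum bounding rectangle is computed (OpenCV's `cv2.minAreaRect` would be the usual choice), so this illustration brute-forces the rectangle orientation over candidate angles; the contour points, the angular step, and the β value are invented for the example:

```python
import numpy as np

def min_area_rect_angle(points, step=1.0):
    """Brute-force stand-in for cv2.minAreaRect: rotate the contour points
    through 0..90 degrees and keep the angle whose axis-aligned bounding
    box has the smallest area. Returns (angle_deg, area)."""
    pts = np.asarray(points, dtype=np.float64)
    best = (0.0, np.inf)
    for deg in np.arange(0.0, 90.0, step):
        t = np.radians(deg)
        rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        q = pts @ rot.T
        w = q[:, 0].max() - q[:, 0].min()
        h = q[:, 1].max() - q[:, 1].min()
        if w * h < best[1]:
            best = (deg, w * h)
    return best

# A 40x20 rectangle tilted by 30 degrees: the minimal bounding box is
# recovered when the search rotates it back by 30 degrees.
t = np.radians(-30.0)
rect = np.array([[0, 0], [40, 0], [40, 20], [0, 20]], dtype=np.float64)
tilted = rect @ np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]).T
angle, area = min_area_rect_angle(tilted)

# The rotation range for the bisection search of step S23 is then
# (alpha - beta, alpha + beta) around the detected angle alpha.
beta = 16.0
rot_range = (angle - beta, angle + beta)
```

For the tilted 40x20 rectangle the search recovers the 30° orientation and the 800-unit minimal area, and the rotation range handed to step S23 is (14°, 46°).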
Compared with the prior art, the present invention has the following beneficial effects. The invention uses a cascaded Faster R-CNN model to realize the optimal grasp of a target object. The first-stage improved Faster R-CNN model preserves the features of the target object as far as possible, realizing recognition and localization of objects that appear small in the image. The second-stage Faster R-CNN model uses the bisection method to determine the target object's pose accurately; it is suitable for imaged objects of arbitrary size, improves pose accuracy, and reduces processing time. The method has good application prospects for industrial automation and intelligence.
Detailed description of the invention
Fig. 1 is a schematic diagram of the improved Faster R-CNN model of the present invention.
Fig. 2 is a flow chart of the robot optimal grasping system.
Specific embodiment
The technical solution of the present invention is described in detail below with reference to the accompanying drawings.
The present invention provides a deep-learning-based three-dimensional target recognition and optimal grasping method comprising the following steps:
Step S1: acquire an image with a Kinect camera and perform image preprocessing;
Step S2: construct an improved Faster R-CNN model in the first stage, recognize and localize the target object in the image, segment out the region containing the target object, and rotate it accordingly;
Step S3: construct a new Faster R-CNN model in the second stage, obtain the coordinates and rotation angle of the target object's optimal grasping position, and realize the optimal grasp of the target object.
In step S2, the improved Faster R-CNN model includes, connected in sequence, convolutional layer 1, pooling layer 1, convolutional layer 2, convolutional layer 3, convolutional layer 4, pooling layer 2, convolutional layer 5, and a fusion layer, with convolutional layer 3 also connected to the fusion layer. To enable recognition of objects with small image footprints, the parameters of convolutional layers 2 and 4 are adjusted. Before the improvement, convolutional layer 2 used a 5*5 convolution kernel with edge padding 2 and stride 2, and convolutional layer 4 used a 3*3 convolution kernel with edge padding 1 and stride 1. After the improvement, convolutional layer 2 uses a 3*3 convolution kernel with edge padding 1 and stride 1, and convolutional layer 4 uses a 5*5 convolution kernel with edge padding 2 and stride 2.
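The effect of swapping the two layers' parameters can be checked with the standard convolution output-size formula, out = ⌊(in + 2·pad − kernel) / stride⌋ + 1. The sketch below traces only the two adjusted layers (the 224-pixel input size and the omission of the pooling layers and convolutional layers 1, 3, and 5 are simplifications for illustration): after the improvement the stride-2 downsampling happens at convolutional layer 4 instead of convolutional layer 2, so the intermediate layers see a full-resolution feature map, which is what preserves detail for small objects:

```python
def conv_out(size, kernel, pad, stride):
    """Standard convolution output-size formula."""
    return (size + 2 * pad - kernel) // stride + 1

def trace(size, layers):
    """Feed a spatial size through a list of (kernel, pad, stride) layers."""
    sizes = [size]
    for k, p, s in layers:
        sizes.append(conv_out(sizes[-1], k, p, s))
    return sizes

# conv2 then conv4, before and after the improvement described above.
before = trace(224, [(5, 2, 2), (3, 1, 1)])  # downsamples immediately
after  = trace(224, [(3, 1, 1), (5, 2, 2)])  # downsamples one layer later
```

Both configurations end at a 112-pixel map, but in the improved ordering the map entering the intermediate layers is still 224 pixels rather than 112, so small-object features survive one stage longer.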
The following is a specific implementation process of the invention.
As shown in Fig. 2, the deep-learning-based three-dimensional target recognition and optimal grasping method of the invention includes the following steps:
S1: acquire an image with a Kinect camera and preprocess it;
S2: feed the image into the first-stage improved Faster R-CNN model to detect the class and position of the target object;
S3: segment the image and rotate it;
S4: feed the result into the second-stage Faster R-CNN model to obtain the position and posture of the target object's optimal grasp.
During image preprocessing in S1, contours are extracted from the depth image, and the pixel values of the contour image, the color image, and the depth image, each weighted by a certain ratio, are added together to obtain a fused image.
When the image is input into the first-stage improved Faster R-CNN model in S2, the fused image obtained in S1 is fed into the improved model; features are extracted and fused through the five feature-extraction layers and then compared with the trained class features to obtain the class and position of the target object.
During segmentation in S3, the image is cropped according to the rectangular box output by the first-stage Faster R-CNN model, separating the target object from the overall background. During rotation in S3, the target object is first enclosed with a minimum bounding rectangle. Suppose the smaller angle between the short side of the minimum rectangle and the x-axis is α°; choosing an appropriate rotation step β° then fixes the rotation range as (α−β)° to (α+β)°. The images rotated by (α−β)° and (α+β)° are fed into the second-stage Faster R-CNN model, which outputs the probability that the rectangular box in each image is the optimal grasping pose. The output probabilities at (α−β)° and (α+β)° are compared, and the angle with the higher probability (say (α+β)°) forms a new rotation range together with α°. The bisection method then takes the midpoint (α+β/2)° between α° and (α+β)°, feeds the images rotated by α° and (α+β)° into the trained model, compares their output probabilities, and lets the higher-probability angle form a new rotation range with (α+β/2)°, and so on. Only about six rotations (yielding six rotated images) are needed to raise the precision of the attitude angle to 1°.
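The bisection search described above can be sketched as follows. The `score` function is a toy stand-in for the second-stage Faster R-CNN (which would rotate the image and return the optimal-grasp probability), and the α, β, and peak values are invented for the example:

```python
def bisect_grasp_angle(score, alpha, beta, iters=6):
    """Bisection search for the rotation angle with the highest
    optimal-grasp probability, following the procedure above.

    score(angle) stands in for the second-stage network. Starting from
    the range (alpha - beta, alpha + beta), each iteration compares the
    two endpoints and keeps the half next to the better one, so six
    iterations shrink e.g. a 64-degree range down to 1 degree.
    """
    lo, hi = alpha - beta, alpha + beta
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if score(lo) > score(hi):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Mock scorer peaked at 23 degrees (a real system would query the
# network on each rotated image instead).
peak = 23.0
score = lambda a: -abs(a - peak)
best = bisect_grasp_angle(score, alpha=20.0, beta=32.0)
```

With a 64° initial range, six halvings leave a 1° interval, matching the precision claim in the text; here the search lands within 1° of the 23° peak.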
When the position and posture of the target object's optimal grasp are obtained in S4, the position is the coordinate of the optimal grasp in the rotated image from S3 and must be transformed back into the original image acquired in S1. The posture of the target object is computed from the rotation angle used in S3.
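The coordinate transfer back to the original image can be sketched as an inverse rotation about the image center. The patent does not spell out the transform, so the center point and angles below are illustrative assumptions (and a real implementation must also account for any cropping offset from the segmentation step):

```python
import math

def to_original_coords(pt, angle_deg, center):
    """Map a grasp point found in the rotated image back into the
    original image, assuming S3 rotated the image by angle_deg about
    `center` (the standard inverse rotation about the image center)."""
    t = math.radians(angle_deg)
    x, y = pt[0] - center[0], pt[1] - center[1]
    # The inverse of a rotation by t is a rotation by -t.
    xo = x * math.cos(t) + y * math.sin(t)
    yo = -x * math.sin(t) + y * math.cos(t)
    return (xo + center[0], yo + center[1])

# Sanity check: rotate a point forward by 30 degrees about the center,
# then map it back to its original location.
cx, cy = 320.0, 240.0
t = math.radians(30.0)
x, y = 400.0 - cx, 300.0 - cy
rotated = (x * math.cos(t) - y * math.sin(t) + cx,
           y * math.cos(t) + x * math.sin(t) + cy)
orig = to_original_coords(rotated, 30.0, (cx, cy))
```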
The above are preferred embodiments of the present invention. All changes made according to the technical solution of the present invention that do not depart from its scope of function and effect belong to the protection scope of the present invention.

Claims (5)

1. A deep-learning-based three-dimensional target recognition and optimal grasping method, characterized by comprising the following steps:
Step S1: acquire an image with a Kinect camera and perform image preprocessing;
Step S2: construct an improved Faster R-CNN model in the first stage, recognize and localize the target object in the image, segment out the region containing the target object, and rotate it accordingly;
Step S3: construct a new Faster R-CNN model in the second stage, obtain the coordinates and rotation angle of the target object's optimal grasping position, and realize the optimal grasp of the target object.
2. The deep-learning-based three-dimensional target recognition and optimal grasping method according to claim 1, characterized in that, in step S1, the image preprocessing includes extracting contours from the depth image and adding the pixel values of the contour image, the color image, and the depth image, each weighted by a predetermined ratio, to obtain a fused image.
3. The deep-learning-based three-dimensional target recognition and optimal grasping method according to claim 1, characterized in that, in step S2, the improved Faster R-CNN model includes, connected in sequence, convolutional layer 1, pooling layer 1, convolutional layer 2, convolutional layer 3, convolutional layer 4, pooling layer 2, convolutional layer 5, and a fusion layer, with convolutional layer 3 also connected to the fusion layer; to enable recognition of objects with small image footprints, the parameters of convolutional layers 2 and 4 are adjusted as follows: convolutional layer 2 uses a 3*3 convolution kernel with edge padding 1 and stride 1, and convolutional layer 4 uses a 5*5 convolution kernel with edge padding 2 and stride 2.
4. The deep-learning-based three-dimensional target recognition and optimal grasping method according to claim 3, characterized in that, in step S2, the specific process of constructing an improved Faster R-CNN model in the first stage to recognize and localize the target object in the image is: the fused image obtained from the image preprocessing of step S1 is input into the improved Faster R-CNN model, features are extracted and fused through the five convolutional layers, and the result is compared with the trained class features to obtain the class and position of the target object.
5. The deep-learning-based three-dimensional target recognition and optimal grasping method according to claim 1, characterized in that, in step S2, the specific process of segmenting out the region containing the target object and rotating it accordingly is:
Step S21: find the contour of the target object based on the recognition and localization produced by the improved Faster R-CNN model, and segment the target object from the image;
Step S22: enclose the contour of the target object with a minimum bounding rectangle and determine the rotation range;
Step S23: rotate the image multiple times using the bisection method.
CN201910077632.0A 2019-01-26 2019-01-26 Deep-learning-based three-dimensional target recognition and optimal grasping method Pending CN109784297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910077632.0A CN109784297A (en) 2019-01-26 2019-01-26 Deep-learning-based three-dimensional target recognition and optimal grasping method


Publications (1)

Publication Number Publication Date
CN109784297A (en) 2019-05-21

Family

ID=66502737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910077632.0A Pending CN109784297A (en) 2019-01-26 2019-01-26 Deep-learning-based three-dimensional target recognition and optimal grasping method

Country Status (1)

Country Link
CN (1) CN109784297A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349138A (en) * 2019-06-28 2019-10-18 歌尔股份有限公司 The detection method and device of the target object of Case-based Reasoning segmentation framework
CN111145257A (en) * 2019-12-27 2020-05-12 深圳市越疆科技有限公司 Article grabbing method and system and article grabbing robot
CN113657551A (en) * 2021-09-01 2021-11-16 陕西工业职业技术学院 Robot grabbing posture task planning method for sorting and stacking multiple targets
CN113838144A (en) * 2021-09-14 2021-12-24 杭州印鸽科技有限公司 Method for positioning object on UV printer based on machine vision and deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105619413A (en) * 2016-04-01 2016-06-01 芜湖哈特机器人产业技术研究院有限公司 Automatic grabbing device for inner-ring workpieces and control method of automatic grabbing device
CN108010078A (en) * 2017-11-29 2018-05-08 中国科学技术大学 A kind of grasping body detection method based on three-level convolutional neural networks
CN108510062A (en) * 2018-03-29 2018-09-07 东南大学 A kind of robot irregular object crawl pose rapid detection method based on concatenated convolutional neural network
CN108648233A (en) * 2018-03-24 2018-10-12 北京工业大学 A kind of target identification based on deep learning and crawl localization method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Kevin Lai et al., "A Large-Scale Hierarchical Multi-View RGB-D Object Dataset", IEEE *
Qingquan Lin, Dan Chen, "Target Recognition and Optimal Grasping Based on Deep Learning", IEEE *


Similar Documents

Publication Publication Date Title
CN109410321B (en) Three-dimensional reconstruction method based on convolutional neural network
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN109784297A (en) A kind of Three-dimensional target recognition based on deep learning and Optimal Grasp method
CN112991447B (en) Visual positioning and static map construction method and system in dynamic environment
CN110827398B (en) Automatic semantic segmentation method for indoor three-dimensional point cloud based on deep neural network
CN108648240A (en) Based on a non-overlapping visual field camera posture scaling method for cloud characteristics map registration
CN113065546B (en) Target pose estimation method and system based on attention mechanism and Hough voting
CN111553949B (en) Positioning and grabbing method for irregular workpiece based on single-frame RGB-D image deep learning
CN111209915A (en) Three-dimensional image synchronous identification and segmentation method based on deep learning
CN108537848A (en) A kind of two-stage pose optimal estimating method rebuild towards indoor scene
CN107369204B (en) Method for recovering basic three-dimensional structure of scene from single photo
CN109740537B (en) Method and system for accurately marking attributes of pedestrian images in crowd video images
CN111368759B (en) Monocular vision-based mobile robot semantic map construction system
CN110135277B (en) Human behavior recognition method based on convolutional neural network
CN112163588A (en) Intelligent evolution-based heterogeneous image target detection method, storage medium and equipment
CN113011288A (en) Mask RCNN algorithm-based remote sensing building detection method
CN112766136A (en) Space parking space detection method based on deep learning
CN111709317B (en) Pedestrian re-identification method based on multi-scale features under saliency model
Zou et al. Microarray camera image segmentation with Faster-RCNN
CN114898041A (en) Improved ICP method based on luminosity error
CN116310098A (en) Multi-view three-dimensional reconstruction method based on attention mechanism and variable convolution depth network
CN113011438B (en) Bimodal image significance detection method based on node classification and sparse graph learning
CN112669452B (en) Object positioning method based on convolutional neural network multi-branch structure
CN110516527B (en) Visual SLAM loop detection improvement method based on instance segmentation
CN117351078A (en) Target size and 6D gesture estimation method based on shape priori

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190521