CN107563388A - A convolutional neural network object recognition method based on depth information pre-segmentation - Google Patents

Publication number
CN107563388A
Authority
CN
China
Prior art keywords
layer
convolution
neural networks
image
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710838112.8A
Other languages
Chinese (zh)
Inventor
王晟
左东昊
谢丽萍
钱唯
刘正阳
方郅昊
高英淇
成奕霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Application filed by Northeastern University China
Priority to CN201710838112.8A
Publication of CN107563388A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The present invention relates to a convolutional neural network object recognition method based on depth information pre-segmentation, comprising the following steps. Step 1: acquire a depth image and a color image of a scene. Step 2: segment the depth image of the object from the depth image of the scene. Step 3: according to the segmented extent of the object's depth image, segment the color image of the object from the color image of the scene. Step 4: pad the segmented color image. Step 5: input the padded color image into a convolutional neural network for recognition, and output the recognition result. The method can recognize multiple objects in a complex picture, trains quickly, recognizes quickly, places low demands on hardware, and can reduce overfitting of the convolutional neural network.

Description

A convolutional neural network object recognition method based on depth information pre-segmentation
Technical field
The present invention relates to the technical field of image processing and object recognition, and in particular to a convolutional neural network object recognition method based on depth information pre-segmentation.
Background art
Machine interpretation of objects plays a critical role in applications such as robotics and artificial intelligence. Converting object images captured by sensors into information that humans can understand (such as text, sound, or images) is a focus of forward-looking research.
Existing object recognition methods mainly perform recognition on plain RGB images. Such methods may treat global features of the image as local features, so that overfitting occurs and the recognition rate of objects drops. A newer method feeds both the RGB image and the depth image of the object into a convolutional neural network for training and recognition. This method achieves a higher recognition rate than the first, but its amount of computation becomes excessive when training on and recognizing many objects.
To address these problems in current image recognition, a new algorithm is proposed that reduces the amount of computation while improving the recognition rate.
Summary of the invention
In view of the shortcomings of the prior art, the present invention provides an object recognition method based on depth information pre-segmentation and a convolutional neural network. The method can recognize multiple objects in a complex picture, recognizes quickly, and can reduce the occurrence of overfitting in the convolutional neural network.
The present invention provides a convolutional neural network object recognition method based on depth information pre-segmentation, comprising the following steps:
Step 1: acquire a depth image and a color image of the scene;
Step 2: segment the depth image of the object from the depth image of the scene;
Step 3: according to the segmented extent of the object's depth image, segment the color image of the object from the color image of the scene;
Step 4: pad and scale the segmented color image;
Step 5: input the padded color image into the convolutional neural network for recognition, and output the recognition result.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 1 is specifically:
The depth image and the color image of the same scene are acquired at the same moment; the image acquisition device may be a Kinect device or a multi-lens camera.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 2 is specifically:
Step 2.1: divide the depth image of the scene into foreground and background using the Otsu algorithm; the foreground represents the target object, whose depth lies within a certain range;
Step 2.2: using a seeded region growing algorithm, segment the depth image of the target object from the depth image of the scene.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 2.2 is specifically:
Step 2.2.1: randomly select five pixels within the depth range of the foreground as seed points;
Step 2.2.2: traverse the 8 pixels surrounding each seed point, and assimilate a pixel into the seed points when its gray-level change is less than 4;
Step 2.2.3: repeat step 2.2.2 until every pixel in the picture has been classified as a seed point or a non-seed point;
Step 2.2.4: segment out the image formed by the seed points to obtain the depth image of the object.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 3 is specifically:
The pixels of the color image and the depth image of the scene correspond one to one, so the segmentation position of the object's depth image can be mapped onto the color image of the scene, and the color image of the object can then be segmented out.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 4 includes:
Step 4.1: set the RGB color value of the padding region, and pad the segmented color image to an aspect ratio of 1:1;
Step 4.2: resize the padded color image to the specified size using a bilinear interpolation algorithm.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, the RGB color value of the padding region in step 4.1 can be set by any of the following methods:
A. average the RGB color values of the edge pixels of the object; this color is called the edge mean, and the RGB color value of the padding region is the inverse color of the edge mean;
B. set the RGB color value of the padding region to (0, 0, 0), i.e. pad with black;
C. set the RGB color value of the padding region to (255, 255, 255), i.e. pad with white.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, step 5 includes:
Step 5.1: build the convolutional neural network model;
Step 5.2: train the convolutional neural network model on a training set composed of acquired images of multiple kinds of objects;
Step 5.3: input the padded color image into the trained convolutional neural network model for recognition, and output the result.
In the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention, the convolutional neural network built in step 5.1 is a neural network comprising 20 hidden layers, specifically:
Layer 1 is convolutional layer conv1: 64 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 2 is convolutional layer conv2: 64 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 3 is pooling layer subsampling1, which uses max pooling;
Layer 4 is convolutional layer conv3: 128 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 5 is convolutional layer conv4: 128 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 6 is pooling layer subsampling2, which uses max pooling;
Layer 7 is convolutional layer conv5: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 8 is convolutional layer conv6: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 9 is convolutional layer conv7: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 10 is pooling layer subsampling3, which uses max pooling;
Layer 11 is convolutional layer conv8: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 12 is convolutional layer conv9: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 13 is convolutional layer conv10: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 14 is pooling layer subsampling4, which uses max pooling;
Layer 15 is convolutional layer conv11: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 16 is convolutional layer conv12: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 17 is convolutional layer conv13: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 18 is pooling layer subsampling5, which uses max pooling;
Layer 19 is the fully connected layer Fc, which uses average pooling to improve the training and prediction speed of the neural network;
Layer 20 is the classification layer Softmax: the feature vector output by the fully connected layer Fc is fed into the classification layer to determine the class label of the object; the probability of each class label is computed, and the label with the highest probability is output.
The convolutional neural network object recognition method based on depth information pre-segmentation of the present invention can recognize multiple objects in a complex picture, reduces the amount of training required, recognizes quickly, places low demands on hardware, and can reduce overfitting of the convolutional neural network.
Brief description of the drawings
Fig. 1 is a flowchart of the convolutional neural network object recognition method based on depth information pre-segmentation of the present invention;
Fig. 2 is a schematic structural diagram of the convolutional neural network of the present invention.
Detailed description
The present invention provides a convolutional neural network object recognition method based on depth information pre-segmentation. As shown in Fig. 1, the recognition method comprises the following steps:
Step 1: acquire a depth image and a color image of the scene;
Step 2: segment the depth image of the object from the depth image of the scene;
Step 3: according to the segmented extent of the object's depth image, segment the color image of the object from the color image of the scene;
Step 4: pad and scale the segmented color image;
Step 5: input the padded color image into the convolutional neural network for recognition, and output the recognition result.
Step 1 is specifically:
The depth image and the color image of the same scene are acquired at the same moment; the image acquisition device may be a Kinect device or a multi-lens camera.
The color image can be obtained with a camera. Methods of acquiring the depth image include, but are not limited to, the following: 1. Two cameras capture auxiliary images, and pattern matching then yields the two different coordinates of an object in the binocular cameras; the depth image is computed from geometric relationships combined with the distance between the two cameras. This method is inexpensive but has relatively low accuracy, and it has difficulty resolving objects more than five meters away. 2. A lidar scans while rotating at high speed and obtains the distances from surrounding objects to the sensor. This technology is widely used in driverless cars; for example, on December 10, 2015, Baidu's driverless car successfully completed an on-road test, and the Velodyne HDL-64E mounted on its body is a lidar of this kind. Such technology is very expensive, and a single device can cost hundreds of thousands of RMB. 3. Commercial devices such as Microsoft's Kinect obtain depth information by combining dual cameras with an infrared camera. This approach strikes a good balance between price and accuracy, but its algorithms are proprietary.
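The geometric relationship behind the binocular approach (item 1) is not spelled out in the text. A minimal sketch, assuming a rectified stereo pair and the standard pinhole relation Z = f * B / d, could look like the following; the function name and parameters are hypothetical and not taken from the source.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return depth in meters for a rectified stereo pair (assumed model).

    focal_px:     focal length in pixels
    baseline_m:   distance between the two cameras in meters
    disparity_px: horizontal offset of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, distant objects produce very small disparities, which is consistent with the remark that this approach struggles beyond a few meters.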
The depth information of the image is computed from data captured by two cameras, while the color information is captured at the position of one of the cameras, so the pixels of the two images do not correspond one to one. A correction algorithm is therefore needed to align the depth image with the color image. Taking Microsoft Kinect as an example, its accompanying software development kit provides functions for aligning the two images.
Step 2 is specifically:
Step 2.1: divide the depth image of the scene into foreground and background using the Otsu algorithm; the foreground represents the target object, whose depth lies within a certain range;
At this point the depth of the foreground lies within a certain range. For example, the Otsu algorithm segments the target object as the foreground, and the target object lies between 35 cm and 42 cm from the sensor; this range is called the foreground distance range. However, because other noise in the picture may also lie within the foreground distance range from the sensor, the gray value corresponding to 37 cm is used as the seed for seed growing.
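The Otsu split in step 2.1 can be sketched as follows. This is an illustrative plain-Python version, assuming the depth image has been flattened into a list of 8-bit gray values; the function name is hypothetical, and which class counts as foreground depends on how depth is encoded as gray.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form the low class, pixels > t the high class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the low class
    sum0 = 0    # gray-value sum of the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # low class still empty
        w1 = total - w0
        if w1 == 0:
            break               # high class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

A foreground mask then follows from a comparison such as `[p <= t for p in pixels]`, under the assumption that nearer objects have smaller gray values.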
Step 2.2: using a seeded region growing algorithm, segment the depth image of the target object from the depth image of the scene, specifically:
Step 2.2.1: randomly select five pixels within the depth range of the foreground as seed points;
Step 2.2.2: traverse the 8 pixels surrounding each seed point, and assimilate a pixel into the seed points when its gray-level change is less than 4;
Step 2.2.3: repeat step 2.2.2 until every pixel in the picture has been classified as a seed point or a non-seed point;
Step 2.2.4: segment out the image formed by the seed points to obtain the depth image of the object.
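Steps 2.2.1 to 2.2.4 can be sketched as a breadth-first region growing pass. The version below is a hedged illustration: seeds are passed in by the caller rather than drawn at random, and the gray-level change is measured against the pixel from which a neighbor was reached, which is an assumption because the text does not say which reference value is used.

```python
from collections import deque

def region_grow(depth, seeds, max_diff=4):
    """Grow a region from seed pixels over 8-connected neighbors.

    depth: 2D list of gray values; seeds: list of (row, col) tuples.
    A neighbor joins the region when its gray value differs from the
    pixel it was reached from by less than max_diff.
    Returns a 2D list of booleans (True = part of the region).
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r, c in seeds:
        if not mask[r][c]:
            mask[r][c] = True
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if not mask[nr][nc] and abs(depth[nr][nc] - depth[r][c]) < max_diff:
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask
```

The loop ends when no seed point has an unvisited neighbor that qualifies, which corresponds to every pixel being settled as seed or non-seed.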
During depth image segmentation, the Otsu algorithm can be applied multiple times to segment the picture, and a region growing algorithm is applied to each segmentation result to remove irrelevant information. For example, if two cups are placed on a table, the region growing algorithm can separate the two cups.
Step 3 is specifically:
The pixels of the color image and the depth image of the scene correspond one to one, so the segmentation position of the object's depth image can be mapped onto the color image of the scene, and the color image of the object can then be segmented out.
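As an illustration of step 3, the sketch below takes a boolean mask marking the object's pixels in the depth image and cuts the matching rectangle out of the aligned color image. Cropping to a rectangular bounding box is an assumption; the text only says that the segmentation position is mapped onto the color image. Function names are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the True pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask is empty")
    return rows[0], cols[0], rows[-1], cols[-1]

def crop_color(color, mask):
    """Crop a color image (2D list of (R, G, B)) to the mask's bounding box."""
    top, left, bottom, right = bounding_box(mask)
    return [row[left:right + 1] for row in color[top:bottom + 1]]
```

This only works when the two images are pixel-aligned, so that a (row, col) index means the same scene point in both.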
The images obtained after Otsu segmentation and seeded region growing differ in size and proportion, but the convolutional neural network used in the present invention requires all input pictures to have the same size: 224 pixels wide and 224 pixels high. Therefore the following two operations are performed on the image: padding and scaling.
Step 4, padding and scaling the segmented color image, specifically includes:
Step 4.1: set the RGB color value of the padding region, and pad the segmented color image to an aspect ratio of 1:1;
The RGB color value of the padding region in step 4.1 can be set by any of the following methods:
A. average the RGB color values of the edge pixels of the object; this color is called the edge mean, and the RGB color value of the padding region is the inverse color of the edge mean;
B. set the RGB color value of the padding region to (0, 0, 0), i.e. pad with black;
C. set the RGB color value of the padding region to (255, 255, 255), i.e. pad with white.
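Option A can be sketched as below. Two points are assumptions rather than statements from the text: the "edge pixels" are taken to be the outermost rows and columns of the cropped image, and the inverse color is taken to be 255 minus the mean in each 8-bit channel. The function name is hypothetical.

```python
def inverse_edge_mean(image):
    """Return the inverse color of the mean of the border pixels.

    image: 2D list of (R, G, B) tuples with 8-bit channels.
    """
    h, w = len(image), len(image[0])
    # Collect the outermost rows and columns (assumed meaning of "edge pixels").
    border = [image[r][c] for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    mean = [sum(p[ch] for p in border) / len(border) for ch in range(3)]
    # Assumed inversion: complement of each channel in the 0..255 range.
    return tuple(255 - round(m) for m in mean)
```

A fill color chosen this way contrasts with the object's border, whereas options B and C are fixed constants.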
When padding, let the width and height of the picture be (w, h). If w > h, the picture is padded to (w, w); if w < h, the picture is padded to (h, h).
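The padding rule above can be sketched as follows. Centering the original image inside the square is an assumption; the text fixes only the final size and the fill color. The function name is hypothetical.

```python
def pad_to_square(image, fill=(0, 0, 0)):
    """Pad a 2D list of (R, G, B) pixels to a square of side max(w, h).

    The original image is centered (assumed); new pixels take the fill color.
    """
    h, w = len(image), len(image[0])
    side = max(w, h)
    top = (side - h) // 2
    left = (side - w) // 2
    out = [[fill] * side for _ in range(side)]
    for r in range(h):
        for c in range(w):
            out[top + r][left + c] = image[r][c]
    return out
```

Passing `fill=(255, 255, 255)` gives the white variant, and any other RGB tuple can be supplied in the same way.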
Step 4.2: resize the padded color image to the specified size using a bilinear interpolation algorithm.
In this embodiment, the padded image of width and height (x, x) is scaled to (224, 224).
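A minimal plain-Python sketch of bilinear interpolation for step 4.2 is given below. It maps output pixel centers back into the input image, which is one common convention and an assumption here; an image library would normally be used in practice. The function name is hypothetical.

```python
def resize_bilinear(image, out_w, out_h):
    """Resize a 2D list of (R, G, B) pixels with bilinear interpolation."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for y in range(out_h):
        # Map the output pixel center into input coordinates and clamp.
        fy = min(max((y + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = min(max((x + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            pixel = []
            for ch in range(3):
                # Interpolate along x on both rows, then along y.
                top = image[y0][x0][ch] * (1 - wx) + image[y0][x1][ch] * wx
                bottom = image[y1][x0][ch] * (1 - wx) + image[y1][x1][ch] * wx
                pixel.append(round(top * (1 - wy) + bottom * wy))
            row.append(tuple(pixel))
        out.append(row)
    return out
```

Calling `resize_bilinear(padded, 224, 224)` on a square padded image yields the 224 by 224 input size required by the network.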
After image processing is complete, the pixels of the image are input into the convolutional neural network row by row, in order from left to right and from top to bottom, for image recognition.
Step 5 includes:
Step 5.1: build the convolutional neural network model;
In a specific implementation, the convolutional neural network that is built is a neural network comprising 20 hidden layers, specifically:
Layer 1 is convolutional layer conv1: 64 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 2 is convolutional layer conv2: 64 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 3 is pooling layer subsampling1, which uses max pooling;
Layer 4 is convolutional layer conv3: 128 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 5 is convolutional layer conv4: 128 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 6 is pooling layer subsampling2, which uses max pooling;
Layer 7 is convolutional layer conv5: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 8 is convolutional layer conv6: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 9 is convolutional layer conv7: 256 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 10 is pooling layer subsampling3, which uses max pooling;
Layer 11 is convolutional layer conv8: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 12 is convolutional layer conv9: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 13 is convolutional layer conv10: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 14 is pooling layer subsampling4, which uses max pooling;
Layer 15 is convolutional layer conv11: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 16 is convolutional layer conv12: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 17 is convolutional layer conv13: 512 filters of size 3*3 perform convolution with a stride of 1 pixel, edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer with ReLU as the activation function;
Layer 18 is pooling layer subsampling5, which uses max pooling;
Layer 19 is the fully connected layer Fc, which uses average pooling to improve the training and prediction speed of the neural network;
Layer 20 is the classification layer Softmax: the feature vector output by the fully connected layer Fc is fed into the classification layer to determine the class label of the object; the probability of each class label is computed, and the label with the highest probability is output.
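To make the layer list easier to check, the sketch below traces the feature-map size through the 13 convolutional and 5 pooling layers for a 224 by 224 RGB input. The padding width (1 pixel, which preserves size for a 3*3 filter at stride 1) and the pooling window (2*2 with stride 2) are assumptions, since the text does not state them; the names are hypothetical.

```python
# (name, kind, output channels) in the order listed above.
LAYERS = [
    ("conv1", "conv", 64), ("conv2", "conv", 64), ("subsampling1", "pool", None),
    ("conv3", "conv", 128), ("conv4", "conv", 128), ("subsampling2", "pool", None),
    ("conv5", "conv", 256), ("conv6", "conv", 256), ("conv7", "conv", 256),
    ("subsampling3", "pool", None),
    ("conv8", "conv", 512), ("conv9", "conv", 512), ("conv10", "conv", 512),
    ("subsampling4", "pool", None),
    ("conv11", "conv", 512), ("conv12", "conv", 512), ("conv13", "conv", 512),
    ("subsampling5", "pool", None),
]

def trace_shapes(size=224, channels=3, kernel=3, stride=1, pad=1, pool=2):
    """Return [(name, side, channels)] after each conv or pool layer."""
    shapes = []
    for name, kind, out_ch in LAYERS:
        if kind == "conv":
            # Standard output-size formula for a padded convolution.
            size = (size + 2 * pad - kernel) // stride + 1
            channels = out_ch
        else:
            # Assumed non-overlapping pooling window.
            size //= pool
        shapes.append((name, size, channels))
    return shapes
```

Under these assumptions the spatial side halves at each pooling layer (224, 112, 56, 28, 14, 7), so the last pooling layer hands a 7 by 7 map with 512 channels to the Fc layer.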
Step 5.2: train the convolutional neural network model on a training set composed of acquired images of multiple kinds of objects;
Step 5.3: input the padded color image into the trained convolutional neural network model for recognition, and output the result.
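The final classification described for the Softmax layer, computing a probability for each class label and outputting the most probable one, can be sketched as follows. The score values and label names in the usage note are hypothetical placeholders, not data from the source.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return (label, probability) of the most probable class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For instance, `classify([1.0, 3.0, 2.0], ["cup", "book", "phone"])` selects the second label because it has the largest score.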
The present invention uses the complete convolutional layer weights pre-trained on ImageNet, thereby improving generalization performance.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the idea of the present invention. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (9)

1. A convolutional neural network object recognition method based on depth information pre-segmentation, characterized by comprising the following steps:
Step 1: acquire a depth image and a color image of the scene;
Step 2: segment the depth image of the object from the depth image of the scene;
Step 3: according to the segmented extent of the object's depth image, segment the color image of the object from the color image of the scene;
Step 4: pad and scale the segmented color image;
Step 5: input the padded color image into the convolutional neural network for recognition, and output the recognition result.
2. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 1, characterized in that step 1 is specifically:
The depth image and the color image of the same scene are acquired at the same moment; the image acquisition device may be a Kinect device or a multi-lens camera.
3. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 1, characterized in that step 2 is specifically:
Step 2.1: divide the depth image of the scene into foreground and background using the Otsu algorithm; the foreground represents the target object, whose depth lies within a certain range;
Step 2.2: using a seeded region growing algorithm, segment the depth image of the target object from the depth image of the scene.
4. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 3, characterized in that step 2.2 is specifically:
Step 2.2.1: randomly select five pixels within the depth range of the foreground as seed points;
Step 2.2.2: traverse the 8 pixels surrounding each seed point, and assimilate a pixel into the seed points when its gray-level change is less than 4;
Step 2.2.3: repeat step 2.2.2 until every pixel in the picture has been classified as a seed point or a non-seed point;
Step 2.2.4: segment out the image formed by the seed points to obtain the depth image of the object.
5. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 1, characterized in that step 3 is specifically:
The pixels of the color image and the depth image of the scene correspond one to one, so the segmentation position of the object's depth image can be mapped onto the color image of the scene, and the color image of the object can then be segmented out.
6. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 1, characterized in that step 4 includes:
Step 4.1: set the RGB color value of the padding region, and pad the segmented color image to an aspect ratio of 1:1;
Step 4.2: resize the padded color image to the specified size using a bilinear interpolation algorithm.
7. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 6, characterized in that the RGB color value of the padding region in step 4.1 can be set by any of the following methods:
A. average the RGB color values of the edge pixels of the object; this color is called the edge mean, and the RGB color value of the padding region is the inverse color of the edge mean;
B. set the RGB color value of the padding region to (0, 0, 0), i.e. pad with black;
C. set the RGB color value of the padding region to (255, 255, 255), i.e. pad with white.
8. The convolutional neural network object recognition method based on depth information pre-segmentation according to claim 1, characterized in that step 5 includes:
Step 5.1: build the convolutional neural network model;
Step 5.2: train the convolutional neural network model on a training set composed of acquired images of multiple kinds of objects;
Step 5.3: input the padded color image into the trained convolutional neural network model for recognition, and output the result.
9. The convolutional neural network object recognition method based on depth information pre-segmentation as claimed in claim 8, characterized in that the convolutional neural network built in step 5.1 is a neural network comprising 20 hidden layers, specifically:
The first layer is convolutional layer conv1, which has 64 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The second layer is convolutional layer conv2, which has 64 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The third layer is pooling layer subsampling1, which uses max pooling;
The fourth layer is convolutional layer conv3, which has 128 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The fifth layer is convolutional layer conv4, which has 128 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The sixth layer is pooling layer subsampling2, which uses max pooling;
The seventh layer is convolutional layer conv5, which has 256 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The eighth layer is convolutional layer conv6, which has 256 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The ninth layer is convolutional layer conv7, which has 256 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The tenth layer is pooling layer subsampling3, which uses max pooling;
The eleventh layer is convolutional layer conv8, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The twelfth layer is convolutional layer conv9, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The thirteenth layer is convolutional layer conv10, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The fourteenth layer is pooling layer subsampling4, which uses max pooling;
The fifteenth layer is convolutional layer conv11, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The sixteenth layer is convolutional layer conv12, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The seventeenth layer is convolutional layer conv13, which has 512 3*3 filters and performs convolution with a stride of 1 pixel; edge padding is applied before the convolution, and after the convolution the output passes through a nonlinear activation layer that uses the ReLU function as the activation function;
The eighteenth layer is pooling layer subsampling5, which uses max pooling;
The nineteenth layer is the fully connected layer Fc, which uses average pooling in order to improve the training and prediction speed of the neural network;
The twentieth layer is the classification layer Softmax: the feature vector output by the fully connected layer Fc is input to the classification layer, which calculates the probability of each classification label of the object to be identified and outputs the label with the highest probability.
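The layer stack in claim 9 can be checked with a short shape trace. This is only a sketch: the 224 x 224 input size, the padding of 1 pixel, and the 2 x 2 pooling window with stride 2 are assumptions that the claim does not state, and the names `BLOCKS`, `build_layers`, `trace_shapes` and `softmax` are hypothetical.

```python
import math

# (number of conv layers, filters per conv layer) for each block that ends in a pooling layer
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def build_layers():
    """Return layers 1-18 of claim 9 as (name, kind, out_channels) tuples."""
    layers, conv_idx = [], 0
    for pool_idx, (n_convs, channels) in enumerate(BLOCKS, start=1):
        for _ in range(n_convs):
            conv_idx += 1
            layers.append((f"conv{conv_idx}", "conv", channels))
        layers.append((f"subsampling{pool_idx}", "pool", None))
    return layers


def trace_shapes(size=224, in_channels=3):
    """Return (name, channels, height, width) after each layer.

    Assumptions: 3*3 convolution with stride 1 and padding 1; 2x2 max pooling with stride 2.
    """
    shapes, channels = [], in_channels
    for name, kind, out_channels in build_layers():
        if kind == "conv":
            size = (size + 2 * 1 - 3) // 1 + 1  # padding 1 keeps the spatial size
            channels = out_channels
        else:
            size //= 2                          # pooling halves the spatial size
        shapes.append((name, channels, size, size))
    return shapes


def softmax(scores):
    """Convert class scores to probabilities, as in the final classification layer."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions the feature map after subsampling5 has 512 channels, and averaging each channel over its spatial positions yields a 512-element vector for the classification layer. The predicted label is then the index of the largest value returned by `softmax`.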
CN201710838112.8A 2017-09-18 2017-09-18 A kind of convolutional neural networks object identification method based on depth information pre-segmentation Pending CN107563388A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710838112.8A CN107563388A (en) 2017-09-18 2017-09-18 A kind of convolutional neural networks object identification method based on depth information pre-segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710838112.8A CN107563388A (en) 2017-09-18 2017-09-18 A kind of convolutional neural networks object identification method based on depth information pre-segmentation

Publications (1)

Publication Number Publication Date
CN107563388A (en) 2018-01-09

Family

ID=60980169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710838112.8A Pending CN107563388A (en) 2017-09-18 2017-09-18 A kind of convolutional neural networks object identification method based on depth information pre-segmentation

Country Status (1)

Country Link
CN (1) CN107563388A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104217214A (en) * 2014-08-21 2014-12-17 广东顺德中山大学卡内基梅隆大学国际联合研究院 Configurable convolutional neural network based red green blue-distance (RGB-D) figure behavior identification method
CN106127164A (en) * 2016-06-29 2016-11-16 北京智芯原动科技有限公司 The pedestrian detection method with convolutional neural networks and device is detected based on significance
US9542626B2 (en) * 2013-09-06 2017-01-10 Toyota Jidosha Kabushiki Kaisha Augmenting layer-based object detection with deep convolutional neural networks
CN106384353A (en) * 2016-09-12 2017-02-08 佛山市南海区广工大数控装备协同创新研究院 Target positioning method based on RGBD

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059597A (en) * 2019-04-04 2019-07-26 南京理工大学 Scene recognition method based on depth camera
CN109917419B (en) * 2019-04-12 2021-04-13 中山大学 Depth filling dense system and method based on laser radar and image
CN109917419A (en) * 2019-04-12 2019-06-21 中山大学 A kind of depth fill-in congestion system and method based on laser radar and image
CN110084828A (en) * 2019-04-29 2019-08-02 北京华捷艾米科技有限公司 A kind of image partition method, device and terminal device
CN110136144A (en) * 2019-05-15 2019-08-16 北京华捷艾米科技有限公司 A kind of image partition method, device and terminal device
CN110232326A (en) * 2019-05-20 2019-09-13 平安科技(深圳)有限公司 A kind of D object recognition method, device and storage medium
CN110232326B (en) * 2019-05-20 2024-05-31 平安科技(深圳)有限公司 Three-dimensional object recognition method, device and storage medium
CN111742344A (en) * 2019-06-28 2020-10-02 深圳市大疆创新科技有限公司 Image semantic segmentation method, movable platform and storage medium
WO2020258297A1 (en) * 2019-06-28 2020-12-30 深圳市大疆创新科技有限公司 Image semantic segmentation method, movable platform, and storage medium
CN110378276A (en) * 2019-07-16 2019-10-25 顺丰科技有限公司 Vehicle-state acquisition methods, device, equipment and storage medium
CN110378276B (en) * 2019-07-16 2021-11-30 顺丰科技有限公司 Vehicle state acquisition method, device, equipment and storage medium
CN111272753A (en) * 2019-09-06 2020-06-12 山东大学 Tunnel inner wall rock quartz content prediction system and method based on image recognition and analysis
CN111468430A (en) * 2020-04-17 2020-07-31 无锡雪浪数制科技有限公司 Depth vision-based coal gangue separation method
CN111667493A (en) * 2020-05-27 2020-09-15 华中科技大学 Orchard fruit tree region segmentation method and system based on deformable convolutional neural network
CN112183185A (en) * 2020-08-13 2021-01-05 天津大学 Liquid leakage detection method based on optical flow method and CNN-SVM
CN113052124A (en) * 2021-04-09 2021-06-29 济南博观智能科技有限公司 Identification method and device for fogging scene and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180109