CN109784145B - Target detection method based on depth map and storage medium - Google Patents

Target detection method based on depth map and storage medium

Info

Publication number
CN109784145B
CN109784145B
Authority
CN
China
Prior art keywords
candidate frame
depth
candidate
frame
size
Prior art date
Legal status
Active
Application number
CN201811480757.XA
Other languages
Chinese (zh)
Other versions
CN109784145A (en
Inventor
彭博文
王行
李骊
周晓军
盛赞
李朔
杨淼
Current Assignee
Beijing HJIMI Technology Co Ltd
Original Assignee
Beijing HJIMI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing HJIMI Technology Co Ltd filed Critical Beijing HJIMI Technology Co Ltd
Priority to CN201811480757.XA priority Critical patent/CN109784145B/en
Publication of CN109784145A publication Critical patent/CN109784145A/en
Application granted granted Critical
Publication of CN109784145B publication Critical patent/CN109784145B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

A specific target detection method based on a depth image, and a storage medium. The method defines the size of a real candidate frame from the size of the target object to be detected and calculates the window traversal interval of the candidate frame; traverses the depth image to obtain the center-point pixel coordinates of the candidate frames; obtains the depth value of the center point of each candidate frame and screens out the effective candidate frames; calculates the actual required frame length of the remaining effective candidate frames; and sets a filtering threshold to filter out points whose depth differs too much from the center-point depth, before further deep-learning preprocessing and deep learning. The method can increase the step length of image traversal as much as possible, filter out some invalid candidate frames, and calculate the side length of each candidate frame from the center-point depth value and the real size of the target object, thereby avoiding generating multi-scale candidate frames at the same position, saving a large amount of computation, and providing good convenience for rapid target detection.

Description

Target detection method based on depth map and storage medium
Technical Field
The invention relates to the field of image detection, in particular to a target detection method based on a depth map, which can greatly reduce the initial number of candidate frames on the depth map, thereby greatly reducing the amount of model computation and improving detection efficiency.
Background
Image classification, detection and segmentation are the three major tasks in the field of computer vision. At present, the main images used for target detection are RGB images; with the development of structured-light and TOF technologies, depth images are gradually becoming a new data source.
In recent years, with the rapid development of deep learning technology, the speed and accuracy of detecting specific targets in images have greatly improved, although real-time detection on video images is still far from being achieved. Compared with traditional target detection schemes, deep-learning-based target detection has higher precision and better adaptability. Commonly used image target detection algorithms adopt the following schemes:
At present, mainstream target detection algorithms based on deep learning models can be divided into two categories: (1) two-stage detection algorithms, which divide the detection problem into two stages, first generating candidate regions (region proposals) and then classifying the candidate regions (generally with position refinement as well); typical representatives are the region-proposal-based R-CNN algorithms, such as R-CNN, Fast R-CNN and the like; (2) one-stage detection algorithms, which do not require a region-proposal stage and directly generate the class probability and position coordinates of an object; typical algorithms include YOLO and SSD.
These methods rely entirely on random candidate-frame extraction and have the following defects:
(1) the candidate frames have high randomness and are not selected in a targeted manner;
(2) the size of the candidate frame needs to be selected in a multi-scale mode, and the number of the candidate frames is greatly increased;
(3) there may be a large amount of overlap between candidate frames, resulting in an increased number of candidate frames;
(4) the initial convolution image is too large, and the operation amount is large.
Therefore, how to reduce the number of candidate frames in depth-image recognition and improve the overlap ratio between each candidate frame and the target within it, so that the amount of computation in the subsequent convolution algorithm is reduced and objects can be detected quickly, has become a technical problem to be solved in existing image recognition technology.
Disclosure of Invention
The invention aims to provide a depth map-based specific target detection method, which utilizes the real size information of a target object and can greatly reduce the initial number of candidate frames on a depth image, thereby greatly reducing the number of model calculation and improving the detection efficiency.
In order to achieve the purpose, the invention adopts the following technical scheme:
a specific target detection method based on a depth image comprises the following steps:
Step S110 of calculating the traversal step: defining the size of a real candidate frame according to the size L of the target object to be detected, and calculating the candidate-frame window traversal interval Stride_G (unit: pixels) with formula (1):
Stride_G = 0.5 * L * f_xy / D    formula (1)
wherein L is the size of the target object to be measured, f_xy is the principal distance of the depth sensor (unit: pixels), and D is the farthest distance of the target object to be detected;
Image traversal step S120: according to the size of the real candidate frame and the candidate-frame window traversal interval Stride_G, traversing the depth image and acquiring the pixel coordinates of the center points of all candidate frames;
Valid candidate frame screening step S130: obtaining the depth value of the center point of each candidate frame and comparing it with the farthest distance D of the target object to be detected; a candidate frame whose center-point depth is smaller than the farthest distance D is an effective candidate frame, otherwise it is an invalid candidate frame;
Actual required frame length calculation step S140 for the valid candidate frames: calculating the actual required frame length L_pixel of the remaining effective candidate frames with formula (3), using the actual depth value d of the center point of each candidate frame:
L_pixel = L * f_xy / d    formula (3);
Filtering step S150: according to the calculated actual required frame length L_pixel of the effective candidate frame, setting a filtering threshold and filtering out points in the effective candidate frame whose depth differs too much from the center-point depth. Points whose depth difference from the center point exceeds the filtering threshold are thereby filtered out, so the foreground and background points in the candidate frame can be removed.
Optionally, in step S110, the size of the real candidate frame is 1.25 to 1.75 times the real size of the target object to be measured.
Optionally, in step S110, the size of the real candidate frame is 1.5 times the real size of the target object to be measured.
Optionally, the filtering threshold is set at 1/3 to 2/3 of the actual required frame length L_pixel.
Optionally, the filtering threshold is half of the actual required frame length L_pixel.
Optionally, the method further includes a deep learning preprocessing step S160: sampling the depth image to a fixed resolution and then normalizing the pixel values.
Optionally, the deep learning after the deep learning preprocessing step S160 trains a multi-output CNN model that takes a single-channel image as input and outputs a new center-point coordinate and a binary classification result.
The present invention further discloses a storage medium capable of being used to store computer-executable instructions, characterized by: the computer-executable instructions, when executed by a processor, perform the depth map-based object detection method described above.
According to the method, the target scale information of the depth map is introduced into the step-length setting for candidate-frame extraction, so the image-traversal step can be made as large as possible; meanwhile, screening on the center-point depth value filters out some invalid candidate frames; and the side length of each candidate frame is calculated from the center-point depth value and the real size of the target object, which avoids generating multi-scale candidate frames at the same position, saves a large amount of computation, and provides good convenience for rapid target detection.
Drawings
FIG. 1 is a flow diagram of a depth map based specific target detector in accordance with a specific embodiment of the present invention;
fig. 2 is a schematic diagram of the center points of the candidate boxes after traversing the depth map in the embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
The invention relates to a specific target detection method based on a depth image, where the processed object is a depth image. First, the maximum window size of the target is calculated, according to the size of the target object and the defined farthest distance of the target to be detected in the depth image, and used as the window interval, i.e., the traversal step length. Second, the depth image is traversed at this window interval to generate candidate frames, and whether each candidate frame is effective is judged from the depth of its center point. Then the actual required frame length of the effective candidate frames is calculated and used to filter out part of the foreground and background points, so as to reduce the number of candidate frames passed to the next deep-learning stage and reduce interference during learning.
Further, referring to fig. 1, a flow chart of a specific target detector based on a depth map is shown, which includes the following steps:
Step S110 of calculating the traversal step: defining the size of a real candidate frame according to the size L of the target object to be detected, and calculating the candidate-frame window traversal interval Stride_G (unit: pixels) with formula (1):
Stride_G = 0.5 * L * f_xy / D    formula (1)
wherein L is the size of the target object to be measured, f_xy is the principal distance of the depth sensor (unit: pixels), and D is the farthest distance of the target object to be detected.
Optionally, the size of the real candidate frame may be 1.25 to 1.75 times the real size of the target object to be measured, and further preferably, the size of the real candidate frame may be 1.5 times the real size of the target object to be measured.
For example, assuming that the size of the target object to be measured is 170mm, the space size of the candidate frame may be defined as 255 mm.
In step S110, the target scale information of the depth map is introduced into the step-length setting for candidate-frame extraction, so the step size of image traversal can be increased as much as possible.
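As a concrete illustration of formula (1), the traversal step can be computed as below. The principal distance (570 px) and the farthest distance (2000 mm) are assumed example values for a typical depth sensor, not figures from the patent.

```python
# Sketch of the traversal-step computation in step S110 (formula (1)).
# Stride_G = 0.5 * L * f_xy / D, i.e. half the smallest pixel width the
# object can occupy, which it has at the farthest working distance D.

def traversal_stride(obj_size_mm: float, focal_px: float, max_dist_mm: float) -> int:
    """Candidate-frame window traversal interval in whole pixels."""
    return int(0.5 * obj_size_mm * focal_px / max_dist_mm)

# Example: a 170 mm object, a 570 px principal distance, 2000 mm max range.
stride = traversal_stride(170.0, 570.0, 2000.0)
print(stride)  # 24
```

Because the stride grows with the object's real size and shrinks with the working range, a larger target or a shorter range directly yields a sparser traversal grid.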
Image traversal step S120: according to the size of the real candidate frame and the candidate-frame window traversal interval Stride_G, traversing the depth image and acquiring the pixel coordinates of the center points of all candidate frames.
Referring to fig. 2, a schematic diagram of the center point of each candidate box after traversing the depth map is shown.
Valid candidate frame screening step S130: obtaining the depth value of the center point of each candidate frame and comparing it with the farthest distance D of the target object to be detected; a candidate frame whose center-point depth is smaller than the farthest distance D is an effective candidate frame, otherwise it is an invalid candidate frame.
That is, the validity of a candidate frame is determined by formula (2):
P = 1 if 0 < d < D, P = 0 otherwise    formula (2)
where P represents whether the candidate frame is valid and d represents the actual depth value of the center point of the candidate frame; when d is larger than 0 and smaller than the farthest distance D, the candidate frame is valid, otherwise it is invalid.
Therefore, some invalid candidate frames can be filtered out through step S130.
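Steps S120 and S130 can be sketched in a few lines of plain Python. The toy depth map, the distances, and the half-stride grid offset are illustrative assumptions; the patent does not specify where the first window center is placed.

```python
# Sketch of steps S120-S130: traverse the depth map on a stride-spaced
# grid, then keep only candidate centers whose depth satisfies formula (2).

def candidate_centers(depth, stride):
    """(row, col) center coordinates of a stride-spaced grid (step S120)."""
    h, w = len(depth), len(depth[0])
    return [(r, c) for r in range(stride // 2, h, stride)
                   for c in range(stride // 2, w, stride)]

def valid_centers(depth, stride, max_dist):
    """Screen with formula (2): a center is valid iff 0 < d < D (step S130)."""
    return [(r, c) for r, c in candidate_centers(depth, stride)
            if 0 < depth[r][c] < max_dist]

# 8x8 toy depth map (mm); the top half has no valid depth readings (d = 0).
depth = [[0.0] * 8 for _ in range(4)] + [[1500.0] * 8 for _ in range(4)]
centers = valid_centers(depth, 4, 2000.0)
print(centers)  # [(6, 2), (6, 6)] -- only the lower-half centers survive
```

Centers with a zero depth reading (sensor holes) or a depth beyond D are discarded before any model is run, which is where the initial candidate-frame reduction comes from.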
Actual required frame length calculation step S140 for the valid candidate frames: calculating the actual required frame length L_pixel of the remaining effective candidate frames with formula (3), using the actual depth value d of the center point of each candidate frame:
L_pixel = L * f_xy / d    formula (3)
In this step, the side length of the candidate frame is calculated according to the depth value of the central point and the real size of the target object, so that multi-scale candidate frames need not be generated at the same position, saving a large amount of computation.
Filtering step S150: according to the calculated actual required frame length L_pixel of the effective candidate frame, setting a filtering threshold and filtering out points in the effective candidate frame whose depth differs too much from the center-point depth. Points whose depth difference from the center point exceeds the filtering threshold are thereby filtered out, so the foreground and background points in the candidate frame can be removed.
In an alternative embodiment, the filtering threshold is set at 1/3 to 2/3 of the actual required frame length L_pixel.
Further optionally, the filtering threshold is half of the actual required frame length L_pixel.
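Steps S140 and S150 can be sketched as follows. The focal length and depth values are illustrative, and applying the "half of L_pixel" threshold directly to depth differences is an assumption on our part: the patent defines the threshold as a fraction of L_pixel but does not spell out the unit conversion between pixel lengths and depth units.

```python
# Sketch of steps S140-S150: per-frame side length via formula (3), then
# foreground/background filtering against the center-point depth.
# All numeric values are illustrative, not taken from the patent.

def frame_length_px(obj_size_mm: float, focal_px: float, center_depth_mm: float) -> float:
    """L_pixel = L * f_xy / d (formula (3))."""
    return obj_size_mm * focal_px / center_depth_mm

def filter_patch(patch, center_depth, threshold):
    """Zero out points whose depth differs from the center-point depth by
    more than the threshold (step S150)."""
    return [[v if abs(v - center_depth) <= threshold else 0.0 for v in row]
            for row in patch]

lp = frame_length_px(170.0, 570.0, 1500.0)    # 170 mm object seen at 1.5 m -> 64.6 px
threshold = 0.5 * lp                          # preferred "half of L_pixel" threshold
patch = [[1500.0, 1520.0], [1470.0, 1200.0]]  # toy 2x2 depth patch, center depth 1500 mm
cleaned = filter_patch(patch, 1500.0, threshold)
```

Note how the side length adapts per frame: the same 170 mm object measured at 3 m would get half the pixel side length, so no multi-scale frames are needed at one position.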
Therefore, the extraction of target detection candidate frames using the real size on the depth map, and the processing of part of the foreground and background points, are completed through steps S110 to S150, so that the target can be found quickly in the subsequent deep learning and related interference is reduced.
Deep learning preprocessing step S160: the depth image is sampled to a fixed resolution and then the pixel values are normalized.
In an alternative embodiment, a bilinear interpolation or nearest-neighbor interpolation algorithm may be used to sample to a fixed resolution, e.g., 64 x 64 or 128 x 128, and the pixel values are then normalized to 0-1 based on the cube side length.
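A minimal sketch of the preprocessing in step S160, using the nearest-neighbor variant. The interpretation of the normalization divisor (the "cube side length" in the text) as the candidate frame's depth extent is our assumption; the patch values are illustrative.

```python
# Sketch of step S160: nearest-neighbor resampling of a depth patch to a
# fixed resolution, then normalization of the values to [0, 1].

def resize_nearest(patch, out_h, out_w):
    """Resample a 2D patch to (out_h, out_w) by nearest-neighbor lookup."""
    h, w = len(patch), len(patch[0])
    return [[patch[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def normalize(patch, divisor):
    """Scale values by the divisor and clamp into [0, 1]."""
    return [[min(max(v / divisor, 0.0), 1.0) for v in row] for row in patch]

patch = [[0.0, 255.0], [255.0, 0.0]]   # toy 2x2 depth patch
resized = resize_nearest(patch, 4, 4)  # 2x2 -> 4x4, cf. the 64x64 / 128x128 targets
normed = normalize(resized, 255.0)     # divisor stands in for the cube side length
```

In practice a library resizer (e.g. an OpenCV bilinear resize) would replace `resize_nearest`; the sketch only shows that every candidate frame reaches the CNN at one fixed resolution with values in [0, 1].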
The subsequent deep learning may be to train a multi-output CNN model that takes a single-channel image as input and outputs a new center-point coordinate (regressor) and a binary classification result (classifier).
The present invention further discloses a storage medium capable of being used to store computer-executable instructions, characterized by:
the computer-executable instructions, when executed by a processor, perform the depth map-based object detection method described above.
Therefore, the invention introduces the target scale information of the depth map into the step-length setting for candidate-frame extraction, so the step length of image traversal can be increased as much as possible; meanwhile, screening on the center-point depth value filters out some invalid candidate frames; and the side length of each candidate frame is calculated from the center-point depth value and the real size of the target object, which avoids generating multi-scale candidate frames at the same position and saves a large amount of computation. Because the number of candidate frames is small and the interval between them is half the target size, the target may not lie in the central area of a candidate frame; the CNN regressor can locate the central position of the target object, and the classifier judges whether it is the target to be detected, thereby providing good convenience for rapid target detection.
While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A specific target detection method based on a depth image comprises the following steps:
Step S110 of calculating the traversal step: defining the size of a real candidate frame according to the size L of the target object to be detected, and calculating the candidate-frame window traversal interval Stride_G (unit: pixels) with formula (1):
Stride_G = 0.5 * L * f_xy / D    formula (1)
wherein L is the size of the target object to be measured, f_xy is the principal distance of the depth sensor (unit: pixels), and D is the farthest distance of the target object to be detected;
Image traversal step S120: according to the size of the real candidate frame and the candidate-frame window traversal interval Stride_G, traversing the depth image and acquiring the pixel coordinates of the center points of all candidate frames;
Valid candidate frame screening step S130: obtaining the depth value of the center point of each candidate frame and comparing it with the farthest distance D of the target object to be detected; a candidate frame whose center-point depth is smaller than the farthest distance D is an effective candidate frame, otherwise it is an invalid candidate frame;
Actual required frame length calculation step S140 for the valid candidate frames: calculating the actual required frame length L_pixel of the remaining effective candidate frames with formula (3), using the actual depth value d of the center point of each candidate frame:
L_pixel = L * f_xy / d    formula (3);
Filtering step S150: according to the calculated actual required frame length L_pixel of the valid candidate frame, setting a filtering threshold and filtering out points in the effective candidate frame whose depth differs too much from the center-point depth, so that points whose depth difference from the center point exceeds the filtering threshold are filtered out, and the foreground and background points in the candidate frame can be removed.
2. The specific object detection method according to claim 1, characterized in that:
in step S110, the size of the real candidate frame is 1.25 to 1.75 times the real size of the target object to be measured.
3. The specific object detection method according to claim 2, characterized in that:
in step S110, the size of the real candidate frame is 1.5 times the real size of the target object to be measured.
4. The specific object detection method according to claim 1, characterized in that:
the filtering threshold is set at the actual required frame length Lpixel1/3-2/3.
5. The specific object detection method according to claim 4, characterized in that:
the filtering threshold is the actual required frame length LpixelHalf of that.
6. The specific object detection method according to claim 1, characterized in that:
further comprising, a deep learning preprocessing step S160: the depth image is sampled to a fixed resolution and then the pixel values are normalized.
7. The specific object detection method according to claim 6, characterized in that:
the deep learning after the deep learning preprocessing step S160 is to train a multi-output CNN model, input a single-channel image, and output a new centroid coordinate and a binary result.
8. A storage medium capable of being used to store computer-executable instructions, characterized by:
the computer-executable instructions, when executed by a processor, perform the depth map based object detection method of any one of claims 1-7.
CN201811480757.XA 2018-12-05 2018-12-05 Target detection method based on depth map and storage medium Active CN109784145B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811480757.XA CN109784145B (en) 2018-12-05 2018-12-05 Target detection method based on depth map and storage medium


Publications (2)

Publication Number Publication Date
CN109784145A CN109784145A (en) 2019-05-21
CN109784145B true CN109784145B (en) 2021-03-16

Family

ID=66496635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811480757.XA Active CN109784145B (en) 2018-12-05 2018-12-05 Target detection method based on depth map and storage medium

Country Status (1)

Country Link
CN (1) CN109784145B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127514B (en) * 2019-12-13 2024-03-22 华南智能机器人创新研究院 Method and device for tracking target by robot
CN111738995B (en) * 2020-06-10 2023-04-14 苏宁云计算有限公司 RGBD image-based target detection method and device and computer equipment
CN112070759B (en) * 2020-09-16 2023-10-24 浙江光珀智能科技有限公司 Fork truck tray detection and positioning method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100707206B1 (en) * 2005-04-11 2007-04-13 삼성전자주식회사 Depth Image-based Representation method for 3D objects, Modeling method and apparatus using it, and Rendering method and apparatus using the same
CN105930793B (en) * 2016-04-19 2019-04-16 中山大学 A kind of human body detecting method based on the study of SAE feature visualization
CN106778835B (en) * 2016-11-29 2020-03-24 武汉大学 Remote sensing image airport target identification method fusing scene information and depth features
US10460180B2 (en) * 2017-04-20 2019-10-29 GM Global Technology Operations LLC Systems and methods for visual classification with region proposals
CN107301377B (en) * 2017-05-26 2020-08-18 浙江大学 Face and pedestrian sensing system based on depth camera
CN107742113B (en) * 2017-11-08 2019-11-19 电子科技大学 One kind being based on the posterior SAR image complex target detection method of destination number
CN108597003A (en) * 2018-04-20 2018-09-28 腾讯科技(深圳)有限公司 A kind of article cover generation method, device, processing server and storage medium
CN108682039B (en) * 2018-04-28 2022-03-25 国网山西省电力公司电力科学研究院 Binocular stereo vision measuring method

Also Published As

Publication number Publication date
CN109784145A (en) 2019-05-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant