WO2021249351A1 - Target detection method, device and computer equipment based on RGBD images - Google Patents

Target detection method, device and computer equipment based on RGBD images

Info

Publication number
WO2021249351A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
depth
tested
rgbd
Prior art date
Application number
PCT/CN2021/098681
Other languages
English (en)
French (fr)
Inventor
荆伟
唐诗尧
汪明明
冀怀远
Original Assignee
苏宁易购集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏宁易购集团股份有限公司
Publication of WO2021249351A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • the present invention relates to the technical field of deep learning and target detection, in particular to a target detection method, device and computer equipment based on RGBD images.
  • the unmanned store project combines offline retail and artificial intelligence to provide a new shopping method that is as smooth as online shopping.
  • the system uses full coverage to photograph the behavior trajectory of each customer entering the store, and provides real-time commodity recommendation and settlement services, which in a true sense achieves a take-and-go non-perceived shopping experience.
  • embodiments of the present invention provide a target detection method, device, and computer equipment based on RGBD images, which expand training data by setting corresponding data enhancement methods for RGBD images, and which reduce false detections and missed detections through an adaptive normalization method for RGBD images and filtering operations, so that target detection results can be obtained accurately and efficiently.
  • the technical solution is as follows:
  • a target detection method based on RGBD images includes:
  • the target association relationship between the target component to be tested and the target to be tested is determined according to the output box, and a target detection result is obtained according to the target association relationship.
  • the method further includes:
  • acquiring a depth image in a targetless scene, and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering includes:
  • collect N depth images, merge them into one depth map by taking their non-zero median, and denoise it; then specify a part of the ground area in the depth map, obtain the ground mask by the region growing method, and denoise the mask.
  • the depth maps of the different ground areas are then computed from the denoised ground mask and the denoised depth map, and the mean value of the non-zero area in each ground-area depth map is calculated.
  • performing data preprocessing on the RGB image and the depth image respectively includes:
  • before inputting the RGBD image into a preset deep learning model to obtain a preliminary candidate frame including at least the target to be tested and the target component to be tested, the method further includes:
  • the random pixel zeroing includes:
  • for a single-channel image whose values are all 1 and whose size matches the depth map, set n random pixels to zero and use the result as a mask for data enhancement, where n is an integer not less than 1.
  • threshold filtering, NMS filtering, and depth information filtering are performed on the preliminary candidate frame to obtain an output frame, including:
  • Threshold filtering is performed on the preliminary candidate frame first to filter out the frames whose confidence is less than the preset confidence threshold;
  • determining the target association relationship between the target component to be tested and the target to be tested according to the output box, and obtaining the target detection result according to the target association relationship includes:
  • based on the output frame, the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested are bound in a target association relationship according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested; if the IoU is greater than a preset threshold, it is determined that the target component to be tested and the target to be tested belong to the same person, and this is output as the target detection result.
  • a target detection device based on RGBD images includes:
  • the image acquisition module is used to: acquire RGB images and corresponding depth images
  • a data preprocessing module configured to: perform data preprocessing on the RGB image and the depth image respectively, and the data preprocessing includes at least an adaptive normalization operation;
  • the RGBD image merging module is used to: align and merge the preprocessed RGB image and the depth image into an RGBD image;
  • a model calculation module configured to: input the RGBD image into a preset deep learning model to obtain a preliminary candidate frame including at least the target to be tested and the target component to be tested;
  • a filtering module configured to: perform threshold filtering, NMS filtering, and depth information filtering on the preliminary candidate frame to obtain an output frame;
  • the detection result obtaining module is configured to determine the target association relationship between the target component to be tested and the target to be tested according to the output box, and obtain a target detection result according to the target association relationship.
  • the device further includes a parameter calculation module for:
  • acquiring a depth image in a targetless scene, and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering includes:
  • collect N depth images, merge them into one depth map by taking their non-zero median, and denoise it; then specify a part of the ground area in the depth map, obtain the ground mask by the region growing method, and denoise the mask.
  • the depth maps of the different ground areas are then computed from the denoised ground mask and the denoised depth map, and the mean value of the non-zero area in each ground-area depth map is calculated.
  • the data preprocessing module is used for:
  • the device further includes a model training module for:
  • the random pixel zeroing includes:
  • for a single-channel image whose values are all 1 and whose size matches the depth map, set n random pixels to zero and use the result as a mask for data enhancement, where n is an integer not less than 1.
  • the filtering module is used for:
  • Threshold filtering is performed on the preliminary candidate frame first to filter out the frames whose confidence is less than the preset confidence threshold;
  • the detection result acquisition module is configured to:
  • based on the output frame, the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested are bound in a target association relationship according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested; if the IoU is greater than a preset threshold, it is determined that the target component to be tested and the target to be tested belong to the same person, and this is output as the target detection result.
  • a computer device for target detection based on RGBD images including: a processor;
  • the memory is configured to store executable instructions of the processor; wherein the processor is configured to execute the steps of the RGBD image-based target detection method according to any one of the above solutions via the executable instructions.
  • FIG. 1 is a flowchart of a target detection method based on RGBD images provided by Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of sub-steps of step 102 in Figure 1;
  • FIG. 3 is a flowchart of sub-steps of step 105 in Figure 1;
  • FIG. 4 is a schematic structural diagram of a target detection device based on RGBD images provided by Embodiment 2 of the present invention.
  • FIG. 5 is a schematic diagram of the hardware structure of a computer device for target detection based on RGBD images provided by Embodiment 3 of the present invention.
  • FIG. 6 is a business flowchart of a target detection method, device, and computer equipment based on RGBD images provided by Embodiment 4 of the present invention.
  • Fig. 7 is a flowchart of the depth information filtering process in Fig. 6.
  • the RGBD image-based target detection method, device, and computer equipment provided by the embodiments of the present invention increase the amount of information by collecting RGBD images, and expand the training data by setting corresponding data enhancement methods for RGBD images, which improves the accuracy of model training;
  • the corresponding adaptive normalization operation on RGBD images avoids the algorithm performance degradation caused by changes in camera height during RGBD acquisition; threshold filtering, NMS filtering and depth information filtering improve the accuracy of target detection, so target detection results can be obtained accurately and efficiently. The solution is therefore suitable for a variety of application scenarios involving target detection or target recognition, and especially for pedestrian detection in complex overhead surveillance scenes, where it can accurately and efficiently detect targets such as pedestrians' bodies, hands and heads.
  • it alleviates the performance degradation of the algorithm in new scenes while filtering out some false detections, and provides accurate location and category information of targets for tasks such as pedestrian tracking, instance segmentation, pedestrian ReID, human-goods interaction and dynamic product identification, giving unmanned-store monitoring an effective target detection capability.
  • the target to be tested here can be a moving target including a human body, an animal, etc., a static target, etc.
  • Fig. 1 is a flowchart of a target detection method based on an RGBD image provided by Embodiment 1 of the present invention.
  • Fig. 2 is a flowchart of sub-steps of step 102 in Fig. 1.
  • Fig. 3 is a flowchart of sub-steps of step 105 in Fig. 1.
  • the RGBD image-based target detection method includes the following steps:
  • an RGBD camera is used to obtain RGB image data containing multiple targets to be tested and their corresponding depth image data.
  • step 101 may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
  • the above-mentioned parameter calculation process can adopt the following method: collect N depth images, merge them into one depth map by taking their non-zero median, and denoise it; then specify a part of the ground area in the depth map, obtain the ground mask by the region growing method, and denoise the mask; then compute the depth maps of the different ground areas from the denoised ground mask and the denoised depth map, and calculate the mean value of the non-zero area in each ground-area depth map.
  • the acquisition of the depth image in the targetless scene can be completed at the same time as step 101 above.
  • step 102 above further includes the following sub-steps:
  • step 102 may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
  • the preset deep learning model here can adopt any possible deep learning model in the prior art such as ssd, yolov3, centernet, and so on.
  • the target component to be tested here may include a part of the human body such as a human head and a human hand.
  • before step 104, the following steps are implemented:
  • the random pixel zeroing adopts the following method: for a single-channel image whose values are all 1 and whose size matches the depth map, set n random pixels to zero and use the result as a mask for data enhancement, where n is an integer not less than 1.
  • step 105 above includes the following sub-steps:
  • step 105 may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
  • based on the output frame, the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested are bound in a target association relationship according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested; if the IoU is greater than the preset threshold, it is determined that the target component to be tested and the target to be tested belong to the same person, and this is output as the target detection result.
  • the preliminary candidate frame of the human body is bound with the preliminary candidate frame of the human body component, and if it is determined that the binding relationship between the two is greater than a preset threshold, it is determined that the human body component and the human body belong to the same person, and output as the target detection result.
  • step 106 may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
  • FIG. 4 is a schematic structural diagram of a target detection device based on RGBD images provided by Embodiment 2 of the present invention.
  • the RGBD image-based target detection device provided by Embodiment 2 of the present invention includes an image acquisition module 21, a data preprocessing module 22, an RGBD image merging module 23, a model calculation module 24, a filtering module 25, and a detection result acquisition module 26.
  • the image acquisition module 21 is used for: acquiring RGB images and corresponding depth images.
  • the data preprocessing module 22 is used for: performing data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation; preferably, the RGB image and the depth image are first zero-padded to the preset picture ratio, then scaled to the preset input size, and finally adaptively normalized.
  • the RGBD image merging module 23 is used for: aligning and merging the preprocessed RGB image and depth image into an RGBD image.
  • the model calculation module 24 is used for: inputting the RGBD image into the preset deep learning model to obtain a preliminary candidate frame including at least the target to be tested and the target component to be tested.
  • the filtering module 25 is used for: performing threshold filtering, NMS filtering, and depth information filtering on the preliminary candidate frame to obtain an output frame; preferably, threshold filtering is first performed on the preliminary candidate frame to filter out the frames whose confidence is less than the preset confidence threshold, then the NMS algorithm filters out redundant overlapping frames, then depth information is used for further filtering, and the remaining frames are the output frames.
  • the detection result acquisition module 26 is used for: determining the target association relationship between the target component to be tested and the target to be tested according to the output frame, and obtaining the target detection result according to the target association relationship.
  • preferably, based on the output frame, the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested are bound in a target association relationship according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested; if the IoU is greater than the preset threshold, it is determined that the target component to be tested and the target to be tested belong to the same person, and this is output as the target detection result.
  • the preliminary candidate frame of the human body is bound with the preliminary candidate frame of the human body component, and if it is determined that the binding relationship between the two is greater than a preset threshold, it is determined that the human body component and the human body belong to the same person, and this is output as the target detection result.
  • the above-mentioned object detection device based on RGBD image further includes:
  • the parameter calculation module 27 is used to obtain a depth image in a targetless scene, and calculate parameters required for adaptive normalization operations and parameters required for depth information filtering.
  • the above-mentioned parameter calculation process adopts the following method: collect N depth images, merge them into one depth map by taking their non-zero median, and denoise it; then specify a part of the ground area in the depth map, obtain the ground mask by the region growing method, and denoise the mask; then compute the depth maps of the different ground areas from the denoised ground mask and the denoised depth map, and calculate the mean value of the non-zero area in each ground-area depth map.
  • the aforementioned RGBD image-based target detection device further includes:
  • the model training module 28 is configured to: perform data enhancement operations on the collected RGBD image training data, and train to obtain the preset deep learning model, wherein the data enhancement operations include at least one of the following: performing at least one of random rotation, scaling, flipping, and translation on the RGBD image; performing Gaussian noise processing and/or random pixel zeroing on the depth image; performing Gaussian noise processing on the RGB image.
  • the above-mentioned random pixel zeroing includes: for a single-channel image whose values are all 1 and whose size matches the depth map, setting n random pixels to zero and using the result as a mask for data enhancement, where n is an integer not less than 1.
  • FIG. 5 is a schematic diagram of the hardware structure of a computer device based on RGBD image target detection provided in Embodiment 3 of the present invention.
  • the computer device based on RGBD image target detection provided in Embodiment 3 of the present invention includes:
  • a processor 31; and a memory 32 configured to store executable instructions of the processor 31; wherein the processor 31 is configured to execute, via the executable instructions, the steps of the RGBD image-based target detection method of any one of the above solutions.
  • the memory 32 may include non-permanent memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
  • the memory 32 may be used to store a program for realizing the above-mentioned target detection method.
  • the processor 31 may be used to load and execute the program stored in the memory 32 to implement each step of the foregoing target detection method.
  • the computer equipment may be a server, a computer, and so on. Therefore, the structural composition of the computer device is not limited to the memory and the processor, but may also include other hardware devices, such as input devices, storage devices, etc., which can be determined according to the configuration of the computer device, and will not be listed here.
  • FIG. 6 is a business flow chart of the method, device and equipment for target detection based on RGBD images provided by Embodiments 1 to 3 of the present invention, and shows a preferred embodiment.
  • N is an integer greater than 1
  • i, j are pixel indexes
  • h, w are the height and width of the depth image
  • Median() takes the median
  • NonZero() selects the non-zero elements of the array
  • $\mathrm{Depth}_{floor} = \mathrm{Mask} \odot \mathrm{Depth}$, where $\odot$ is read here as the element-wise product
  • $\mathrm{Depth}_{floor}$ is a depth image containing only the ground area
  • Mask is a mask calculated by the region growing method to represent the ground area
  • Depth is the depth image.
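The parameter calculation described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: images are plain nested Python lists, the function names `merge_nonzero_median` and `ground_depth_mean` are hypothetical, the ground mask is supplied by hand instead of being grown from a seed region, and both denoising steps are omitted.

```python
from statistics import median

def merge_nonzero_median(frames):
    """Merge N depth frames pixel by pixel using the median of the non-zero samples."""
    h, w = len(frames[0]), len(frames[0][0])
    merged = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            samples = [f[i][j] for f in frames if f[i][j] != 0]
            # A pixel that is zero in every frame has no valid reading and stays zero.
            merged[i][j] = float(median(samples)) if samples else 0.0
    return merged

def ground_depth_mean(depth, mask):
    """Compute Depth_floor = Mask * Depth element-wise and return the mean of its non-zero pixels."""
    floor = [[d * m for d, m in zip(drow, mrow)] for drow, mrow in zip(depth, mask)]
    values = [v for row in floor for v in row if v != 0]
    return sum(values) / len(values) if values else 0.0

# Three 2x2 depth frames of an empty scene (values are made up; 0 means "no reading").
frames = [
    [[0, 2000], [3000, 0]],
    [[2900, 2100], [3100, 0]],
    [[3100, 1900], [0, 0]],
]
depth = merge_nonzero_median(frames)
mask = [[1, 0], [1, 0]]          # hand-written stand-in for the region-grown ground mask
d_mean = ground_depth_mean(depth, mask)
```

Ignoring zeros before taking the median keeps sensor dropouts from pulling the merged depth toward zero, which is presumably why the text specifies a non-zero median.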
  • RGBD image collection: RGBD images are collected by the RGBD camera.
  • the RGB image is adaptively normalized by the following method:
  • $\mathrm{RGB}_{norm} = \mathrm{RGB}/255 - c_{RGB}$;
  • $\mathrm{RGB}_{norm}$ is the normalized RGB image
  • RGB is the RGB image before normalization
  • $c_{RGB}$ is a preset constant
  • the depth image is adaptively normalized by the following method:
  • $\mathrm{Depth}_{norm} = \mathrm{Depth}/D_{mean} - c_{D}$;
  • $\mathrm{Depth}_{norm}$ is the normalized depth image
  • Depth is the depth image before normalization
  • $D_{mean}$ is the mean value of the non-zero area in $\mathrm{Depth}_{floor}$
  • $c_{D}$ is a preset constant.
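The two normalization formulas above translate directly into code. The sketch below uses nested Python lists; the function names and the constant values `c_rgb=0.5` and `c_d=0.5` are assumptions chosen for illustration, since the text only says the constants are preset.

```python
def normalize_rgb(rgb, c_rgb=0.5):
    """RGB_norm = RGB / 255 - c_RGB, applied to every channel value."""
    return [[[v / 255.0 - c_rgb for v in pixel] for pixel in row] for row in rgb]

def normalize_depth(depth, d_mean, c_d=0.5):
    """Depth_norm = Depth / D_mean - c_D, where D_mean is the mean ground depth of the empty scene."""
    return [[v / d_mean - c_d for v in row] for row in depth]

rgb = [[[0, 255, 51]]]               # one pixel with three channels
depth = [[3000.0, 1500.0, 0.0]]      # made-up depth readings
rgb_norm = normalize_rgb(rgb)
depth_norm = normalize_depth(depth, d_mean=3000.0)
```

Because the depth is divided by the measured ground depth, a point on the floor maps to the same normalized value whatever the mounting height of the camera, which matches the stated aim of avoiding performance loss when the camera height changes.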
  • the specific method is: extract the feature map by using the convolutional neural network, and output the preliminary candidate frame information on the basis of the feature map.
  • Post-filtering: the preliminary candidate frames are first filtered by a threshold to remove frames with insufficient confidence, then redundant overlapping frames are removed by the NMS algorithm, and then the depth information is used for further filtering; the remaining frames are the final output frames.
  • the NMS algorithm removes overlapping frames based on the IoU (intersection over union) and confidence of the candidate frames.
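The first two filtering stages can be sketched as follows. This is a generic greedy, single-class NMS, not necessarily the variant used in the embodiment; the box format `(x1, y1, x2, y2)`, the function names, and both threshold values are assumptions. The depth information filtering stage is left out because its rule is not given in this passage.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def filter_candidates(dets, conf_thresh=0.5, iou_thresh=0.5):
    """dets is a list of (box, score). Drop low-confidence boxes, then apply greedy NMS."""
    candidates = sorted(
        (d for d in dets if d[1] >= conf_thresh), key=lambda d: d[1], reverse=True
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with any higher-scoring kept box.
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, score))
    return kept

dets = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # overlaps the first box heavily, so NMS removes it
    ((20, 20, 30, 30), 0.7),
    ((40, 40, 50, 50), 0.2),  # below the confidence threshold
]
output = filter_candidates(dets)
```

Running the confidence threshold before NMS keeps the pairwise IoU comparisons limited to boxes that could survive anyway.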
  • Human body component binding: bind the preliminary candidate frame Box 1 of the human body and the preliminary candidate frame Box 2 of the human body component if they satisfy:
  • area() computes the area
  • ∩ is the intersection
  • thresh is the preset threshold used to determine the association relationship between the component frame and the human body frame.
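A binding step of this kind might look like the sketch below. It assumes the criterion is the intersection-over-union named in the claims; the exact inequality of the embodiment is not reproduced in this text, so `overlap_ratio` is a stand-in that can be swapped for another ratio built from `area()` and the intersection. The function names, the threshold value, and the rule of binding each component to its best-overlapping body are all assumptions.

```python
def area(box):
    """Area of a box (x1, y1, x2, y2); degenerate boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def intersection(a, b):
    """Intersection box of a and b (may be degenerate when the boxes do not overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def overlap_ratio(body, part):
    """Intersection over union of a body box and a component box."""
    inter = area(intersection(body, part))
    union = area(body) + area(part) - inter
    return inter / union if union > 0 else 0.0

def bind_parts(bodies, parts, thresh=0.05):
    """Return {part_index: body_index} for components whose best overlap exceeds thresh."""
    bindings = {}
    for p, part in enumerate(parts):
        scores = [overlap_ratio(body, part) for body in bodies]
        if not scores:
            continue
        best = max(range(len(bodies)), key=scores.__getitem__)
        if scores[best] > thresh:
            bindings[p] = best
    return bindings

bodies = [(0, 0, 100, 200), (150, 0, 250, 200)]
parts = [(30, 0, 70, 40), (160, 10, 200, 50), (300, 300, 320, 320)]
bindings = bind_parts(bodies, parts)
```

Here the first two component boxes lie inside one body box each and are bound to it, while the third overlaps no body and stays unbound.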
  • when the target detection device and equipment based on RGBD images provided in the above embodiments trigger the target detection service, the division into the above functional modules is used only for illustration.
  • in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device and equipment can be divided into different functional modules to complete all or part of the functions described above.
  • the RGBD image-based target detection device and equipment provided in the above-mentioned embodiments belong to the same concept as the RGBD image-based target detection method embodiment; the specific implementation process is detailed in the method embodiment and is not repeated here.
  • the RGBD image-based target detection method, device, and computer equipment provided by the embodiments of the present invention have the following beneficial effects compared with the prior art:
  • These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for realizing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

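The present invention discloses a target detection method, device and computer equipment based on RGBD images, belonging to the technical field of deep learning and target detection. The method includes: acquiring an RGB image and a corresponding depth image; performing data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation; aligning and merging the preprocessed RGB image and depth image into an RGBD image; inputting the RGBD image into a preset deep learning model to obtain preliminary candidate frames including at least the target to be tested and the target component to be tested; performing threshold filtering, NMS filtering and depth information filtering on the preliminary candidate frames to obtain output frames; and determining the target association relationship between the target component to be tested and the target to be tested according to the output frames, and obtaining a target detection result according to the target association relationship. The present invention reduces false detections and missed detections and can obtain target detection results accurately and efficiently.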

Description

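Target detection method, device and computer equipment based on RGBD images
Technical Field
The present invention relates to the technical field of deep learning and target detection, and in particular to a target detection method, device and computer equipment based on RGBD images.
Background
In keeping with the trend toward smart retail, the unmanned store project combines offline retail with artificial intelligence to offer a new way of shopping that is as smooth as shopping online. The system films the trajectory of every customer who enters the store with full camera coverage and provides services such as product recommendation and checkout in real time, achieving a genuinely frictionless grab-and-go shopping experience.
Technical Problem
Current target detection algorithms, and the data enhancement methods they use, rely on RGB image data only. Although data for traditional enhancement methods is easy to acquire and the equipment is inexpensive, RGB data carries too little information, which readily causes false detections and missed detections. This lowers the accuracy of the target detection algorithm and can even prevent the overall system from running normally, which affects checkout when customers leave the store.
Technical Solution
To solve the problems of the prior art, embodiments of the present invention provide a target detection method, device and computer equipment based on RGBD images. Training data is expanded by data enhancement methods designed for RGBD images, and an adaptive normalization method and filtering operations for RGBD images reduce false detections and missed detections, so that target detection results can be obtained accurately and efficiently. The technical solution is as follows: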
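In one aspect, a target detection method based on RGBD images is provided, the method including:
acquiring an RGB image and a corresponding depth image;
performing data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation;
aligning and merging the preprocessed RGB image and depth image into an RGBD image;
inputting the RGBD image into a preset deep learning model to obtain preliminary candidate frames including at least the target to be tested and the target component to be tested;
performing threshold filtering, NMS filtering and depth information filtering on the preliminary candidate frames to obtain output frames;
determining the target association relationship between the target component to be tested and the target to be tested according to the output frames, and obtaining a target detection result according to the target association relationship.
Preferably, the method further includes:
acquiring a depth image of a target-free scene, and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering.
Preferably, acquiring a depth image of a target-free scene and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering includes:
collecting N depth images, merging them into one depth map by taking their non-zero median, and denoising it; then specifying a part of the ground area in the depth map, obtaining a ground mask by the region growing method, and denoising the mask; then computing the depth maps of the different ground areas from the denoised ground mask and the denoised depth map, and calculating the mean value of the non-zero area in each ground-area depth map.
Preferably, performing data preprocessing on the RGB image and the depth image respectively includes:
zero-padding the RGB image and the depth image respectively to a preset picture ratio;
then scaling each to a preset input size;
and finally performing the adaptive normalization operation on each.
Preferably, before inputting the RGBD image into the preset deep learning model to obtain preliminary candidate frames including at least the target to be tested and the target component to be tested, the method further includes:
performing data enhancement operations on collected RGBD image training data and training to obtain the preset deep learning model, wherein the data enhancement operations include at least one of the following:
performing at least one of random rotation, scaling, flipping and translation on the RGBD image;
performing Gaussian noise processing and/or random pixel zeroing on the depth image;
performing Gaussian noise processing on the RGB image.
Preferably, the random pixel zeroing includes:
for a single-channel image whose values are all 1 and whose size matches the depth map, setting n random pixels to zero and using the result as a mask for data enhancement, where n is an integer not less than 1.
Preferably, performing threshold filtering, NMS filtering and depth information filtering on the preliminary candidate frames to obtain output frames includes:
first performing threshold filtering on the preliminary candidate frames to filter out frames whose confidence is less than a preset confidence threshold;
then filtering out redundant overlapping frames with the NMS algorithm;
then filtering further using depth information, the remaining frames being the output frames.
Preferably, determining the target association relationship between the target component to be tested and the target to be tested according to the output frames, and obtaining a target detection result according to the target association relationship, includes:
based on the output frames, and according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested, performing a binding operation of the target association relationship between the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested; if the IoU is determined to be greater than a preset threshold, determining that the target component to be tested and the target to be tested belong to the same person, and outputting this as the target detection result.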
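In another aspect, a target detection device based on RGBD images is provided, the device including:
an image acquisition module, used for: acquiring an RGB image and a corresponding depth image;
a data preprocessing module, used for: performing data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation;
an RGBD image merging module, used for: aligning and merging the preprocessed RGB image and depth image into an RGBD image;
a model calculation module, used for: inputting the RGBD image into a preset deep learning model to obtain preliminary candidate frames including at least the target to be tested and the target component to be tested;
a filtering module, used for: performing threshold filtering, NMS filtering and depth information filtering on the preliminary candidate frames to obtain output frames;
a detection result acquisition module, used for: determining the target association relationship between the target component to be tested and the target to be tested according to the output frames, and obtaining a target detection result according to the target association relationship.
Preferably, the device further includes a parameter calculation module, used for:
acquiring a depth image of a target-free scene, and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering.
Preferably, acquiring a depth image of a target-free scene and calculating the parameters required for the adaptive normalization operation and the parameters required for the depth information filtering includes:
collecting N depth images, merging them into one depth map by taking their non-zero median, and denoising it; then specifying a part of the ground area in the depth map, obtaining a ground mask by the region growing method, and denoising the mask; then computing the depth maps of the different ground areas from the denoised ground mask and the denoised depth map, and calculating the mean value of the non-zero area in each ground-area depth map.
Preferably, the data preprocessing module is used for:
zero-padding the RGB image and the depth image respectively to a preset picture ratio;
then scaling each to a preset input size;
and finally performing the adaptive normalization operation on each.
Preferably, the device further includes a model training module, used for:
performing data enhancement operations on collected RGBD image training data and training to obtain the preset deep learning model, wherein the data enhancement operations include at least one of the following:
performing at least one of random rotation, scaling, flipping and translation on the RGBD image;
performing Gaussian noise processing and/or random pixel zeroing on the depth image;
performing Gaussian noise processing on the RGB image.
Preferably, the random pixel zeroing includes:
for a single-channel image whose values are all 1 and whose size matches the depth map, setting n random pixels to zero and using the result as a mask for data enhancement, where n is an integer not less than 1.
Preferably, the filtering module is used for:
first performing threshold filtering on the preliminary candidate frames to filter out frames whose confidence is less than a preset confidence threshold;
then filtering out redundant overlapping frames with the NMS algorithm;
then filtering further using depth information, the remaining frames being the output frames.
Preferably, the detection result acquisition module is used for:
based on the output frames, and according to the intersection-over-union (IoU) of the target to be tested and the target component to be tested, performing a binding operation of the target association relationship between the preliminary candidate frame of the target to be tested and the preliminary candidate frame of the target component to be tested; if the IoU is determined to be greater than a preset threshold, determining that the target component to be tested and the target to be tested belong to the same person, and outputting this as the target detection result.
In yet another aspect, computer equipment for target detection based on RGBD images is provided, including: a processor;
a memory configured to store executable instructions of the processor; wherein the processor is configured to execute, via the executable instructions, the steps of the RGBD image-based target detection method according to any one of the above solutions.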
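Beneficial Effects
The technical solutions provided by the embodiments of the present invention bring the following beneficial effects:
1. Collecting RGBD images increases the amount of information.
2. Data enhancement methods designed for RGBD images expand the training data and improve the accuracy of model training.
3. The adaptive normalization operation applied to RGBD images avoids the degradation in algorithm performance caused by changes in camera height during RGBD acquisition.
4. Threshold filtering, NMS filtering and depth information filtering improve detection accuracy, so that target detection results can be obtained accurately and efficiently.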
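Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the RGBD image-based target detection method provided by Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the sub-steps of step 102 in FIG. 1;
FIG. 3 is a flowchart of the sub-steps of step 105 in FIG. 1;
FIG. 4 is a schematic structural diagram of the RGBD image-based target detection device provided by Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of the hardware structure of the computer equipment for RGBD image-based target detection provided by Embodiment 3 of the present invention;
FIG. 6 is a business flowchart of the RGBD image-based target detection method, device and computer equipment provided by Embodiment 4 of the present invention;
FIG. 7 is a flowchart of the depth information filtering process in FIG. 6.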
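Embodiments of the Invention
To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention. In the description of the present invention, "a plurality of" means two or more, unless expressly and specifically defined otherwise.
The RGBD image-based target detection method, apparatus and computer device provided by the embodiments of the present invention increase the amount of information by capturing RGBD images; expand the training data by applying corresponding data augmentation methods to the RGBD images, improving the accuracy of model training; avoid the degradation in algorithm performance caused by changes in camera height during RGBD capture by applying a corresponding adaptive normalization operation to the RGBD images; and improve target detection accuracy through threshold filtering, NMS filtering and depth-information filtering, so that target detection results are obtained accurately and efficiently. They are therefore suitable for a variety of application scenarios involving target detection or target recognition, and are particularly suitable for pedestrian detection in complex overhead surveillance scenes, where they can accurately and efficiently detect targets such as pedestrians' bodies, hands and heads. While mitigating the performance drop of the algorithm in new scenes, they filter out some false detections and supply accurate position and class information of targets to tasks such as pedestrian tracking, instance segmentation, pedestrian ReID, human-goods interaction and dynamic product recognition, providing effective target detection capability for the monitoring of unmanned stores.
The RGBD image-based target detection method, apparatus and computer device provided by the embodiments of the present invention are described in detail below with reference to specific embodiments and the drawings. It should be noted that the target to be detected here may be a moving target such as a human body or an animal, or a static target.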
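Embodiment 1
Fig. 1 is a flowchart of the RGBD image-based target detection method provided by Embodiment 1 of the present invention. Fig. 2 is a flowchart of the sub-steps of step 102 in Fig. 1. Fig. 3 is a flowchart of the sub-steps of step 105 in Fig. 1.
As shown in Fig. 1, the RGBD image-based target detection method provided by this embodiment of the present invention includes the following steps:
101. Acquire an RGB image and a corresponding depth image.
Preferably, an RGBD camera is used to acquire RGB image data containing multiple targets to be detected, together with the corresponding depth image data.
It is worth noting that step 101 may also be implemented in ways other than that described above; the embodiments of the present invention do not limit the specific way.
Also preferably, the following step is performed before step 101:
(in particular, within the surveillance scene) acquire depth images of a target-free scene, and calculate from the depth image data the parameters required by the adaptive normalization operation and by the depth-information filtering. Further preferably, the parameter calculation may proceed as follows: capture N depth images, merge them into a single depth map by taking the per-pixel median of the non-zero values, and denoise it; then designate a portion of the ground area in the depth map, obtain a ground mask by the region growing method, and denoise it; then compute, from the denoised ground mask and the denoised depth map, a depth map of the different ground regions, and calculate the mean of the non-zero region of that depth map. Preferably, acquiring the depth images of the target-free scene may be done at the same time as step 101.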
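The region growing step can be sketched as a breadth-first flood fill from the manually designated ground pixels. The text does not state the growth criterion, so the rule used here is an assumption: a 4-connected neighbour joins the region when its depth is non-zero and differs from the current pixel's depth by at most `tol`. The names `grow_ground_mask`, `seeds` and `tol` are hypothetical, and the denoising of the mask is not shown.

```python
from collections import deque


def grow_ground_mask(depth, seeds, tol=30):
    """Grow a binary ground mask (1 = ground, 0 = other) from seed pixels.

    Assumption (not from the source): a 4-connected neighbour is added
    when its depth is valid (non-zero) and within `tol` of the depth of
    the pixel it is reached from.
    """
    h, w = len(depth), len(depth[0])
    mask = [[0] * w for _ in range(h)]
    queue = deque()
    for i, j in seeds:
        if depth[i][j] != 0:          # ignore seeds on invalid pixels
            mask[i][j] = 1
            queue.append((i, j))
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and mask[ni][nj] == 0:
                d = depth[ni][nj]
                if d != 0 and abs(d - depth[i][j]) <= tol:
                    mask[ni][nj] = 1
                    queue.append((ni, nj))
    return mask
```

Comparing each neighbour with the pixel it was reached from, rather than with the seed, lets the mask follow a floor whose depth changes gradually across the image, which is the usual situation for a tilted overhead camera.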
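102. Perform data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation.
Preferably, step 102 further includes the following sub-steps:
1021. Zero-pad the RGB image and the depth image respectively to a preset aspect ratio;
1022. Then scale the RGB image and the depth image respectively to a preset input size;
1023. Finally, perform the adaptive normalization operation on the RGB image and the depth image respectively.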
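Sub-steps 1021 and 1022 can be sketched as follows for a single-channel image stored as a list of rows. This is a minimal illustration under assumptions the text does not fix: padding is added on the right and bottom, and scaling uses nearest-neighbour sampling. The function names are hypothetical; for an RGB image the same code works if each pixel is a tuple and `fill` is `(0, 0, 0)`.

```python
def pad_to_ratio(img, ratio_w, ratio_h, fill=0):
    """Zero-pad `img` so that width:height is ratio_w:ratio_h.

    Assumption: padding goes on the right/bottom edges only.
    """
    h, w = len(img), len(img[0])
    if w * ratio_h >= h * ratio_w:
        # Image is too wide for the ratio: keep width, extend height.
        new_w = w
        new_h = -((-w * ratio_h) // ratio_w)   # ceiling division
    else:
        # Image is too tall for the ratio: keep height, extend width.
        new_h = h
        new_w = -((-h * ratio_w) // ratio_h)
    padded = [list(row) + [fill] * (new_w - w) for row in img]
    padded += [[fill] * new_w for _ in range(new_h - h)]
    return padded


def resize_nearest(img, out_w, out_h):
    """Scale `img` to out_w x out_h with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[i * h // out_h][j * w // out_w] for j in range(out_w)]
            for i in range(out_h)]
```

Padding before scaling keeps the aspect ratio of the scene content unchanged, so objects are not stretched when the image is brought to the model's input size.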
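It is worth noting that step 102 may also be implemented in ways other than that described above; the embodiments of the present invention do not limit the specific way.
103. Align and merge the preprocessed RGB image and depth image into an RGBD image.
104. Input the RGBD image into a preset deep learning model to obtain preliminary candidate boxes that include at least the target to be detected and the target component.
Preferably, the preset deep learning model here may be any suitable deep learning model from the prior art, such as ssd, yolov3 or centernet. For example, when a human body is the target to be detected, the target component may be a part of the human body such as the head or a hand.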
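Also preferably, the following step is performed before step 104:
perform data augmentation on the captured RGBD training data and train the preset deep learning model, the data augmentation including at least one of the following:
applying at least one of random rotation, scaling, flipping and translation to the RGBD image;
applying Gaussian noise and/or random pixel zeroing to the depth image;
applying Gaussian noise to the RGB image.
Further preferably, the random pixel zeroing is performed as follows: for a single-channel image whose values are all 1 and whose size matches that of the depth map, n randomly selected pixels are set to zero and the result is used as a mask for data augmentation, where n is an integer not less than 1.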
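The random pixel zeroing can be sketched as below. The mask construction follows the description above; applying the mask by element-wise multiplication with the depth map is an assumption, since the text only says the mask is used for data augmentation. The names `random_zero_mask`, `apply_mask` and the `rng` parameter are hypothetical.

```python
import random


def random_zero_mask(h, w, n, rng=None):
    """Return an h x w mask of ones with n distinct random pixels set to 0."""
    rng = rng or random.Random()
    n = min(n, h * w)                 # cannot zero more pixels than exist
    mask = [[1] * w for _ in range(h)]
    for idx in rng.sample(range(h * w), n):
        mask[idx // w][idx % w] = 0
    return mask


def apply_mask(depth, mask):
    """Assumed usage: multiply the depth map by the mask element-wise."""
    return [[d * m for d, m in zip(d_row, m_row)]
            for d_row, m_row in zip(depth, mask)]
```

Zeroed pixels imitate the holes that depth sensors produce where no valid measurement is returned, so a model trained with this augmentation sees missing depth during training as well.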
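105. Perform threshold filtering, NMS filtering and depth-information filtering on the preliminary candidate boxes to obtain output boxes.
Preferably, step 105 includes the following sub-steps:
1051. First perform threshold filtering on the preliminary candidate boxes to discard boxes whose confidence is below a preset confidence threshold;
1052. Then discard redundant overlapping boxes by means of the NMS algorithm;
1053. Then filter further using depth information; the remaining boxes are the output boxes.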
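Sub-steps 1051 and 1052 can be sketched as follows. This is a standard greedy NMS, not necessarily the exact variant used in the source. Two choices here are assumptions: boxes are `(x1, y1, x2, y2, score, label)` tuples, and suppression is applied only between boxes with the same label, so that a head box is not suppressed by the body box that contains it. The threshold values are placeholders. The depth-based sub-step 1053 is not sketched because its criterion is not stated in the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def threshold_and_nms(dets, conf_thresh=0.5, iou_thresh=0.5):
    """Drop low-confidence boxes, then run greedy per-label NMS."""
    # 1051: confidence threshold, then sort by descending score.
    cands = sorted((d for d in dets if d[4] >= conf_thresh),
                   key=lambda d: d[4], reverse=True)
    kept = []
    # 1052: keep a box only if it does not overlap a kept same-label box.
    for d in cands:
        if all(k[5] != d[5] or iou(d, k) <= iou_thresh for k in kept):
            kept.append(d)
    return kept
```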
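It is worth noting that step 105 may also be implemented in ways other than that described above; the embodiments of the present invention do not limit the specific way.
106. Determine, according to the output boxes, the target association relationship between the target component and the target to be detected, and obtain the target detection result according to the target association relationship.
Preferably, based on the output boxes, a binding operation of the target association relationship is performed between the preliminary candidate box of the target to be detected and the preliminary candidate box of the target component according to the intersection over union (IoU) of the two; if the IoU is determined to be greater than a preset threshold, the target component and the target to be detected are determined to belong to the same person, and this is output as the target detection result.
For example, the preliminary candidate box of a human body is bound to the preliminary candidate box of a human body component; if the IoU of the two is determined to be greater than the preset threshold, the human body component and the human body are determined to belong to the same person, and this is output as the target detection result.
It is worth noting that step 106 may also be implemented in ways other than that described above; the embodiments of the present invention do not limit the specific way.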
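Embodiment 2
Fig. 4 is a schematic structural diagram of the RGBD image-based target detection apparatus provided by Embodiment 2 of the present invention. As shown in Fig. 4, the apparatus includes an image acquisition module 21, a data preprocessing module 22, an RGBD image merging module 23, a model computation module 24, a filtering module 25 and a detection result acquisition module 26.
Specifically, the image acquisition module 21 is configured to acquire an RGB image and a corresponding depth image. The data preprocessing module 22 is configured to perform data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation; preferably, it zero-pads the RGB image and the depth image respectively to a preset aspect ratio, then scales each to a preset input size, and finally performs the adaptive normalization operation on each. The RGBD image merging module 23 is configured to align and merge the preprocessed RGB image and depth image into an RGBD image. The model computation module 24 is configured to input the RGBD image into a preset deep learning model to obtain preliminary candidate boxes that include at least the target to be detected and the target component. The filtering module 25 is configured to perform threshold filtering, NMS filtering and depth-information filtering on the preliminary candidate boxes to obtain output boxes; preferably, it first performs threshold filtering to discard boxes whose confidence is below a preset confidence threshold, then discards redundant overlapping boxes by means of the NMS algorithm, and then filters further using depth information, the remaining boxes being the output boxes. The detection result acquisition module 26 is configured to determine, according to the output boxes, the target association relationship between the target component and the target to be detected, and to obtain the target detection result according to the target association relationship; preferably, based on the output boxes, it performs a binding operation of the target association relationship between the preliminary candidate box of the target to be detected and the preliminary candidate box of the target component according to the intersection over union (IoU) of the two, and if the IoU is determined to be greater than a preset threshold, it determines that the target component and the target to be detected belong to the same person and outputs this as the target detection result. For example, the preliminary candidate box of a human body is bound to the preliminary candidate box of a human body component; if the IoU of the two is determined to be greater than the preset threshold, the human body component and the human body are determined to belong to the same person, and this is output as the target detection result.
Preferably, the RGBD image-based target detection apparatus further includes:
a parameter calculation module 27, configured to acquire depth images of a target-free scene and calculate the parameters required by the adaptive normalization operation and by the depth-information filtering. Preferably, the parameter calculation proceeds as follows: capture N depth images, merge them into a single depth map by taking the per-pixel median of the non-zero values, and denoise it; then designate a portion of the ground area in the depth map, obtain a ground mask by the region growing method, and denoise it; then compute, from the denoised ground mask and the denoised depth map, a depth map of the different ground regions, and calculate the mean of the non-zero region of that depth map.
Also preferably, the RGBD image-based target detection apparatus further includes:
a model training module 28, configured to perform data augmentation on the captured RGBD training data and train the preset deep learning model, the data augmentation including at least one of the following: applying at least one of random rotation, scaling, flipping and translation to the RGBD image; applying Gaussian noise and/or random pixel zeroing to the depth image; applying Gaussian noise to the RGB image. The random pixel zeroing comprises: for a single-channel image whose values are all 1 and whose size matches that of the depth map, setting n randomly selected pixels to zero and using the result as a mask for data augmentation, where n is an integer not less than 1.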
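Embodiment 3
Fig. 5 is a schematic diagram of the hardware structure of the RGBD image-based target detection computer device provided by Embodiment 3 of the present invention. As shown in Fig. 5, the computer device includes:
a processor 31; and a memory 32, configured to store instructions executable by the processor 31; wherein the processor 31 is configured to perform, via the executable instructions, the steps of the RGBD image-based target detection method according to any one of the above solutions.
The memory 32 may take the form of volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
In this embodiment, the memory 32 may be used to store a program that implements the above target detection method;
the processor 31 may be used to load and execute the program stored in the memory 32 so as to carry out the steps of the above target detection method. For the specific implementation, refer to the corresponding parts of the method embodiments above.
In practice, the computer device may be a server, a personal computer or the like. The structure of the computer device is therefore not limited to a memory and a processor; it may also include other hardware such as input devices and storage devices, depending on the configuration of the computer device, which are not listed one by one here.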
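Embodiment 4
Fig. 6 is a service flowchart of the RGBD image-based target detection method, apparatus and device provided by Embodiments 1 to 3 of the present invention, showing a preferred implementation.
As shown in Fig. 6, this preferred embodiment mainly includes the following flow:
1. Parameter calculation: environmental depth information is captured in order to calculate the normalization parameter and the ground mask and to denoise the depth image. First, N depth images $DEPTH = [Depth_1, Depth_2, \dots, Depth_N]$ are captured and merged into a single depth image $Depth$ by taking the per-pixel median of their non-zero values, which removes noise. The merge is computed per pixel as follows:
$$Depth(i,j) = \operatorname{Median}\bigl(\operatorname{NonZero}\bigl([Depth_1(i,j), Depth_2(i,j), \dots, Depth_N(i,j)]\bigr)\bigr), \quad 1 \le i \le h,\ 1 \le j \le w$$
where $N$ is an integer greater than 1, $i, j$ are pixel indices, $h, w$ are the height and width of the depth image, $\operatorname{Median}()$ takes the median, and $\operatorname{NonZero}()$ removes the zeros from an array. A small portion of the ground area is then manually designated in the depth image, and the region growing method yields the ground mask $Mask$, which is denoised; in the mask, ground pixels are 1 and all other pixels are 0. From $Mask$ and the depth image $Depth$, the depth values of the different ground regions are obtained:
$$Depth_{floor} = Mask \odot Depth$$
where $Depth_{floor}$ is the depth image containing only the ground area, $Mask$ is the mask of the ground area computed by the region growing method, $Depth$ is the depth image, and $\odot$ denotes the element-wise product.
Finally, the mean of the non-zero region of $Depth_{floor}$ is computed and denoted $D_{mean}$.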
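The parameter calculation above can be sketched in plain Python, with depth images stored as lists of rows. The function names are hypothetical. One choice is an assumption not covered by the text: a pixel that is zero in every frame stays zero in the merged image. The region growing and the denoising steps are not shown; `mask` is taken as given.

```python
from statistics import median


def merge_depth_frames(frames):
    """Per-pixel median of the non-zero values across N depth frames."""
    h, w = len(frames[0]), len(frames[0][0])
    merged = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            values = [f[i][j] for f in frames if f[i][j] != 0]
            # Assumption: no valid sample in any frame -> keep 0.
            merged[i][j] = median(values) if values else 0
    return merged


def floor_mean(depth, mask):
    """D_mean: mean of the non-zero pixels of Mask * Depth."""
    values = [d * m
              for d_row, m_row in zip(depth, mask)
              for d, m in zip(d_row, m_row)
              if d * m != 0]
    if not values:
        raise ValueError("ground mask selects no valid depth pixels")
    return sum(values) / len(values)
```

Taking the median over non-zero samples only means that a sensor dropout in a few frames does not drag the merged depth toward zero, which a plain median or mean would do.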
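2. RGBD image capture: RGBD images are captured by an RGBD camera.
3. Data preprocessing: the RGB image and the depth image are first each zero-padded to the preset aspect ratio, then each scaled to the preset input size, and finally each adaptively normalized; the normalized results are used as the model input. Specifically, the RGB image is adaptively normalized as follows:
$$RGB_{norm} = RGB / 255 - c_{RGB}$$
where $RGB_{norm}$ is the normalized RGB image, $RGB$ is the RGB image before normalization, and $c_{RGB}$ is a preset constant.
The depth image is adaptively normalized as follows:
$$Depth_{norm} = Depth / D_{mean} - c_{D}$$
where $Depth_{norm}$ is the normalized depth image, $Depth$ is the depth image before normalization, $D_{mean}$ is the mean of the non-zero region of $Depth_{floor}$, and $c_{D}$ is a preset constant.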
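The two normalization formulas, followed by the merge into a four-channel RGBD image, can be sketched as below. The constants `c_rgb=0.5` and `c_d=1.0` are placeholder values chosen for illustration; the source only says they are preset. The function names and the pixel layout (RGB pixels as 3-tuples, RGBD pixels as 4-tuples) are also assumptions, and the two images are taken to be already aligned and of equal size.

```python
def normalize_rgb(rgb, c_rgb=0.5):
    """RGB_norm = RGB / 255 - c_RGB, applied to every channel."""
    return [[tuple(v / 255 - c_rgb for v in px) for px in row]
            for row in rgb]


def normalize_depth(depth, d_mean, c_d=1.0):
    """Depth_norm = Depth / D_mean - c_D."""
    return [[d / d_mean - c_d for d in row] for row in depth]


def merge_rgbd(rgb_norm, depth_norm):
    """Append the depth value to each RGB pixel as a fourth channel."""
    return [[(*px, d) for px, d in zip(rgb_row, d_row)]
            for rgb_row, d_row in zip(rgb_norm, depth_norm)]
```

Because the depth is divided by the floor depth $D_{mean}$ measured in the same scene, a floor pixel maps to roughly the same normalized value whatever the mounting height of the camera, which is what makes the normalization adaptive.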
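4. Preliminary candidate boxes are obtained using a deep learning target detection model (such as yolo, ssd or centernet). Specifically, a convolutional neural network extracts feature maps, and the preliminary candidate box information is output on the basis of those feature maps.
5. Post-processing filtering: boxes with insufficient confidence are removed from the preliminary candidate boxes by thresholding, redundant overlapping boxes are then removed by the NMS algorithm, and the depth information is then used for further filtering; the remaining boxes are the final output boxes. The NMS algorithm removes overlapping boxes according to the IoU (intersection over union) and the confidence of the candidate boxes.
Specifically, as shown in Fig. 7, it is determined whether the filtering criterion is met, and the box is filtered out or retained according to the result.
The flowchart and pseudocode of the depth-information filtering are as follows:
[Figure 7: flowchart and pseudocode of the depth-information filtering]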
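6. Human body component binding: the preliminary candidate box $Box_1$ of a human body is bound to the preliminary candidate box $Box_2$ of a human body component. If
$$\frac{\operatorname{area}(Box_1 \cap Box_2)}{\operatorname{area}(Box_2)} > thresh$$
then the component $Box_2$ and the human body $Box_1$ are determined to belong to the same person. Here $\operatorname{area}()$ computes the area, $\cap$ takes the intersection, and $thresh$ is the preset threshold used to decide the target association relationship between a component box and a body box.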
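The binding rule can be sketched as follows, with boxes given as `(x1, y1, x2, y2)`. The overlap ratio is exactly the formula above. Two things are assumptions: the value `thresh=0.7` is a placeholder, and when several body boxes pass the threshold for one component the body with the largest ratio is chosen, a tie-breaking rule the text does not specify. The function names are hypothetical.

```python
def area(box):
    """Area of an (x1, y1, x2, y2) box; 0 for an empty box."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def overlap_ratio(body, comp):
    """area(body ∩ comp) / area(comp), as in the binding criterion."""
    inter = (max(body[0], comp[0]), max(body[1], comp[1]),
             min(body[2], comp[2]), min(body[3], comp[3]))
    comp_area = area(comp)
    return area(inter) / comp_area if comp_area > 0 else 0.0


def bind_components(bodies, components, thresh=0.7):
    """Map each component index to the index of the body it belongs to.

    Assumption: if several bodies exceed `thresh`, take the best one.
    Components with no body above the threshold are left unbound.
    """
    binding = {}
    for ci, comp in enumerate(components):
        best_ratio = thresh
        for bi, body in enumerate(bodies):
            ratio = overlap_ratio(body, comp)
            if ratio > best_ratio:
                best_ratio = ratio
                binding[ci] = bi
    return binding
```

Dividing by the component's own area, rather than by the union of the two boxes, keeps the ratio close to 1 for a small head or hand box that lies inside a much larger body box.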
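It should be noted that, when the RGBD image-based target detection apparatus and device provided in the above embodiments trigger the target detection service, the division into the functional modules described above is only an example. In practice, the functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus or device may be divided into different functional modules to perform all or part of the functions described above. In addition, the apparatus and device embodiments belong to the same concept as the method embodiments; for the specific implementation, see the method embodiments, which are not repeated here.
All of the optional technical solutions above may be combined in any way to form optional embodiments of the present invention, which are not described one by one here.
In summary, compared with the prior art, the RGBD image-based target detection method, apparatus and computer device provided by the embodiments of the present invention have the following beneficial effects:
1. Capturing RGBD images increases the amount of information;
2. Applying corresponding data augmentation methods to the RGBD images expands the training data and improves the accuracy of model training;
3. Applying a corresponding adaptive normalization operation to the RGBD images avoids the degradation in algorithm performance caused by changes in camera height during RGBD capture;
4. Threshold filtering, NMS filtering and depth-information filtering improve target detection accuracy, so that target detection results are obtained accurately and efficiently.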
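A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc or the like.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of this application have been described, a person skilled in the art may make further changes and modifications to these embodiments once the basic inventive concept is known. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of this application.
Evidently, a person skilled in the art may make various changes and variations to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.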

Claims (10)

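  1. An RGBD image-based target detection method, characterized in that the method comprises:
    acquiring an RGB image and a corresponding depth image;
    performing data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation;
    aligning and merging the preprocessed RGB image and depth image into an RGBD image;
    inputting the RGBD image into a preset deep learning model to obtain preliminary candidate boxes that include at least a target to be detected and a target component;
    performing threshold filtering, NMS filtering and depth-information filtering on the preliminary candidate boxes to obtain output boxes;
    determining, according to the output boxes, a target association relationship between the target component and the target to be detected, and obtaining a target detection result according to the target association relationship.
  2. The method according to claim 1, characterized in that the method further comprises:
    acquiring depth images of a target-free scene, and calculating the parameters required by the adaptive normalization operation and the parameters required by the depth-information filtering.
  3. The method according to claim 2, characterized in that acquiring depth images of a target-free scene and calculating the parameters required by the adaptive normalization operation and the parameters required by the depth-information filtering comprises:
    capturing N depth images, merging them into a single depth map by taking the per-pixel median of the non-zero values, and denoising it; then designating a portion of the ground area in the depth map, obtaining a ground mask by the region growing method, and denoising it; then computing, from the denoised ground mask and the denoised depth map, a depth map of the different ground regions, and calculating the mean of the non-zero region of that depth map.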
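  4. The method according to any one of claims 1 to 3, characterized in that performing data preprocessing on the RGB image and the depth image respectively comprises:
    zero-padding the RGB image and the depth image respectively to a preset aspect ratio;
    then scaling each of them to a preset input size;
    and finally performing the adaptive normalization operation on each of them.
  5. The method according to any one of claims 1 to 3, characterized in that, before inputting the RGBD image into the preset deep learning model to obtain preliminary candidate boxes that include at least the target to be detected and the target component, the method further comprises:
    performing data augmentation on captured RGBD training data and training the preset deep learning model, the data augmentation including at least one of the following:
    applying at least one of random rotation, scaling, flipping and translation to the RGBD image;
    applying Gaussian noise and/or random pixel zeroing to the depth image;
    applying Gaussian noise to the RGB image.
  6. The method according to claim 5, characterized in that the random pixel zeroing comprises:
    for a single-channel image whose values are all 1 and whose size matches that of the depth map, setting n randomly selected pixels to zero and using the result as a mask for data augmentation, where n is an integer not less than 1.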
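  7. The method according to any one of claims 1 to 3, characterized in that performing threshold filtering, NMS filtering and depth-information filtering on the preliminary candidate boxes to obtain output boxes comprises:
    first performing threshold filtering on the preliminary candidate boxes to discard boxes whose confidence is below a preset confidence threshold;
    then discarding redundant overlapping boxes by means of the NMS algorithm;
    then filtering further using depth information, the remaining boxes being the output boxes.
  8. The method according to any one of claims 1 to 3, characterized in that determining, according to the output boxes, the target association relationship between the target component and the target to be detected, and obtaining the target detection result according to the target association relationship, comprises:
    based on the output boxes, performing a binding operation of the target association relationship between the preliminary candidate box of the target to be detected and the preliminary candidate box of the target component according to the intersection over union (IoU) of the two; if the IoU is determined to be greater than a preset threshold, determining that the target component and the target to be detected belong to the same person, and outputting this as the target detection result.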
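  9. An RGBD image-based target detection apparatus, characterized in that the apparatus comprises:
    an image acquisition module, configured to acquire an RGB image and a corresponding depth image;
    a data preprocessing module, configured to perform data preprocessing on the RGB image and the depth image respectively, the data preprocessing including at least an adaptive normalization operation;
    an RGBD image merging module, configured to align and merge the preprocessed RGB image and depth image into an RGBD image;
    a model computation module, configured to input the RGBD image into a preset deep learning model to obtain preliminary candidate boxes that include at least a target to be detected and a target component;
    a filtering module, configured to perform threshold filtering, NMS filtering and depth-information filtering on the preliminary candidate boxes to obtain output boxes;
    a detection result acquisition module, configured to determine, according to the output boxes, a target association relationship between the target component and the target to be detected, and to obtain a target detection result according to the target association relationship.
  10. An RGBD image-based target detection computer device, characterized by comprising:
    a processor;
    a memory, configured to store instructions executable by the processor;
    wherein the processor is configured to perform, via the executable instructions, the steps of the RGBD image-based target detection method according to any one of claims 1 to 8.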
PCT/CN2021/098681 2020-06-10 2021-06-07 RGBD image-based target detection method and apparatus, and computer device WO2021249351A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010523578.0A CN111738995B (zh) 2020-06-10 2020-06-10 RGBD image-based target detection method and apparatus, and computer device
CN202010523578.0 2020-06-10

Publications (1)

Publication Number Publication Date
WO2021249351A1 true WO2021249351A1 (zh) 2021-12-16

Family

ID=72648704

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098681 WO2021249351A1 (zh) 2020-06-10 2021-06-07 RGBD image-based target detection method and apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN111738995B (zh)
WO (1) WO2021249351A1 (zh)

Also Published As

Publication number Publication date
CN111738995A (zh) 2020-10-02
CN111738995B (zh) 2023-04-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21823160

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21823160

Country of ref document: EP

Kind code of ref document: A1