WO2022036567A1 - Target detection method and device, and vehicle-mounted radar - Google Patents

Target detection method and device, and vehicle-mounted radar

Info

Publication number
WO2022036567A1
WO2022036567A1 PCT/CN2020/109879 CN2020109879W
Authority
WO
WIPO (PCT)
Prior art keywords
map
feature map
frame
historical
attention
Prior art date
Application number
PCT/CN2020/109879
Other languages
French (fr)
Chinese (zh)
Inventor
郝智翔
李延召
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN202080006536.8A priority Critical patent/CN114450720A/en
Priority to PCT/CN2020/109879 priority patent/WO2022036567A1/en
Publication of WO2022036567A1 publication Critical patent/WO2022036567A1/en

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 - Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 - Radar or analogous systems specially adapted for specific applications
    • G01S13/89 - Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • G01S13/90 - Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing

Abstract

A target detection method and device, and a vehicle-mounted radar, capable of improving the accuracy of target detection and the robustness of detection results. The method comprises: acquiring a first feature map of a current frame in multiple consecutive point cloud frames (110); determining a historical attention map according to the detection result of at least one frame preceding the current frame (120); processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer (130); and determining the detection result of the current frame according to the second feature map (140).

Description

Target detection method and device, and vehicle-mounted radar
Copyright Notice
The disclosure of this patent document contains material that is subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner has no objection to the reproduction by anyone of this patent document or the patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical Field
The present application relates to the field of radar applications, and more particularly, to a target detection method and device, and a vehicle-mounted radar.
Background
The ability to detect and perceive the position information of surrounding vehicles while driving is a prerequisite for realizing autonomous driving technology. Vehicle detection collects environmental information around the vehicle in real time through sensors deployed on the vehicle platform, such as cameras, lidars and millimeter-wave radars, and on that basis obtains the position information of other vehicles in the surrounding environment through detection algorithms. Only based on this information can the autonomous driving system make the control decisions that drive the vehicle to operate autonomously. The accuracy of vehicle detection and the robustness of the detection results directly affect the safety of autonomous driving; how to improve them has therefore become an urgent problem to be solved.
Summary
The present application provides a target detection method and device, and a vehicle-mounted radar, which can improve the accuracy of target detection and the robustness of the detection results.
In a first aspect, a target detection method is provided, including:
acquiring a first feature map of a current frame in a plurality of consecutive point cloud frames;
determining a historical attention map according to a detection result of at least one frame preceding the current frame;
processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer; and
determining a detection result of the current frame according to the second feature map.
In a second aspect, a device for target detection is provided, including a memory and a processor,
the memory being configured to store a program;
the processor being configured to call the program and, when the program is executed, to perform the following operations:
acquiring a first feature map of a current frame in a plurality of consecutive point cloud frames;
determining a historical attention map according to a detection result of at least one frame preceding the current frame;
processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer; and
determining a detection result of the current frame according to the second feature map.
In a third aspect, a vehicle-mounted radar is provided, including the device for target detection according to the second aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method according to the first aspect.
Based on the above technical solutions, when the current frame among multiple point cloud frames is processed, the detection result of at least one preceding frame is used. By efficiently using historical detection information to guide the target detection process of the current frame, the accuracy of target detection and the robustness of the detection results can therefore be significantly improved.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a target detection method provided by an embodiment of the present application.
FIG. 2 is a schematic diagram of processing the feature map of the current frame according to the historical attention map.
FIG. 3 is a flowchart of a possible implementation based on the method shown in FIG. 1.
FIG. 4 shows the detection result of an existing deep neural network to which historical detection information has not been added.
FIG. 5 shows the detection result of the deep neural network of the present application to which historical detection information has been added.
FIG. 6 is a schematic structural diagram of a device for target detection provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. Provided there is no conflict, the embodiments described below and the features in those embodiments may be combined with each other.
It should be understood that the specific examples herein are only intended to help those skilled in the art better understand the embodiments of the present application, rather than to limit the scope of the embodiments of the present application.
It should also be understood that, in the various embodiments of the present application, the size of the sequence numbers of the processes does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
It should also be understood that the various implementations described in this specification may be implemented individually or in combination, which is not limited by the embodiments of the present application.
Unless otherwise specified, all technical and scientific terms used in the embodiments of the present application have the same meaning as commonly understood by those skilled in the technical field of the present application. The terminology used in the present application is only for the purpose of describing specific embodiments and is not intended to limit the scope of the present application.
Usually, when lidar is used for target detection, only a single point cloud frame is taken as the input of the target detection algorithm. However, lidar can record the environmental point cloud continuously, so within a sequence of consecutive point cloud frames, the historical detection information preceding the current frame is of great significance to the target detection process of the current frame. The present application proposes a target detection solution that makes use of historical detection information to improve the accuracy of the target detection algorithm and the robustness of the target detection results; for example, in an autonomous driving scenario it can significantly improve the accuracy of vehicle detection and the robustness of the detection results, thereby ensuring the safety of autonomous driving.
FIG. 1 is a schematic flowchart of a target detection method according to an embodiment of the present application. The method can be applied to target detection, especially vehicle detection in autonomous driving scenarios. As shown in FIG. 1, the target detection method 100 includes some or all of the following steps.
In step 110, a first feature map of the current frame in a plurality of consecutive point cloud frames is acquired.
When a deep neural network is used for target detection, multiple point cloud frames are collected consecutively, and each point cloud frame is processed. A deep neural network is essentially composed of a series of stacked two-dimensional convolution operations, and is also called a convolutional neural network or a deep convolutional neural network. In the deep neural network, the first feature map of the current frame on which the convolution operation is to be performed includes the spatial information of the current frame input to the deep neural network and the corresponding feature information.
Taking a point cloud bird's-eye view as the input of the deep neural network as an example, each element of the first feature map produced in an intermediate stage of the deep neural network corresponds to the pixels of one region of the two-dimensional bird's-eye view and the corresponding feature information, where the first two dimensions of the first feature map correspond to positions along the length and width directions of the two-dimensional bird's-eye view. The element value of each element of the first feature map indicates how interested the deep neural network is in the region corresponding to that element; for example, if the absolute value of an element is large, the deep neural network is relatively interested in the region corresponding to that element, and the feature information of that region required by the deep neural network contributes a relatively large weight to the subsequent output.
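To make the shape of this first feature map concrete, the following is a minimal PyTorch sketch of step 110, assuming the network input is a bird's-eye-view (BEV) pseudo-image of the current point cloud frame; the channel counts, grid size and layer depth are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a small stack of 2D convolutions turns a BEV
# pseudo-image of the current point cloud frame into the "first feature map".
class TinyBackbone(nn.Module):
    def __init__(self, in_channels=8, feat_channels=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, bev):          # bev: (B, C_in, H, W) BEV pseudo-image
        return self.layers(bev)      # first feature map: (B, C_feat, H, W)

bev = torch.randn(1, 8, 200, 200)    # one BEV frame on an assumed 200x200 grid
first_feature_map = TinyBackbone()(bev)
# Each (h, w) location of the feature map corresponds to one BEV region; the
# magnitude of its values reflects how interested the network is in that region.
```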
In step 120, a historical attention map is determined according to the detection result of at least one frame preceding the current frame.
This embodiment uses an attention transfer mechanism to process the first feature map of the current frame, so an attention map needs to be obtained, and that attention map is a historical attention map determined according to the detection result of at least one frame preceding the current frame. Preferably, the historical attention map may be determined according to the detection result of the frame immediately preceding the current frame.
Different elements of the first feature map of the current frame correspond to different regions of the scene, and the element value of each element indicates how interested the deep neural network is in the region corresponding to that element. In this case, the historical attention map includes a weight corresponding to each element of the first feature map; in other words, the historical attention map includes weights corresponding to the different regions of the first feature map.
Optionally, in one implementation of step 120, the weights in the historical attention map corresponding to the elements of the first feature map of the current frame may be determined according to the target distribution in the detection result of at least one frame preceding the current frame.
For example, in the historical attention map, the weight corresponding to a region that contains a target in the detection result of at least one preceding frame is greater than the weight corresponding to a region that does not contain a target.
In a vehicle detection scenario, once it is known which regions contain targets such as vehicles, the weights corresponding to the regions where vehicles are located are increased in the historical attention map, and the weights corresponding to regions that do not contain vehicles are decreased. The historical attention map is then used to perform the attention transfer on the first feature map, guiding the deep neural network to focus on the regions that may contain vehicles and to pay less attention to other regions. In this way, the probability of detecting targets can be increased while the probability of false detections in the detection results is reduced.
For another example, different types of targets correspond to different weights in the historical attention map, where the different types of targets may be, for example, vehicles, persons and environmental information.
The weights corresponding to different types of targets in the historical attention map may, for example, be determined according to the degree of interest in the different types of targets: the region containing the type of target of interest is assigned a relatively large weight. For instance, if vehicles are of most interest, persons next, and environmental information of least interest, then when the historical attention map is generated, a larger value may be assigned to the weight corresponding to the region where a vehicle is located, an intermediate value to the weight corresponding to the region where a person is located, and a smaller value to the weight corresponding to the region occupied by the environment.
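As a hedged illustration of this hand-set variant, the sketch below builds a historical attention map from the previous frame's detections, with a larger weight for detected regions and class-dependent weights. The weight values, class names and box format (feature-map grid cells) are assumptions made for the example, not values prescribed by the patent.

```python
import numpy as np

CLASS_WEIGHTS = {"vehicle": 1.0, "person": 0.8, "environment": 0.3}
BACKGROUND_WEIGHT = 0.1   # regions that contained no target in the previous frame

def build_attention_map(prev_detections, grid_shape):
    """prev_detections: list of (class_name, (row0, col0, row1, col1)) boxes
    expressed in feature-map grid cells; grid_shape: (H, W) of the feature map."""
    attn = np.full(grid_shape, BACKGROUND_WEIGHT, dtype=np.float32)
    for cls, (r0, c0, r1, c1) in prev_detections:
        weight = CLASS_WEIGHTS.get(cls, BACKGROUND_WEIGHT)
        # Keep the largest weight where boxes of different classes overlap.
        attn[r0:r1, c0:c1] = np.maximum(attn[r0:r1, c0:c1], weight)
    return attn

# Example: one vehicle and one person were detected in frame N-1.
prev = [("vehicle", (10, 12, 14, 18)), ("person", (30, 5, 32, 7))]
attention_map = build_attention_map(prev, grid_shape=(200, 200))
```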
In addition to setting the weights in the historical attention map corresponding to the elements of the first feature map by hand according to the target distribution, the weights corresponding to different types of targets in the historical attention map may also be obtained by deep learning based on the detection result of at least one preceding frame; that is, the weights in the historical attention map corresponding to the elements of the first feature map are learned from the detection result of at least one preceding frame using several layers of stacked convolutions.
Of these two approaches, setting the weights of the historical attention map by hand makes it convenient to adjust the relative sizes of the weights for different situations, while determining the weights of the historical attention map by deep learning allows the training process to automatically optimize the weight-generation process, which helps improve the deep neural network's optimization of the detection results. In practical applications, a suitable approach can be chosen according to the situation to obtain the historical attention map. In addition, the historical attention map may also be obtained in other ways, which is not limited in the present application.
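For the learned variant, a minimal sketch is given below: a few stacked convolutions map a rasterized encoding of the previous frame's detections to a weight map with values in (0, 1). The input encoding (one channel per target type) and the layer sizes are assumptions made for the example, not components specified by the patent.

```python
import torch
import torch.nn as nn

# Sketch of learning the historical attention map from the previous frame's
# detections with a few stacked convolutions.
class AttentionMapGenerator(nn.Module):
    def __init__(self, num_target_types=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_target_types, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),            # one weight in (0, 1) per region
        )

    def forward(self, prev_detection_masks):   # (B, num_types, H, W)
        return self.net(prev_detection_masks)  # (B, 1, H, W) attention map
```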
In step 130, the first feature map is processed according to the historical attention map to obtain a second feature map after the attention transfer.
After the historical attention map is obtained, the first feature map of the current frame is processed according to the historical attention map, so as to obtain the second feature map after the attention has been transferred.
Optionally, in a possible implementation of step 130, a convolution operation may be performed on the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map, to obtain the second feature map.
It should be understood that the first feature map of the current frame is the feature map before the attention transfer, and the second feature map of the current frame is the feature map after the attention transfer. In the deep neural network of the embodiments of the present application, the target detection result of the current frame is obtained based on the second feature map.
When the first feature map is processed according to the historical attention map, for example, the element value of each element of the first feature map of the current frame may be multiplied by the weight in the historical attention map corresponding to that element, to obtain the element value of each element of the second feature map after the attention transfer.
FIG. 2 is taken as an example for illustration. FIG. 2 shows the first feature map of the current frame, the historical attention map, and the second feature map of the current frame after the attention transfer. The historical attention map may be obtained based on the detection results of the frame or several frames preceding the current frame, where a region that contains a target in those detection results has a relatively large corresponding weight in the historical attention map, and a region that contains no target has a relatively small corresponding weight.
As shown in FIG. 2, taking a 3×3 feature map as an example, the first feature map includes multiple elements corresponding to multiple regions of the scene, where the element value of each element indicates how interested the deep neural network is in the region corresponding to that element. The weights in the attention map corresponding to the elements of the first feature map range between 0 and 1; a larger value indicates that the deep neural network should concentrate its attention on the corresponding region, which is a region determined from the detection result of at least one preceding frame as possibly containing a target. Multiplying the element value of each element of the first feature map by the element value at the corresponding position of the historical attention map gives the element value of each element of the second feature map. The mathematical operation of element-wise multiplication is used here to perform the attention transfer. As can be seen from FIG. 2, after the attention transfer, the attention of the deep neural network moves from row 2, column 3 of the first feature map to row 1, column 1 and row 3, column 3 of the second feature map. It can also be seen that the differences between the element values of the second feature map after the attention transfer become larger, indicating that the deep neural network's attention to the current frame has become more concentrated.
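The element-wise multiplication described above can be written in a few lines; the 3×3 numbers below are invented for illustration and are not the actual values shown in FIG. 2.

```python
import numpy as np

def attention_transfer(first_feature_map, attention_map):
    # Broadcasting also lets an (H, W) attention map scale a (C, H, W) feature map.
    return first_feature_map * attention_map

first = np.array([[0.4, 0.3, 0.1],
                  [0.4, 0.5, 0.9],    # the network initially attends most to row 2, column 3
                  [0.3, 0.2, 0.4]])
attn  = np.array([[0.9, 0.1, 0.1],
                  [0.1, 0.1, 0.1],    # history says row 1, column 1 and row 3, column 3 matter
                  [0.1, 0.1, 0.9]])
second = attention_transfer(first, attn)
# The largest values of `second` now sit at row 1, column 1 and row 3, column 3,
# matching the attention shift described for FIG. 2.
```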
In step 140, the detection result of the current frame is determined according to the second feature map.
In this embodiment, the historical attention map is obtained according to the detection result of at least one preceding frame, and attention transfer is performed on the first feature map of the current frame according to that historical attention map to obtain the second feature map of the current frame, so that the target detection result of the current frame is determined according to the second feature map.
Since an autonomous vehicle operates in a continuously changing environment, other vehicles or objects around the vehicle do not suddenly disappear or appear. Therefore, when the vehicle travels at a normal speed, adjacent point cloud frames produced by a high-rate lidar, for example one sampling at 10 Hz, are highly similar, and the target position information in the previous frame or the previous several frames provides a strong reference for target detection in the following frame. By using historical detection information, the detection performance of target detection can be improved and the occurrence of inter-frame anomalies can be reduced.
The method is applicable to any target detection scenario; in particular, it is suitable for processing the point cloud frames produced by the new type of non-repetitive-scanning lidar, in which case the detection device used to produce the multiple point cloud frames follows different scan trajectories in two adjacent point cloud frames.
Optionally, in one implementation, the method 100 further includes: updating the cached historical attention map according to the detection result of the current frame, for use in processing the feature map of the frame following the current frame.
That is to say, after the historical attention map is used to process the first feature map of the current frame to obtain the second feature map, the detection result of the current frame obtained from the second feature map is also used to update the historical attention map; the updated historical attention map is determined from the detection result of the current frame and is used to perform the attention transfer processing on the feature map of the frame following the current frame.
For example, in the target detection flowchart shown in FIG. 3, the detection result of the N-th frame is affected by the detection results of the preceding frames, so using a historical attention map is an effective way to introduce historical detection information. Here it is assumed that the attention transfer for the feature map of the N-th frame is based on the detection result of the (N-1)-th frame. During target detection on the N-th frame, it is determined whether the region corresponding to each weight in the historical attention map contains a target of interest, such as a vehicle, in the detection result of the (N-1)-th frame; if it does, the weight corresponding to that region in the historical attention map is set to a larger value, and if it does not, the weight corresponding to that region is set to a smaller value. In this way, the historical attention map needed for detecting the N-th frame is obtained from the detection result of the (N-1)-th frame.
As shown in FIG. 3, in step 301, the N-th point cloud frame is acquired; in step 302, the historical attention map is obtained according to the detection result of the (N-1)-th point cloud frame; in step 303, the data of the N-th point cloud frame and the historical attention map obtained from the detection result of the (N-1)-th frame are input into the deep neural network used for target detection; in step 304, in the intermediate stage of the deep neural network's detection of the N-th point cloud frame, the attention transfer is completed according to the historical attention map corresponding to the (N-1)-th frame and the historical detection information is introduced, so that the detection result of the N-th point cloud frame aided by the historical detection information is obtained; in step 305, the historical attention map is updated according to the detection result of the N-th point cloud frame, in preparation for processing the (N+1)-th point cloud frame.
In FIG. 3, the historical attention map used when detecting the N-th frame is generated from the detection result of the (N-1)-th frame; if the (N-1)-th frame does not exist, a historical attention map in which the weights corresponding to all regions are 0.5 may be used.
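Put together, the per-frame loop of FIG. 3 (steps 301 to 305) might look like the sketch below; `point_cloud_stream` and `detector` are assumed stand-ins for the lidar interface and the detection network, and `build_attention_map` could be a helper such as the one sketched earlier. None of these names are APIs defined by the patent.

```python
import numpy as np

def run_detection(point_cloud_stream, detector, build_attention_map, grid_shape=(200, 200)):
    # No (N-1)-th frame yet: start from a uniform map with all weights set to 0.5.
    attention_cache = np.full(grid_shape, 0.5, dtype=np.float32)
    for frame in point_cloud_stream:                     # step 301
        detections = detector(frame, attention_cache)    # steps 303-304: detect with attention transfer
        # Steps 305 / 302 for the next frame: rebuild the cached map from the
        # detections of the frame just processed.
        attention_cache = build_attention_map(detections, grid_shape)
        yield detections
```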
In the embodiments of the present application, the approach of introducing historical detection information through the attention transfer mechanism can be combined with an existing deep neural network used for target detection; the process can include two parts, training and inference. In this way, relatively few modifications are needed to give the deep neural network the ability to use historical detection information.
Because historical detection information is introduced, the deep neural network to which the attention transfer mechanism has been added needs to be retrained. During training, a point cloud frame can be drawn at random from the training data, with the frame preceding it serving as the basis for generating the historical attention map; the historical attention map is obtained by the method described above, the attention transfer for the drawn frame is completed inside the deep neural network, and the output of the deep neural network is obtained. Usually, the attention transfer can be performed in the latter part of the deep neural network: the feature map of the frame is extracted from the deep neural network, processed according to the historical attention map by the method described above to obtain the feature map after the attention transfer with the historical detection information introduced, and then fed back into the deep neural network to obtain the final detection result. Finally, the network loss function (loss) can be optimized, for example with a gradient descent optimizer, to complete the training process.
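A rough sketch of one such training iteration is given below. The module and data names (backbone, head, attn_gen, loss_fn, and a dataset of consecutive frames with rasterized previous-frame detection masks) are assumptions made to keep the example self-contained; they are not components named by the patent.

```python
import random

def train_step(dataset, backbone, head, attn_gen, loss_fn, optimizer):
    n = random.randrange(1, len(dataset))      # randomly drawn frame index (needs a predecessor)
    bev, prev_masks, target = dataset[n]       # frame N, masks rasterized from frame N-1, labels for N
    attn = attn_gen(prev_masks)                # historical attention map from the previous frame
    feat = backbone(bev)                       # first feature map of frame N
    feat = feat * attn                         # attention transfer in the latter part of the network
    pred = head(feat)                          # final detection output for frame N
    loss = loss_fn(pred, target)               # network loss
    optimizer.zero_grad()
    loss.backward()                            # optimize the loss, e.g. with a gradient descent optimizer
    optimizer.step()
    return loss.item()
```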
Once the deep neural network used for target detection has been trained, inference can be performed. At this point the lidar continuously provides consecutive point cloud frames, and the deep neural network uses the historical detection information to give real-time detection results. In implementation, for example, a matrix cache of the attention map can be maintained; whenever the detection of a frame is completed, the detection result of that frame is used to generate a new historical attention map, and the matrix cache is updated with it. The new historical attention map then guides the attention transfer of the deep neural network and introduces the historical detection information when the next frame is detected.
In this way, with relatively simple modifications to the deep neural network, historical detection information can be introduced without a significant increase in the amount of computation, thereby improving the accuracy of target detection and the robustness of the detection results.
FIG. 4 shows the detection result of an existing deep neural network without historical detection information, and FIG. 5 shows the detection result of the deep neural network of the present application with historical detection information added; the circled objects are the detected vehicles. Especially for a non-repetitive-scanning lidar, because the scan trajectories in two adjacent point cloud frames are different, targets are easily missed. The vehicle at the top of FIG. 5 is the vehicle missed in FIG. 4. It can be seen that when the solution of the present application is used for vehicle detection, after historical detection information is introduced while processing the feature map of the point cloud frame, the missed detection of distant vehicles is clearly improved, and the vehicle detection results are improved overall.
FIG. 6 is a schematic structural diagram of a device for target detection provided by an embodiment of the present application. Specifically, the device 600 includes a memory 601, a processor 602, and a data interface 603.
The memory 601 may include a volatile memory; the memory 601 may also include a non-volatile memory; the memory 601 may also include a combination of the above types of memory. The processor 602 may be a central processing unit (CPU). The processor 602 may further include a hardware target detection device. The above hardware target detection device may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination of the two, for example a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or any combination thereof.
In one implementation, the memory 601 is used to store a program, and when the program is executed, the processor 602 can call the program stored in the memory 601 to perform the following steps:
acquiring a first feature map of a current frame in a plurality of consecutive point cloud frames;
determining a historical attention map according to a detection result of at least one frame preceding the current frame;
processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer; and
determining a detection result of the current frame according to the second feature map.
In one implementation, different elements of the feature map correspond to different regions of the scene, where the element value of each element indicates how interested the deep neural network is in the region corresponding to that element, and the attention map includes a weight corresponding to each element. Processing the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map includes: performing a convolution operation on the first feature map according to the historical attention map.
In one implementation, performing the convolution operation on the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map includes: multiplying the element value of each element of the first feature map of the current frame by the weight in the historical attention map corresponding to that element, to obtain the element value of each element of the second feature map after the attention transfer.
In one implementation, determining the historical attention map according to the detection result of at least one frame preceding the current frame includes: determining the weights in the historical attention map corresponding to the elements of the first feature map according to the target distribution in the detection result of the at least one preceding frame.
In one implementation, the weight in the historical attention map corresponding to a region that contains a target in the detection result of the at least one preceding frame is greater than the weight in the attention map corresponding to a region that does not contain a target.
In one implementation, different types of targets correspond to different weights in the attention map.
In one implementation, the weights corresponding to different types of targets in the attention map are determined according to the degree of interest in the different types of targets.
In one implementation, the weights corresponding to different types of targets in the attention map are obtained by deep learning based on the detection result of the previous frame.
In one implementation, the different types of targets include vehicles, persons and environmental information.
In one implementation, the detection device used to produce the multiple point cloud frames follows different scan trajectories in two adjacent point cloud frames.
In one implementation, the processor 602 is further configured to perform: updating the cached historical attention map according to the detection result of the current frame, for use in processing the feature map of the frame following the current frame.
In the embodiments of the present application, when the device for target detection processes the current frame among multiple point cloud frames, the detection result of at least one preceding frame is used. Therefore, by efficiently using historical detection information to guide the target detection process of the current frame, the accuracy of target detection and the robustness of the detection results can be significantly improved.
An embodiment of the present application also provides a vehicle-mounted radar, including the device for target detection described with reference to FIG. 6.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the target detection method described with reference to FIG. 1 of the embodiments of the present application, and can also implement the device for target detection described with reference to FIG. 6, which is not repeated here.
The computer-readable storage medium may be an internal storage unit of the device described in any of the foregoing embodiments, such as a hard disk or memory of the device. The computer-readable storage medium may also be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the device. Further, the computer-readable storage medium may include both an internal storage unit of the device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the device. The computer-readable storage medium may also be used to temporarily store data that has been or will be output.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be accomplished by a computer program instructing the relevant hardware, and the program can be stored in a computer-readable storage medium. When the program is executed, it may include the flows of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is only part of the embodiments of the present application and certainly cannot be used to limit the scope of the rights of the present application; equivalent changes made according to the claims of the present application therefore still fall within the scope covered by the present application.

Claims (24)

  1. A target detection method, comprising:
    acquiring a first feature map of a current frame in a plurality of consecutive point cloud frames;
    determining a historical attention map according to a detection result of at least one frame preceding the current frame;
    processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer; and
    determining a detection result of the current frame according to the second feature map.
  2. The method according to claim 1, wherein different elements of the first feature map correspond to different regions of a scene, an element value of each element indicating a degree of interest of a deep neural network in the region corresponding to the element, and the historical attention map comprises a weight corresponding to each element,
    wherein processing the first feature map according to the historical attention map comprises:
    performing a convolution operation on the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map.
  3. The method according to claim 2, wherein performing the convolution operation on the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map comprises:
    multiplying the element value of each element of the first feature map of the current frame by the weight in the historical attention map corresponding to the element, to obtain an element value of each element of the second feature map after the attention transfer.
  4. The method according to claim 2 or 3, wherein determining the historical attention map according to the detection result of at least one frame preceding the current frame comprises:
    determining the weights in the historical attention map corresponding to the elements of the first feature map according to a target distribution in the detection result of the at least one preceding frame.
  5. The method according to claim 4, wherein a weight in the historical attention map corresponding to a region that contains a target in the detection result of the at least one preceding frame is greater than a weight in the historical attention map corresponding to a region that does not contain a target.
  6. The method according to claim 5, wherein different types of targets correspond to different weights in the historical attention map.
  7. The method according to claim 6, wherein the weights corresponding to different types of targets in the historical attention map are determined according to a degree of interest in the different types of targets.
  8. The method according to claim 6, wherein the weights corresponding to different types of targets in the historical attention map are obtained by deep learning according to the detection result of the at least one preceding frame.
  9. The method according to any one of claims 6 to 8, wherein the different types of targets comprise vehicles, persons and environmental information.
  10. The method according to any one of claims 1 to 9, wherein a detection device used to produce the multiple point cloud frames follows different scan trajectories in two adjacent point cloud frames.
  11. The method according to any one of claims 1 to 10, further comprising:
    updating the cached historical attention map according to the detection result of the current frame, for use in processing a feature map of a frame following the current frame.
  12. A device for target detection, comprising:
    a memory configured to store a program; and
    a processor configured to call the program, wherein, when the program is executed, the processor is configured to perform the following operations:
    acquiring a first feature map of a current frame in a plurality of consecutive point cloud frames;
    determining a historical attention map according to a detection result of at least one frame preceding the current frame;
    processing the first feature map according to the historical attention map to obtain a second feature map after attention transfer; and
    determining a detection result of the current frame according to the second feature map.
  13. The device according to claim 12, wherein different elements of the feature map correspond to different regions of a scene, an element value of each element indicating a degree of interest of a deep neural network in the region corresponding to the element, and the attention map comprises a weight corresponding to each element,
    wherein processing the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map comprises:
    performing a convolution operation on the first feature map according to the historical attention map.
  14. The device according to claim 13, wherein performing the convolution operation on the first feature map according to the weights in the historical attention map corresponding to the elements of the first feature map comprises:
    multiplying the element value of each element of the first feature map of the current frame by the weight in the historical attention map corresponding to the element, to obtain an element value of each element of the second feature map after the attention transfer.
  15. The device according to claim 14, wherein determining the historical attention map according to the detection result of at least one frame preceding the current frame comprises:
    determining the weights in the historical attention map corresponding to the elements of the first feature map according to a target distribution in the detection result of the at least one preceding frame.
  16. The device according to claim 15, wherein a weight in the historical attention map corresponding to a region that contains a target in the detection result of the at least one preceding frame is greater than a weight in the attention map corresponding to a region that does not contain a target.
  17. The device according to claim 16, wherein different types of targets correspond to different weights in the attention map.
  18. The device according to claim 17, wherein the weights corresponding to different types of targets in the attention map are determined according to a degree of interest in the different types of targets.
  19. The device according to claim 17, wherein the weights corresponding to different types of targets in the attention map are obtained by deep learning according to the detection result of the previous frame.
  20. The device according to any one of claims 17 to 19, wherein the different types of targets comprise vehicles, persons and environmental information.
  21. The device according to any one of claims 12 to 19, wherein a detection device used to produce the multiple point cloud frames follows different scan trajectories in two adjacent point cloud frames.
  22. The device according to any one of claims 12 to 19, wherein the processor is further configured to perform:
    updating the cached historical attention map according to the detection result of the current frame, for use in processing a feature map of a frame following the current frame.
  23. A vehicle-mounted radar, comprising the device for target detection according to any one of claims 12 to 22.
  24. A computer-readable storage medium storing a computer program, wherein, when executed by a processor, the computer program implements the method according to any one of claims 1 to 11.
PCT/CN2020/109879 2020-08-18 2020-08-18 Target detection method and device, and vehicle-mounted radar WO2022036567A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080006536.8A CN114450720A (en) 2020-08-18 2020-08-18 Target detection method and device and vehicle-mounted radar
PCT/CN2020/109879 WO2022036567A1 (en) 2020-08-18 2020-08-18 Target detection method and device, and vehicle-mounted radar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/109879 WO2022036567A1 (en) 2020-08-18 2020-08-18 Target detection method and device, and vehicle-mounted radar

Publications (1)

Publication Number Publication Date
WO2022036567A1 (en) 2022-02-24

Family

ID=80322389

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/109879 WO2022036567A1 (en) 2020-08-18 2020-08-18 Target detection method and device, and vehicle-mounted radar

Country Status (2)

Country Link
CN (1) CN114450720A (en)
WO (1) WO2022036567A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120328161A1 (en) * 2011-06-22 2012-12-27 Palenychka Roman Method and multi-scale attention system for spatiotemporal change determination and object detection
CN108171141A (en) * 2017-12-25 2018-06-15 淮阴工学院 The video target tracking method of cascade multi-pattern Fusion based on attention model
CN108509949A (en) * 2018-02-05 2018-09-07 杭州电子科技大学 Object detection method based on attention map
CN109740416A (en) * 2018-11-19 2019-05-10 深圳市华尊科技股份有限公司 Method for tracking target and Related product
CN110287826A (en) * 2019-06-11 2019-09-27 北京工业大学 A kind of video object detection method based on attention mechanism
CN111259940A (en) * 2020-01-10 2020-06-09 杭州电子科技大学 Target detection method based on space attention map

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882024A (en) * 2022-07-07 2022-08-09 深圳市信润富联数字科技有限公司 Target object defect detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114450720A (en) 2022-05-06

Similar Documents

Publication Publication Date Title
Gerdzhev et al. Tornado-net: multiview total variation semantic segmentation with diamond inception module
KR102141163B1 (en) Neural network learning method and apparatus for generating synthetic aperture radar image
US10509987B1 (en) Learning method and learning device for object detector based on reconfigurable network for optimizing customers' requirements such as key performance index using target object estimating network and target object merging network, and testing method and testing device using the same
CN109964237A (en) Picture depth prediction neural network
CN113284054A (en) Image enhancement method and image enhancement device
US11657475B2 (en) Machine learned registration and multi-modal regression
US20220262002A1 (en) Feedbackward decoder for parameter efficient semantic image segmentation
EP3686785A1 (en) Learning method and learning device for fluctuation-robust object detector based on cnn using target object estimating network and testing method and testing device using the same
CN111914997A (en) Method for training neural network, image processing method and device
CN110415280B (en) Remote sensing image and building vector registration method and system under multitask CNN model
CN112215332A (en) Searching method of neural network structure, image processing method and device
CN112287824A (en) Binocular vision-based three-dimensional target detection method, device and system
WO2022036567A1 (en) Target detection method and device, and vehicle-mounted radar
CN116486288A (en) Aerial target counting and detecting method based on lightweight density estimation network
CN113610087A (en) Image small target detection method based on prior super-resolution and storage medium
CN112734931A (en) Method and system for assisting point cloud target detection
KR20220089602A (en) Method and apparatus for learning variable CNN based on non-correcting wide-angle image
JP6992099B2 (en) Information processing device, vehicle, vehicle control method, program, information processing server, information processing method
CN112132780A (en) Reinforcing steel bar quantity detection method and system based on deep neural network
CN110880003A (en) Image matching method and device, storage medium and automobile
CN115346184A (en) Lane information detection method, terminal and computer storage medium
CN114998630A (en) Ground-to-air image registration method from coarse to fine
EP3736730A1 (en) Convolutional neural network with reduced complexity
CN113963178A (en) Method, device, equipment and medium for detecting infrared dim and small target under ground-air background
CN112967399A (en) Three-dimensional time sequence image generation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20949779

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20949779

Country of ref document: EP

Kind code of ref document: A1