WO2023213083A1 - Target detection method, device and unmanned vehicle - Google Patents

Target detection method, device and unmanned vehicle

Info

Publication number
WO2023213083A1
WO2023213083A1 (PCT/CN2022/140352)
Authority
WO
WIPO (PCT)
Prior art keywords
detection model
point cloud
data
processed
detection
Prior art date
Application number
PCT/CN2022/140352
Other languages
English (en)
French (fr)
Inventor
王丹
刘浩
徐卓然
张宝丰
王冠
Original Assignee
北京京东乾石科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东乾石科技有限公司
Publication of WO2023213083A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to a target detection method, device and unmanned vehicle.
  • Object detection is an important task in autonomous driving. For example, when a vehicle is driving on the road, it needs to detect obstacles from the data collected by its sensors, and perform autonomous control and path planning based on the detection results. Because of the limited computing power available at the vehicle end, the overall detection framework needs to be designed carefully to achieve the highest possible accuracy under that constraint.
  • a target detection method is provided, including: acquiring sensor data to be processed, wherein the sensor data to be processed includes point cloud data; determining a detection model to be enabled depending on whether the sensor data to be processed also includes image data corresponding to the point cloud data, wherein the detection model includes a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and processing the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of the target to be identified.
  • determining the detection model to be enabled based on whether the sensor data to be processed also includes image data corresponding to the point cloud data includes: when the sensor data to be processed does not include image data corresponding to the point cloud data, using the first detection model as the detection model to be enabled; and when the sensor data to be processed includes image data corresponding to the point cloud data, using the second detection model as the detection model to be enabled.
  • using the first detection model as the detection model to be enabled includes: when the sensor data to be processed does not include image data, or when the timestamps of the image data and the point cloud data included in the sensor data to be processed are inconsistent, using the first detection model as the detection model to be enabled.
  • using the second detection model as the detection model to be enabled includes: when the timestamps of the image data and the point cloud data included in the sensor data to be processed are consistent, using the second detection model as the detection model to be enabled.
  • when the model to be enabled is the first detection model, processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified includes: performing feature encoding on the point cloud data to obtain a first feature map; and inputting the first feature map into the first detection model to obtain the detection result of the target to be identified.
  • performing feature encoding on the point cloud data to obtain a point cloud feature map includes: performing voxelization encoding on the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map according to the voxel feature map; and inputting the bird's-eye-view feature map into a point cloud feature extraction network model to obtain the point cloud feature map.
  • when the model to be enabled is the second detection model, processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified includes: performing feature encoding on the point cloud data to obtain a first feature map; performing feature encoding on the image data to obtain a second feature map; fusing the first feature map and the second feature map to obtain a fused feature map; and inputting the fused feature map into the second detection model to obtain the detection result of the target to be identified.
  • performing feature encoding on the image data to obtain the second feature map includes: performing semantic segmentation on the image data to obtain semantic information of each pixel in the image data; determining, according to the semantic information of each pixel in the image data and the coordinate-system conversion relationship, the semantic information of the point cloud point corresponding to each pixel; and performing feature encoding on the semantic information of the point cloud points to obtain the second feature map.
  • performing feature encoding on the semantic information of the point cloud points to obtain the second feature map includes: performing voxelization encoding on the semantic information of the point cloud points to obtain a voxel feature map; generating a bird's-eye-view feature map based on the voxel feature map; and downsampling the bird's-eye-view feature map to obtain the second feature map, where the size of the second feature map is consistent with that of the first feature map.
  • a target detection device is provided, including: an acquisition module configured to acquire sensor data to be processed, wherein the sensor data to be processed includes point cloud data; a determination module configured to determine a detection model to be enabled according to whether the sensor data to be processed includes image data corresponding to the point cloud data, wherein the detection model includes a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and a detection module configured to process the sensor data to be processed based on the detection model to be enabled, to obtain the detection result of the target to be identified.
  • the determining module is configured to: use the first detection model as the detection model to be enabled if the sensor data to be processed does not include image data corresponding to the point cloud data; and use the second detection model as the detection model to be enabled if the sensor data to be processed includes image data corresponding to the point cloud data.
  • the determination module is configured to: when the sensor data to be processed does not include image data, or when the timestamps of the image data and point cloud data included in the sensor data to be processed are inconsistent, use the first detection model as the detection model to be enabled.
  • a target detection device including: a memory; and a processor coupled to the memory, the processor being configured to execute the above target detection method based on instructions stored in the memory.
  • a computer-readable storage medium is also proposed, on which computer program instructions are stored; when the instructions are executed by a processor, the above-mentioned target detection method is implemented.
  • an unmanned vehicle including the above-mentioned target detection device.
  • a computer program including: instructions that, when executed by a processor, cause the processor to perform the above-mentioned target detection method.
  • Figure 1 is a schematic flowchart of a target detection method according to some embodiments of the present disclosure.
  • Figure 2 is a schematic flowchart of determining a detection model to be enabled according to some embodiments of the present disclosure.
  • Figure 3 is a schematic flowchart of target detection based on the first detection model according to some embodiments of the present disclosure.
  • Figure 4 is a schematic flowchart of target detection based on the second detection model according to some embodiments of the present disclosure.
  • Figure 5 is a schematic structural diagram of a target detection device according to some embodiments of the present disclosure.
  • Figure 6 is a schematic structural diagram of a target detection device according to other embodiments of the present disclosure.
  • Figure 7 is a schematic structural diagram of a computer system according to some embodiments of the present disclosure.
  • Figure 8 is a schematic structural diagram of an autonomous vehicle according to some embodiments of the present disclosure.
  • any specific values are to be construed as illustrative only and not as limiting. Accordingly, other examples of the exemplary embodiments may have different values.
  • a technical problem to be solved by this disclosure is to provide a solution that can improve the accuracy and efficiency of target detection and improve the safety of unmanned driving.
  • Figure 1 is a schematic flowchart of a target detection method according to some embodiments of the present disclosure. As shown in Figure 1, the target detection method according to the embodiment of the present disclosure includes:
  • Step S110 Obtain sensor data to be processed.
  • the object detection method is performed by an object detection device.
  • the target detection device can be installed in a vehicle-mounted electronic device or in a server that controls vehicle driving.
  • the target detection device regularly acquires sensor data to be processed.
  • the target detection device regularly pulls sensor data to be processed from external modules.
  • the target detection device acquires sensor data to be processed in response to a request from an external module. For example, the target detection device receives a detection request sent by an external module and obtains sensor data to be processed according to the detection request.
  • point cloud data and image data are collected by sensors such as vehicle-mounted radar and cameras, and target detection is performed based on the collected sensor data. In actual scenarios, bandwidth delays or problems with the sensors themselves can cause the image and point cloud data to arrive at different times, or the image data to be missing, so the acquired sensor data to be processed may contain only point cloud data, only image data, or both point cloud data and image data.
  • Step S120 Determine the detection model to be enabled based on whether the sensor data to be processed includes image data corresponding to the point cloud data.
  • the detection model includes a first detection model and a second detection model.
  • the first detection model is trained based on point cloud sample data
  • the second detection model is trained based on point cloud sample data and image sample data.
  • the sensor data to be processed includes point cloud data.
  • when the sensor data to be processed does not include image data corresponding to the point cloud data, the first detection model is used as the detection model to be enabled; when the sensor data to be processed also includes image data corresponding to the point cloud data, the second detection model is used as the detection model to be enabled.
  • when the sensor data to be processed is only image data, the detection data is considered abnormal, and an abnormality prompt is issued or the abnormality is recorded.
  • Step S130 Process the sensor data to be processed based on the detection model to be activated to obtain the detection result of the target to be identified.
  • when the detection model to be enabled is the first detection model, the point cloud data to be processed is processed based on the first detection model to obtain the detection result of the target to be identified; when the detection model to be enabled is the second detection model, the point cloud data and image data to be processed are processed based on the second detection model to obtain the detection result of the target to be identified.
  • the targets to be identified are obstacles, traffic lights, etc. in the vehicle driving environment.
  • through the above steps, a detection model trained on the same type of sample data as the actually acquired sensor data can be selected for target detection. This improves the accuracy and efficiency of the target detection results and avoids the reduced detection accuracy, or even failed detection, and reduced detection efficiency caused by a mismatch between the sensor data to be processed and the sample data used to train the detection model, which in turn helps improve the safety of autonomous driving.
  • Figure 2 is a schematic flowchart of determining a detection model to be enabled according to some embodiments of the present disclosure. As shown in Figure 2, the process of determining the detection model to be enabled in this embodiment of the present disclosure includes:
  • Step S121 Determine the type of sensor data to be processed.
  • the sensor data to be processed includes at least one of point cloud data and image data.
  • the type of sensor data to be processed is determined according to the input channel through which it arrives. For example, when sensor data to be processed is received from the first input channel, it is determined to be point cloud data; when it is received from the second input channel, it is determined to be image data; and when it is received from both the first and second input channels, it is determined to be point cloud data and image data.
  • the type of sensor data to be processed is determined based on the type identifier carried by the data. For example, when sensor data carrying a first type identifier is received, it is confirmed to be point cloud data; when sensor data carrying a second type identifier is received, it is determined to be image data; and when sensor data carrying both the first and second type identifiers is received, it is confirmed to be point cloud data and image data, as sketched below.
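  • a minimal Python sketch of the type-identifier variant; the flag values and function name are hypothetical, chosen only to make the dispatch logic explicit:

```python
POINT_CLOUD_FLAG = 0b01  # hypothetical first type identifier
IMAGE_FLAG = 0b10        # hypothetical second type identifier

def classify_sensor_data(type_flags: int) -> str:
    """Map the type identifiers carried by a frame of sensor data to its kind."""
    if type_flags == POINT_CLOUD_FLAG | IMAGE_FLAG:
        return "point_cloud_and_image"
    if type_flags == POINT_CLOUD_FLAG:
        return "point_cloud"
    if type_flags == IMAGE_FLAG:
        return "image"
    raise ValueError("unknown sensor data type identifier")
```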
  • Step S122 When the sensor data to be processed includes point cloud data and image data, determine whether the time stamps of the point cloud data and the image data are consistent.
  • the sensor data to be processed carries the timestamp of the point cloud data and the timestamp of the image data.
  • the timestamp of the point cloud data is compared with the timestamp of the image data. When the absolute value of the difference between the two is less than the preset threshold, it is confirmed that the timestamps of the two are consistent. If the absolute value of the difference is greater than or equal to the preset threshold, confirm that the timestamps of the two are inconsistent.
  • the time when the target detection device receives the point cloud data is used as the timestamp of the point cloud data
  • the time when the target detection device receives the image data is used as the timestamp of the image data.
  • the time when the point cloud data is received is compared with the time when the image data is received. When the absolute value of the difference between the two is less than the preset threshold, it is confirmed that the timestamps of the two are consistent. If the absolute value of the difference between the two is greater than or equal to the preset threshold, confirm that the timestamps of the two are inconsistent.
  • if the timestamps of the point cloud data and the image data are inconsistent, step S123 is executed; if the timestamps of the point cloud data and the image data are consistent, step S124 is executed.
  • Step S123 Use the first detection model as the detection model to be activated.
  • the first detection model is a detection model trained based on point cloud sample data.
  • Step S124 Use the second detection model as the detection model to be activated.
  • the second detection model is a detection model trained based on point cloud sample data and image sample data.
  • Step S125 When the sensor data to be processed is point cloud data, use the first detection model as the detection model to be activated.
  • through the above steps, a detection model that better matches the sensor data to be processed can be determined based on the type of the sensor data and on the timestamps of the point cloud data and image data, which helps improve the accuracy and efficiency of subsequent target detection; a code sketch of this selection logic follows.
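  • to make the Figure 2 flow concrete, here is a minimal Python sketch of the selection logic; the function name, the None-for-missing convention, and the 50 ms threshold are illustrative assumptions, not values fixed by the disclosure:

```python
from typing import Optional

TIMESTAMP_THRESHOLD_S = 0.05  # hypothetical preset threshold (50 ms)

def select_detection_model(point_cloud_ts: Optional[float],
                           image_ts: Optional[float]) -> str:
    """Return which detection model to enable for one frame of sensor data.

    Timestamps are in seconds; None means the corresponding data is missing.
    """
    if point_cloud_ts is None:
        # Only image data (or nothing) arrived: the detection data is abnormal,
        # since the flow assumes point cloud data is always present.
        raise ValueError("detection data abnormal: point cloud data missing")
    if image_ts is None:
        # No image data: fall back to the point-cloud-only model (steps S123/S125).
        return "first_detection_model"
    if abs(point_cloud_ts - image_ts) < TIMESTAMP_THRESHOLD_S:
        # Timestamps consistent: enable the fused model (step S124).
        return "second_detection_model"
    # Image data present, but its timestamp is inconsistent with the point cloud.
    return "first_detection_model"
```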
  • Figure 3 is a schematic flowchart of target detection based on the first detection model according to some embodiments of the present disclosure.
  • when the detection model to be enabled is the first detection model, the process shown in Figure 3 is executed.
  • the process of target detection based on the first detection model includes:
  • Step S131 Perform feature encoding on the point cloud data to obtain the first feature map.
  • step S131 includes: voxelizing the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map based on the voxel feature map; and inputting the bird's-eye-view feature map into the point cloud feature extraction network model to obtain the point cloud feature map.
  • the point cloud data is voxelized as follows: each point in the point cloud data is assigned to a voxel unit in a voxel grid, and the points within each voxel unit are feature-encoded to obtain voxel features; the voxel feature map is then determined based on the voxel features.
  • point cloud data can be voxelized and encoded based on the method proposed by the PointPillar model or the VoxelNet model.
  • the voxel feature map is mapped to a bird's-eye view perspective, thereby obtaining a bird's-eye view feature map.
  • a bird's-eye view is a three-dimensional view drawn from a high point looking down at the undulations of the ground based on the principle of perspective and using high-angle perspective.
  • the point cloud feature extraction network model is a two-dimensional convolutional neural network. Input the bird's-eye view feature map into the two-dimensional convolutional neural network to obtain the point cloud feature map.
  • the features of the point cloud data can be quickly and accurately extracted for subsequent target detection.
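  • as a concrete illustration of the voxelization-to-bird's-eye-view encoding above, here is a minimal NumPy sketch; points are assumed to be (x, y, z, intensity) rows, and the grid extents, cell size, and per-pillar statistics (count, mean height, max intensity) are illustrative assumptions rather than the exact encoding of the disclosure — a real PointPillar- or VoxelNet-style encoder learns the per-voxel features instead of computing fixed statistics:

```python
import numpy as np

def voxelize_to_bev(points: np.ndarray,
                    x_range=(0.0, 69.12), y_range=(-39.68, 39.68),
                    cell=0.16) -> np.ndarray:
    """Encode an (N, 4) point cloud [x, y, z, intensity] into a bird's-eye-view
    feature map of shape (3, H, W): per-cell point count, mean z, max intensity."""
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((3, h, w), dtype=np.float32)

    # Keep only points inside the grid, then map coordinates to cell indices.
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[m]
    ix = ((pts[:, 0] - x_range[0]) / cell).astype(np.int64)
    iy = ((pts[:, 1] - y_range[0]) / cell).astype(np.int64)

    np.add.at(bev[0], (iy, ix), 1.0)            # point count per pillar
    np.add.at(bev[1], (iy, ix), pts[:, 2])      # z sum (turned into mean below)
    np.maximum.at(bev[2], (iy, ix), pts[:, 3])  # max intensity per pillar

    occupied = bev[0] > 0
    bev[1][occupied] /= bev[0][occupied]        # mean height per occupied pillar
    return bev  # fed to a 2-D CNN (the point cloud feature extraction network)
```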
  • Step S132 Input the first feature map into the first detection model to obtain the detection result of the target to be recognized.
  • the first detection model is a detection model trained based on point cloud sample data.
  • through the above steps, when the arrival times of multiple kinds of sensor data are inconsistent or the image data is missing, target detection can still be performed quickly and accurately based on the detection model matching the point cloud data. This guarantees the target detection effect when image data is missing, and solves the reduced detection efficiency and detection accuracy caused by inconsistent arrival times of multiple kinds of sensor data or by missing image data during the actual operation of autonomous vehicles.
  • Figure 4 is a schematic flowchart of target detection based on the second detection model according to some embodiments of the present disclosure.
  • when the detection model to be enabled is the second detection model, the process shown in Figure 4 is executed.
  • the process of target detection based on the second detection model includes:
  • Step S131' Perform feature encoding on the point cloud data to obtain the first feature map.
  • step S131' includes: voxelizing the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map based on the voxel feature map; and inputting the bird's-eye-view feature map into the point cloud feature extraction network model to obtain the point cloud feature map.
  • the point cloud data is voxelized as follows: each point in the point cloud data is assigned to a voxel unit in a voxel grid, and the points within each voxel unit are feature-encoded to obtain voxel features; the voxel feature map is then determined based on the voxel features.
  • point cloud data can be voxelized and encoded based on the method proposed by the PointPillar model or the VoxelNet model.
  • the voxel feature map is mapped to a bird's-eye view perspective, thereby obtaining a bird's-eye view feature map.
  • a bird's-eye view is a three-dimensional view drawn from a high point looking down at the undulations of the ground based on the principle of perspective and using high-angle perspective.
  • the point cloud feature extraction network model is a two-dimensional convolutional neural network. Input the bird's-eye view feature map into the two-dimensional convolutional neural network to obtain the point cloud feature map.
  • the features of the point cloud data can be quickly and accurately extracted for subsequent target detection.
  • Step S132' Perform feature encoding on the image data to obtain a second feature map.
  • step S132' includes: step a, performing semantic segmentation on the image data to obtain the semantic information of each pixel in the image data; step b, determining the semantic information of the point cloud points corresponding to the pixels based on the semantic information of each pixel in the image data and the coordinate-system conversion relationship; and step c, performing feature encoding on the semantic information of the point cloud points to obtain the second feature map.
  • in step a, a two-dimensional image segmentation network, such as MaskRNN, is used to segment the image data to obtain the semantic information of each pixel in the image data.
  • the semantic information of a pixel is the score of the category to which the pixel belongs.
  • in step b, the point cloud data is projected into the image coordinate system according to the transformation between the camera coordinate system and the radar coordinate system, to determine the point cloud point corresponding to each pixel in the image; then, based on the semantic information of each pixel in the image data and the correspondence between pixels and point cloud points, the semantic information of the point cloud point corresponding to each pixel is determined. This aligns the image data with the point cloud data and fuses them on the basis of that alignment, which helps improve the subsequent target detection accuracy; a projection sketch follows.
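  • a minimal NumPy sketch of this projection step, assuming a pinhole camera model; the function and argument names (extrinsic matrix `T_cam_from_lidar`, intrinsics `K`, per-pixel class scores `seg_scores`) are hypothetical stand-ins for the calibration quantities the disclosure refers to:

```python
import numpy as np

def paint_points_with_semantics(points_lidar: np.ndarray,
                                T_cam_from_lidar: np.ndarray,
                                K: np.ndarray,
                                seg_scores: np.ndarray) -> np.ndarray:
    """Attach per-pixel semantic scores to the point cloud points.

    points_lidar: (N, 3) xyz in the radar/lidar frame.
    T_cam_from_lidar: (4, 4) extrinsic transform, lidar -> camera.
    K: (3, 3) camera intrinsics.
    seg_scores: (H, W, C) per-pixel class scores from the segmentation network.
    Returns (N, C) semantic scores; points projecting outside the image get zeros.
    """
    n = points_lidar.shape[0]
    h, w, c = seg_scores.shape

    # Transform to the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]

    # Pinhole projection: u = fx*x/z + cx, v = fy*y/z + cy.
    uvw = (K @ cam.T).T
    z = uvw[:, 2]
    u = np.round(uvw[:, 0] / np.maximum(z, 1e-6)).astype(np.int64)
    v = np.round(uvw[:, 1] / np.maximum(z, 1e-6)).astype(np.int64)

    # Keep points in front of the camera that land inside the image.
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out = np.zeros((n, c), dtype=seg_scores.dtype)
    out[valid] = seg_scores[v[valid], u[valid]]
    return out
```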
  • in step c, the semantic information of the point cloud points is voxelization-encoded to obtain a voxel feature map; a bird's-eye-view feature map is generated based on the voxel feature map; and the bird's-eye-view feature map is downsampled to obtain the second feature map, where the size of the second feature map is consistent with that of the first feature map.
  • the size of the downsampled feature map is consistent with the first feature map, thereby achieving feature alignment and facilitating subsequent feature fusion.
  • Step S133' Fusion of the first feature map and the second feature map to obtain a fused feature map.
  • the first feature map and the second feature map are spliced, and the spliced feature map is used as the fused feature map.
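  • a minimal PyTorch sketch of the alignment-and-splicing step; adaptive average pooling is used here as a stand-in for the downsampling, since the disclosure only specifies that the second feature map is downsampled to the size of the first:

```python
import torch
import torch.nn.functional as F

def fuse_feature_maps(first: torch.Tensor, second: torch.Tensor) -> torch.Tensor:
    """Fuse the point cloud feature map with the semantic BEV feature map.

    first:  (B, C1, H, W)   point cloud feature map.
    second: (B, C2, H2, W2) semantic feature map; downsampled here if its
            spatial size does not already match the first feature map.
    """
    if second.shape[-2:] != first.shape[-2:]:
        # Downsample the semantic BEV map so the two maps are spatially aligned.
        second = F.adaptive_avg_pool2d(second, output_size=first.shape[-2:])
    # "Splicing": concatenate along the channel dimension.
    return torch.cat([first, second], dim=1)
```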
  • Step S134' Input the fused feature map into the second detection model to obtain the detection result of the target to be recognized.
  • the second detection model is a detection model trained based on point cloud sample data and image sample data.
  • the fused feature map is sent to different detection networks, including a detection network for the category to which the target belongs and a detection network for the target location, to obtain a three-dimensional target detection result including the category to which the target belongs and the target location.
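  • as a minimal PyTorch sketch of such parallel detection heads; the 1×1-convolution heads and the seven box parameters (x, y, z, w, l, h, yaw) are common illustrative choices, not details fixed by the disclosure:

```python
import torch
from torch import nn

class DetectionHeads(nn.Module):
    """Two parallel heads over the fused BEV feature map: one scoring the
    category at each location, one regressing 3-D box parameters."""

    def __init__(self, in_channels: int, num_classes: int, box_params: int = 7):
        super().__init__()
        # box_params = (x, y, z, w, l, h, yaw) per location (an assumption).
        self.cls_head = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        self.box_head = nn.Conv2d(in_channels, box_params, kernel_size=1)

    def forward(self, fused: torch.Tensor):
        # Returns (B, num_classes, H, W) category scores and
        # (B, box_params, H, W) box regressions.
        return self.cls_head(fused), self.box_head(fused)
```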
  • through the above steps, when the sensor data to be processed includes point cloud data and image data corresponding to the point cloud data, target detection can be performed efficiently and accurately based on the second detection model, improving the accuracy of target detection within the limits allowed by the vehicle-end computing power.
  • two detection models are thus supported: a first detection model based on point cloud data, and a second detection model based on point cloud data and image data. When the image data in the sensor data to be processed is missing or delayed, the first detection model based on point cloud data is enabled; when the sensor data to be processed includes point cloud data and the corresponding image data, the second detection model based on point cloud and image data is enabled. This guarantees the target detection effect when image data is missing, while maintaining high target detection accuracy when image data corresponding to the point cloud data is available.
  • Figure 5 is a schematic structural diagram of a target detection device according to some embodiments of the present disclosure.
  • the target detection device according to the embodiment of the present disclosure includes: an acquisition module 510 , a determination module 520 , and a detection module 530 .
  • the acquisition module 510 is configured to acquire sensor data to be processed.
  • the application scenario is an autonomous driving scenario
  • the target detection device can be provided in a vehicle-mounted electronic device or in a server that controls vehicle driving.
  • the acquisition module 510 regularly acquires sensor data to be processed.
  • the acquisition module 510 regularly pulls sensor data to be processed from external modules.
  • the acquisition module 510 acquires sensor data to be processed in response to a request from an external module. For example, the acquisition module 510 receives a detection request sent by an external module, and acquires sensor data to be processed according to the detection request.
  • point cloud data and image data are collected by sensors such as vehicle-mounted radar and cameras, and target detection is performed based on the collected sensor data. In actual scenarios, bandwidth delays or problems with the sensors themselves can cause the image and point cloud data to arrive at different times, or the image data to be missing, so the acquired sensor data to be processed may contain only point cloud data, only image data, or both point cloud data and image data.
  • the determination module 520 is configured to determine the detection model to be enabled based on whether the sensor data to be processed includes image data corresponding to the point cloud data.
  • the detection model includes a first detection model and a second detection model.
  • the first detection model is trained based on point cloud sample data
  • the second detection model is trained based on point cloud sample data and image sample data.
  • the sensor data to be processed includes point cloud data.
  • when the sensor data to be processed does not include image data corresponding to the point cloud data, the determination module 520 uses the first detection model as the detection model to be enabled; when the sensor data to be processed also includes image data corresponding to the point cloud data, the determination module 520 uses the second detection model as the detection model to be enabled.
  • the determination module 520 judges whether the sensor data to be processed includes image data corresponding to the point cloud data as follows: if the sensor data to be processed does not include image data, or if the timestamps of the image data and point cloud data it includes are inconsistent, the determination module 520 determines that the sensor data to be processed does not include image data corresponding to the point cloud data; if the timestamps of the image data and point cloud data it includes are consistent, the determination module 520 determines that the sensor data to be processed includes image data corresponding to the point cloud data.
  • when the sensor data to be processed is only image data, the determination module 520 is further configured to confirm that the detection data is abnormal, and to issue an abnormality prompt or record the abnormality.
  • the detection module 530 is configured to process the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified.
  • the detection module 530 processes the point cloud data to be processed based on the first detection model to obtain the detection result of the target to be identified; when the detection model to be activated is When the detection model is the second detection model, the detection module 530 processes the point cloud data and image data to be processed based on the second detection model to obtain the detection result of the target to be identified.
  • the targets to be identified are obstacles, traffic lights, etc. in the vehicle driving environment.
  • through the above device, a detection model trained on the same type of sample data as the actually acquired sensor data can be selected for target detection. This improves the accuracy and efficiency of the target detection results and avoids the reduced detection accuracy, or even failed detection, and reduced detection efficiency caused by a mismatch between the sensor data to be processed and the sample data used to train the detection model, which in turn helps improve the safety of autonomous driving.
  • Figure 6 is a schematic structural diagram of a target detection device according to other embodiments of the present disclosure.
  • the target detection device 600 includes a memory 610; and a processor 620 coupled to the memory 610.
  • the memory 610 is used to store instructions for executing corresponding embodiments of the target detection method.
  • the processor 620 is configured to perform the target detection method in any embodiments of the present disclosure based on instructions stored in the memory 610 .
  • Figure 7 is a schematic structural diagram of a computer system according to some embodiments of the present disclosure.
  • Computer system 700 may be embodied in the form of a general purpose computing device.
  • Computer system 700 includes memory 710, a processor 720, and a bus 730 that connects the various system components.
  • Memory 710 may include, for example, system memory, non-volatile storage media, or the like.
  • System memory stores, for example, operating systems, applications, boot loaders, and other programs.
  • System memory may include volatile storage media such as random access memory (RAM) and/or cache memory.
  • the non-volatile storage medium stores, for example, instructions for performing corresponding embodiments of at least one of the target detection methods.
  • Non-volatile storage media includes but is not limited to disk storage, optical storage, flash memory, etc.
  • Processor 720 may be implemented as a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, or discrete hardware components such as discrete gates or transistors.
  • each module such as the acquisition module, the determination module and the detection module, can be implemented by a central processing unit (CPU) running instructions in a memory that performs corresponding steps, or by a dedicated circuit that performs corresponding steps.
  • Bus 730 may use any of a variety of bus structures.
  • bus structures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, and Peripheral Component Interconnect (PCI) bus.
  • the interfaces 740, 750, 760 of the computer system 700, as well as the memory 710 and the processor 720, may be connected through the bus 730.
  • the input and output interface 740 can provide a connection interface for input and output devices such as a monitor, mouse, and keyboard.
  • Network interface 750 provides connection interfaces for various networked devices.
  • the storage interface 760 provides a connection interface for external storage devices such as floppy disks, USB disks, and SD cards.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable device to produce a machine, such that execution of the instructions by the processor creates means for implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable memory; they cause the computer to operate in a specific manner, producing an article of manufacture that includes instructions implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Figure 8 is a schematic structural diagram of an autonomous vehicle according to some embodiments of the present disclosure. As shown in FIG. 8 , the unmanned vehicle 800 includes a target detection device 810 .
  • the unmanned vehicle 800 also includes a variety of sensors, such as one or more of lidar sensors, millimeter wave sensors, cameras, and other sensors.
  • the unmanned vehicle 800 collects sensor data required for target detection through vehicle-mounted sensors.
  • the target detection device 810 is configured to obtain the sensor data to be processed, determine the detection model to be enabled based on whether the sensor data to be processed also includes image data corresponding to the point cloud data, and process the sensor data to be processed based on the detection model to be enabled, to obtain the detection result of the target to be identified.
  • the target to be identified is an obstacle in the vehicle driving environment, or a traffic light, etc.
  • the detection model includes a first detection model and a second detection model.
  • the first detection model is trained based on point cloud sample data
  • the second detection model is trained based on point cloud sample data and image sample data.
  • the sensor data to be processed includes point cloud data.
  • when the sensor data to be processed does not include image data corresponding to the point cloud data, the first detection model is used as the detection model to be enabled; when the sensor data to be processed also includes image data corresponding to the point cloud data, the second detection model is used as the detection model to be enabled.
  • when the sensor data to be processed is only image data, the detection data is considered abnormal, and an abnormality prompt is issued or the abnormality is recorded.
  • after the target detection result is obtained, the operation of the unmanned vehicle can be further controlled, the driving path of the unmanned vehicle can be planned, and so on, based on the target detection result.
  • the above unmanned vehicle supports two detection models: a first detection model based on point cloud data, and a second detection model based on point cloud data and image data. When the image data in the sensor data to be processed is missing or delayed, the first detection model based on point cloud data is enabled; when the sensor data to be processed includes point cloud data and the corresponding image data, the second detection model based on point cloud and image data is enabled. This approach ensures the target detection effect when image data is missing, while maintaining high target detection accuracy when image data corresponding to the point cloud data is available, thereby improving the safety of autonomous driving.
  • the disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects.
  • the accuracy and detection efficiency of the target detection results can be improved, which helps to improve the safety of unmanned driving.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present disclosure provides a target detection method, device and unmanned vehicle, relating to the field of computer vision technology. The target detection method includes: acquiring sensor data to be processed, wherein the sensor data to be processed includes point cloud data; determining a detection model to be enabled according to whether the sensor data to be processed also includes image data corresponding to the point cloud data, wherein the detection model includes a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and processing the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified. Through the above steps, the accuracy and efficiency of target detection results can be improved, and the safety of autonomous driving can be improved.

Description

Target detection method, device and unmanned vehicle
Cross-Reference to Related Application
This application is based on and claims priority to the CN application No. 202210480445.9 filed on May 5, 2022, the disclosure of which is hereby incorporated into this application in its entirety.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a target detection method, device and unmanned vehicle.
Background
Object detection is an important task in autonomous driving. For example, a vehicle driving on the road needs to detect obstacles from the data collected by its sensors, and perform autonomous control and path planning of the vehicle based on the detection results. Because of the limited computing power at the vehicle end, the overall detection framework must be designed carefully to achieve the highest possible accuracy under that limited computing power.
Summary
According to a first aspect of the present disclosure, a target detection method is provided, including: acquiring sensor data to be processed, wherein the sensor data to be processed includes point cloud data; determining a detection model to be enabled according to whether the sensor data to be processed further includes image data corresponding to the point cloud data, wherein the detection model includes a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and processing the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
In some embodiments, determining the detection model to be enabled according to whether the sensor data to be processed further includes image data corresponding to the point cloud data includes: when the sensor data to be processed does not include image data corresponding to the point cloud data, using the first detection model as the detection model to be enabled; and when the sensor data to be processed includes image data corresponding to the point cloud data, using the second detection model as the detection model to be enabled.
In some embodiments, when the sensor data to be processed does not include image data corresponding to the point cloud data, using the first detection model as the detection model to be enabled includes: when the sensor data to be processed does not include image data, or when the timestamps of the image data and the point cloud data included in the sensor data to be processed are inconsistent, using the first detection model as the detection model to be enabled.
In some embodiments, when the sensor data to be processed includes image data corresponding to the point cloud data, using the second detection model as the detection model to be enabled includes: when the timestamps of the image data and the point cloud data included in the sensor data to be processed are consistent, using the second detection model as the detection model to be enabled.
In some embodiments, the model to be enabled is the first detection model, and processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified includes: performing feature encoding on the point cloud data to obtain a first feature map; and inputting the first feature map into the first detection model to obtain the detection result of the target to be identified.
In some embodiments, performing feature encoding on the point cloud data to obtain a point cloud feature map includes: performing voxelization encoding on the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map according to the voxel feature map; and inputting the bird's-eye-view feature map into a point cloud feature extraction network model to obtain the point cloud feature map.
In some embodiments, the model to be enabled is the second detection model, and processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified includes: performing feature encoding on the point cloud data to obtain a first feature map; performing feature encoding on the image data to obtain a second feature map; fusing the first feature map and the second feature map to obtain a fused feature map; and inputting the fused feature map into the second detection model to obtain the detection result of the target to be identified.
In some embodiments, performing feature encoding on the image data to obtain the second feature map includes: performing semantic segmentation on the image data to obtain semantic information of each pixel in the image data; determining, according to the semantic information of each pixel in the image data and a coordinate-system conversion relationship, the semantic information of the point cloud point corresponding to the pixel; and performing feature encoding on the semantic information of the point cloud points to obtain the second feature map.
In some embodiments, performing feature encoding on the semantic information of the point cloud points to obtain the second feature map includes: performing voxelization encoding on the semantic information of the point cloud points to obtain a voxel feature map; generating a bird's-eye-view feature map according to the voxel feature map; and downsampling the bird's-eye-view feature map to obtain the second feature map, wherein the size of the second feature map is consistent with that of the first feature map.
According to a second aspect of the present disclosure, a target detection device is provided, including: an acquisition module configured to acquire sensor data to be processed, wherein the sensor data to be processed includes point cloud data; a determination module configured to determine a detection model to be enabled according to whether the sensor data to be processed includes image data corresponding to the point cloud data, wherein the detection model includes a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and a detection module configured to process the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
In some embodiments, the determination module is configured to: when the sensor data to be processed does not include image data corresponding to the point cloud data, use the first detection model as the detection model to be enabled; and when the sensor data to be processed includes image data corresponding to the point cloud data, use the second detection model as the detection model to be enabled.
In some embodiments, the determination module is configured to: when the sensor data to be processed does not include image data, or when the timestamps of the image data and the point cloud data included in the sensor data to be processed are inconsistent, use the first detection model as the detection model to be enabled.
According to a third aspect of the present disclosure, a target detection device is further provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute the above target detection method based on instructions stored in the memory.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is further provided, on which computer program instructions are stored; when executed by a processor, the instructions implement the above target detection method.
According to a fifth aspect of the present disclosure, an unmanned vehicle is further provided, including the above target detection device.
According to a sixth aspect of the present disclosure, a computer program is further provided, including: instructions which, when executed by a processor, cause the processor to perform the above target detection method.
Other features and advantages of the present disclosure will become clear from the following detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which constitute a part of the specification, illustrate embodiments of the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
Figure 1 is a schematic flowchart of a target detection method according to some embodiments of the present disclosure.
Figure 2 is a schematic flowchart of determining a detection model to be enabled according to some embodiments of the present disclosure.
Figure 3 is a schematic flowchart of target detection based on a first detection model according to some embodiments of the present disclosure.
Figure 4 is a schematic flowchart of target detection based on a second detection model according to some embodiments of the present disclosure.
Figure 5 is a schematic structural diagram of a target detection device according to some embodiments of the present disclosure.
Figure 6 is a schematic structural diagram of a target detection device according to other embodiments of the present disclosure.
Figure 7 is a schematic structural diagram of a computer system according to some embodiments of the present disclosure.
Figure 8 is a schematic structural diagram of an unmanned vehicle according to some embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
At the same time, it should be understood that, for ease of description, the dimensions of the various parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure or its application or use.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, such techniques, methods and devices should be regarded as part of the granted specification.
In all examples shown and discussed herein, any specific value should be interpreted as merely illustrative rather than limiting. Therefore, other examples of the exemplary embodiments may have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further discussed in subsequent drawings.
To make the objects, technical solutions and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
In the related art, after a detection model trained on sample data is actually deployed on the vehicle end for autonomous driving, bandwidth delays or problems with the sensors themselves often cause the type of the input data to be detected to be inconsistent with that of the sample data. This severely degrades the performance of the detection model and reduces the accuracy and efficiency of target detection, which in turn severely affects the safety of unmanned driving.
A technical problem to be solved by the present disclosure is to provide a solution that can improve the accuracy and efficiency of target detection and improve the safety of unmanned driving.
Figure 1 is a schematic flowchart of a target detection method according to some embodiments of the present disclosure. As shown in Figure 1, the target detection method of the embodiments of the present disclosure includes:
Step S110: acquiring sensor data to be processed.
In some embodiments, the target detection method is performed by a target detection device. For example, in an autonomous driving scenario, the target detection device may be provided in a vehicle-mounted electronic device, or in a server that controls the driving of the vehicle.
In some embodiments, the target detection device acquires the sensor data to be processed at regular intervals. For example, the target detection device regularly pulls the sensor data to be processed from an external module.
In other embodiments, the target detection device acquires the sensor data to be processed in response to a request from an external module. For example, the target detection device receives a detection request sent by an external module and acquires the sensor data to be processed according to the detection request.
In some embodiments, point cloud data and image data are collected by sensors such as a vehicle-mounted radar and a camera, and target detection is performed based on the collected sensor data. In actual scenarios, bandwidth delays or problems with the sensors themselves can easily cause the image and point cloud data to fail to arrive at the same time, or the image data to be missing; as a result, the sensor data to be processed acquired by the target detection device may fall into one of three cases: only point cloud data, only image data, or both point cloud data and image data.
Step S120: determining a detection model to be enabled according to whether the sensor data to be processed includes image data corresponding to the point cloud data.
In some embodiments, the detection model includes a first detection model and a second detection model, wherein the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data.
In some embodiments, the sensor data to be processed includes point cloud data. When the sensor data to be processed does not include image data corresponding to the point cloud data, the first detection model is used as the detection model to be enabled; when the sensor data to be processed further includes image data corresponding to the point cloud data, the second detection model is used as the detection model to be enabled.
In some embodiments, when the sensor data to be processed is image data, the detection data is considered abnormal, and an abnormality prompt is issued or the abnormality is recorded.
Step S130: processing the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
In some embodiments, when the detection model to be enabled is the first detection model, the point cloud data to be processed is processed based on the first detection model to obtain the detection result of the target to be identified; when the detection model to be enabled is the second detection model, the point cloud data and image data to be processed are processed based on the second detection model to obtain the detection result of the target to be identified.
Exemplarily, in an autonomous driving scenario, the target to be identified is an obstacle, a traffic light, or the like in the driving environment of the vehicle.
In the embodiments of the present disclosure, through the above steps, a detection model trained on the same type of sample data as the actually acquired sensor data can be selected for target detection. This improves the accuracy and efficiency of the target detection results, and solves the problems of reduced detection accuracy, or even failed detection, and reduced detection efficiency caused by the mismatch between the sensor data to be processed and the sample data used to train the detection model, thereby helping improve the safety of unmanned driving.
Figure 2 is a schematic flowchart of determining a detection model to be enabled according to some embodiments of the present disclosure. As shown in Figure 2, the process of determining the detection model to be enabled in the embodiments of the present disclosure includes:
Step S121: determining the type of the sensor data to be processed.
The sensor data to be processed includes at least one of point cloud data and image data.
In some embodiments, the type of the sensor data to be processed is determined according to the input channel through which it arrives. For example, when sensor data to be processed is received from a first input channel, it is determined to be point cloud data; when sensor data to be processed is received from a second input channel, it is determined to be image data; and when sensor data to be processed is received from both the first and the second input channels, it is determined to be point cloud data and image data.
In other embodiments, the type of the sensor data to be processed is determined according to the type identifier carried by the data. For example, when sensor data carrying a first type identifier is received, it is confirmed to be point cloud data; when sensor data carrying a second type identifier is received, it is determined to be image data; and when sensor data carrying both the first type identifier and the second type identifier is received, it is confirmed to be point cloud data and image data.
Step S122: when the sensor data to be processed includes point cloud data and image data, judging whether the timestamps of the point cloud data and the image data are consistent.
In some embodiments, the sensor data to be processed carries the timestamp of the point cloud data and the timestamp of the image data. In these embodiments, the timestamp of the point cloud data is compared with the timestamp of the image data: when the absolute value of the difference between the two is less than a preset threshold, the timestamps are confirmed to be consistent; when the absolute value of the difference is greater than or equal to the preset threshold, the timestamps are confirmed to be inconsistent.
In other embodiments, the time at which the target detection device receives the point cloud data is taken as the timestamp of the point cloud data, and the time at which the target detection device receives the image data is taken as the timestamp of the image data. In these embodiments, the time at which the point cloud data is received is compared with the time at which the image data is received: when the absolute value of the difference between the two is less than a preset threshold, the timestamps are confirmed to be consistent; when the absolute value of the difference is greater than or equal to the preset threshold, the timestamps are confirmed to be inconsistent.
When the timestamps of the point cloud data and the image data are inconsistent, step S123 is executed; when the timestamps of the point cloud data and the image data are consistent, step S124 is executed.
Step S123: using the first detection model as the detection model to be enabled.
The first detection model is a detection model trained based on point cloud sample data.
Step S124: using the second detection model as the detection model to be enabled.
The second detection model is a detection model trained based on point cloud sample data and image sample data.
Step S125: when the sensor data to be processed is point cloud data, using the first detection model as the detection model to be enabled.
In the embodiments of the present disclosure, through the above steps, a detection model that better matches the sensor data to be processed can be determined according to the type of the sensor data to be processed and the timestamps of the point cloud data and image data, which helps improve the accuracy and efficiency of subsequent target detection based on the detection model.
Figure 3 is a schematic flowchart of target detection based on the first detection model according to some embodiments of the present disclosure. When the detection model to be enabled is the first detection model, the process shown in Figure 3 is executed. As shown in Figure 3, the process of target detection based on the first detection model includes:
Step S131: performing feature encoding on the point cloud data to obtain a first feature map.
In some embodiments, step S131 includes: performing voxelization encoding on the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map according to the voxel feature map; and inputting the bird's-eye-view feature map into a point cloud feature extraction network model to obtain a point cloud feature map.
In some embodiments, the point cloud data is voxelization-encoded as follows: each point in the point cloud data is assigned to a voxel unit in a voxel grid, and the points within each voxel unit are feature-encoded to obtain voxel features; the voxel feature map is then determined based on the voxel features. For example, the point cloud data may be voxelization-encoded based on the approach proposed by the PointPillar model or the VoxelNet model.
In some embodiments, the voxel feature map is mapped to a bird's-eye-view perspective to obtain the bird's-eye-view feature map. A bird's-eye view is a three-dimensional view drawn, based on the principle of perspective, as if looking down at the undulating ground from a high vantage point.
In some embodiments, the point cloud feature extraction network model is a two-dimensional convolutional neural network. The bird's-eye-view feature map is input into this two-dimensional convolutional neural network to obtain the point cloud feature map.
In the embodiments of the present disclosure, through the above steps, when the sensor data to be processed is point cloud data, the features of the point cloud data can be extracted quickly and accurately for subsequent target detection.
Step S132: inputting the first feature map into the first detection model to obtain the detection result of the target to be identified.
The first detection model is a detection model trained based on point cloud sample data.
In the embodiments of the present disclosure, through the above steps, when the arrival times of multiple kinds of sensor data are inconsistent or image data is missing, target detection can be performed quickly and accurately based on the detection model matching the point cloud data. This guarantees the target detection effect when image data is missing, and solves the problems of reduced detection efficiency and reduced detection accuracy caused by inconsistent arrival times of multiple kinds of sensor data or by missing image data in the actual operation of autonomous vehicles.
Figure 4 is a schematic flowchart of target detection based on the second detection model according to some embodiments of the present disclosure. When the detection model to be enabled is the second detection model, the process shown in Figure 4 is executed. As shown in Figure 4, the process of target detection based on the second detection model includes:
Step S131': performing feature encoding on the point cloud data to obtain a first feature map.
In some embodiments, step S131' includes: performing voxelization encoding on the point cloud data to obtain a voxel feature map; generating a bird's-eye-view feature map according to the voxel feature map; and inputting the bird's-eye-view feature map into a point cloud feature extraction network model to obtain a point cloud feature map.
In some embodiments, the point cloud data is voxelization-encoded as follows: each point in the point cloud data is assigned to a voxel unit in a voxel grid, and the points within each voxel unit are feature-encoded to obtain voxel features; the voxel feature map is then determined based on the voxel features. For example, the point cloud data may be voxelization-encoded based on the approach proposed by the PointPillar model or the VoxelNet model.
In some embodiments, the voxel feature map is mapped to a bird's-eye-view perspective to obtain the bird's-eye-view feature map. A bird's-eye view is a three-dimensional view drawn, based on the principle of perspective, as if looking down at the undulating ground from a high vantage point.
In some embodiments, the point cloud feature extraction network model is a two-dimensional convolutional neural network. The bird's-eye-view feature map is input into this two-dimensional convolutional neural network to obtain the point cloud feature map.
In the embodiments of the present disclosure, through the above steps, when the sensor data to be processed includes point cloud data, the features of the point cloud data can be extracted quickly and accurately for subsequent target detection.
Step S132': performing feature encoding on the image data to obtain a second feature map.
In some embodiments, step S132' includes: step a, performing semantic segmentation on the image data to obtain the semantic information of each pixel in the image data; step b, determining the semantic information of the point cloud points corresponding to the pixels according to the semantic information of each pixel in the image data and the coordinate-system conversion relationship; and step c, performing feature encoding on the semantic information of the point cloud points to obtain the second feature map.
In some embodiments, in step a, a two-dimensional image segmentation network, such as MaskRNN, is used to segment the image data to obtain the semantic information of each pixel in the image data. Exemplarily, the semantic information of a pixel is the score of the category to which the pixel belongs.
In some embodiments, in step b, the point cloud data is projected into the image coordinate system according to the coordinate-system transformation relationship between the camera coordinate system and the radar coordinate system, to determine the point cloud point corresponding to each pixel in the image; then, according to the semantic information of each pixel in the image data and the correspondence between pixels and point cloud points, the semantic information of the point cloud point corresponding to each pixel is determined. Through the above operations, data alignment between the image data and the point cloud data is achieved, and data fusion is performed on the basis of that alignment, which helps improve the subsequent target detection accuracy.
In some embodiments, in step c, the semantic information of the point cloud points is voxelization-encoded to obtain a voxel feature map; a bird's-eye-view feature map is generated according to the voxel feature map; and the bird's-eye-view feature map is downsampled to obtain the second feature map, wherein the size of the second feature map is consistent with that of the first feature map.
In the embodiments of the present disclosure, by downsampling the bird's-eye-view feature map so that the downsampled feature map has the same size as the first feature map, feature alignment is achieved, which facilitates subsequent feature fusion.
Step S133': fusing the first feature map and the second feature map to obtain a fused feature map.
In some embodiments, the first feature map and the second feature map are concatenated, and the concatenated feature map is used as the fused feature map.
Step S134': inputting the fused feature map into the second detection model to obtain the detection result of the target to be identified.
The second detection model is a detection model trained based on point cloud sample data and image sample data.
Exemplarily, the fused feature map is fed into different detection networks, including a detection network for the category to which the target belongs and a detection network for the target location, to obtain a three-dimensional target detection result including the target's category and location.
In the embodiments of the present disclosure, through the above steps, when the sensor data to be processed includes point cloud data and image data corresponding to the point cloud data, target detection can be performed efficiently and accurately based on the second detection model, improving the accuracy of target detection within the limits allowed by the vehicle-end computing power. In the embodiments of the present disclosure, two detection models are supported: a first detection model based on point cloud data, and a second detection model based on point cloud data and image data. When the image data in the sensor data to be processed is missing or delayed, the first detection model based on point cloud data is enabled; when the sensor data to be processed includes point cloud data and the corresponding image data, the second detection model based on point cloud and image data is enabled. This approach not only guarantees the target detection effect when image data is missing, but also maintains high target detection accuracy when image data corresponding to the point cloud data is available.
Figure 5 is a schematic structural diagram of a target detection device according to some embodiments of the present disclosure. As shown in Figure 5, the target detection device of the embodiments of the present disclosure includes: an acquisition module 510, a determination module 520 and a detection module 530.
The acquisition module 510 is configured to acquire sensor data to be processed.
In some embodiments, the application scenario is an autonomous driving scenario, and the target detection device may be provided in a vehicle-mounted electronic device or in a server that controls the driving of the vehicle.
In some embodiments, the acquisition module 510 acquires the sensor data to be processed at regular intervals. For example, the acquisition module 510 regularly pulls the sensor data to be processed from an external module.
In other embodiments, the acquisition module 510 acquires the sensor data to be processed in response to a request from an external module. For example, the acquisition module 510 receives a detection request sent by an external module and acquires the sensor data to be processed according to the detection request.
In some embodiments, point cloud data and image data are collected by sensors such as a vehicle-mounted radar and a camera, and target detection is performed based on the collected sensor data. In actual scenarios, bandwidth delays or problems with the sensors themselves can easily cause the image and point cloud data to fail to arrive at the same time, or the image data to be missing; as a result, the sensor data to be processed acquired by the target detection device may fall into one of three cases: only point cloud data, only image data, or both point cloud data and image data.
The determination module 520 is configured to determine the detection model to be enabled according to whether the sensor data to be processed includes image data corresponding to the point cloud data.
In some embodiments, the detection model includes a first detection model and a second detection model, wherein the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data.
In some embodiments, the sensor data to be processed includes point cloud data. When the sensor data to be processed does not include image data corresponding to the point cloud data, the determination module 520 uses the first detection model as the detection model to be enabled; when the sensor data to be processed further includes image data corresponding to the point cloud data, the determination module 520 uses the second detection model as the detection model to be enabled.
In some embodiments, the determination module 520 judges whether the sensor data to be processed includes image data corresponding to the point cloud data as follows: when the sensor data to be processed does not include image data, or when the timestamps of the image data and point cloud data included in the sensor data to be processed are inconsistent, the determination module 520 determines that the sensor data to be processed does not include image data corresponding to the point cloud data; when the timestamps of the image data and point cloud data included in the sensor data to be processed are consistent, the determination module 520 determines that the sensor data to be processed includes image data corresponding to the point cloud data.
In some embodiments, when the sensor data to be processed is image data, the determination module 520 is further configured to confirm that the detection data is abnormal, and to issue an abnormality prompt or record the abnormality.
The detection module 530 is configured to process the sensor data to be processed based on the detection model to be enabled, to obtain the detection result of the target to be identified.
In some embodiments, when the detection model to be enabled is the first detection model, the detection module 530 processes the point cloud data to be processed based on the first detection model to obtain the detection result of the target to be identified; when the detection model to be enabled is the second detection model, the detection module 530 processes the point cloud data and image data to be processed based on the second detection model to obtain the detection result of the target to be identified.
Exemplarily, in an autonomous driving scenario, the target to be identified is an obstacle, a traffic light, or the like in the driving environment of the vehicle.
In the embodiments of the present disclosure, through the above device, a detection model trained on the same type of sample data as the actually acquired sensor data can be selected for target detection. This improves the accuracy and efficiency of the target detection results, and solves the problems of reduced detection accuracy, or even failed detection, and reduced detection efficiency caused by the mismatch between the sensor data to be processed and the sample data used to train the detection model, thereby helping improve the safety of unmanned driving.
Figure 6 is a schematic structural diagram of a target detection device according to other embodiments of the present disclosure.
As shown in Figure 6, the target detection device 600 includes a memory 610 and a processor 620 coupled to the memory 610. The memory 610 is used to store instructions for executing the corresponding embodiments of the target detection method. The processor 620 is configured to execute the target detection method in any of the embodiments of the present disclosure based on the instructions stored in the memory 610.
Figure 7 is a schematic structural diagram of a computer system according to some embodiments of the present disclosure.
As shown in Figure 7, the computer system 700 may be embodied in the form of a general-purpose computing device. The computer system 700 includes a memory 710, a processor 720, and a bus 730 connecting the various system components.
The memory 710 may include, for example, system memory, non-volatile storage media, and the like. The system memory stores, for example, an operating system, applications, a boot loader and other programs. The system memory may include volatile storage media, such as random access memory (RAM) and/or cache memory. The non-volatile storage medium stores, for example, instructions for executing corresponding embodiments of at least one of the target detection methods. Non-volatile storage media include, but are not limited to, disk storage, optical storage, flash memory, and the like.
The processor 720 may be implemented as a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, or discrete hardware components such as discrete gates or transistors. Correspondingly, each module, such as the acquisition module, the determination module and the detection module, may be implemented by a central processing unit (CPU) running instructions in a memory that perform the corresponding steps, or by a dedicated circuit that performs the corresponding steps.
The bus 730 may use any of a variety of bus structures. For example, bus structures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus and a Peripheral Component Interconnect (PCI) bus.
The interfaces 740, 750, 760 of the computer system 700, as well as the memory 710 and the processor 720, may be connected through the bus 730. The input/output interface 740 provides a connection interface for input/output devices such as a display, a mouse and a keyboard. The network interface 750 provides a connection interface for various networked devices. The storage interface 760 provides a connection interface for external storage devices such as floppy disks, USB flash drives and SD cards.
Here, various aspects of the present disclosure are described with reference to flowcharts and/or block diagrams of methods, devices and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable device to produce a machine, such that the instructions, when executed by the processor, create means for implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.
These computer-readable program instructions may also be stored in a computer-readable memory; these instructions cause a computer to operate in a specific manner, thereby producing an article of manufacture including instructions that implement the functions specified in one or more blocks of the flowcharts and/or block diagrams.
Figure 8 is a schematic structural diagram of an unmanned vehicle according to some embodiments of the present disclosure. As shown in Figure 8, the unmanned vehicle 800 includes a target detection device 810.
The unmanned vehicle 800 further includes a variety of sensors, for example, one or more of lidar sensors, millimeter-wave sensors, cameras and other sensors. The unmanned vehicle 800 collects the sensor data required for target detection through its vehicle-mounted sensors.
The target detection device 810 is configured to acquire sensor data to be processed, determine a detection model to be enabled according to whether the sensor data to be processed further includes image data corresponding to the point cloud data, and process the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
Exemplarily, the target to be identified is an obstacle, a traffic light, or the like in the driving environment of the vehicle.
The detection model includes a first detection model and a second detection model; the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data.
In some embodiments, the sensor data to be processed includes point cloud data. When the sensor data to be processed does not include image data corresponding to the point cloud data, the first detection model is used as the detection model to be enabled; when the sensor data to be processed further includes image data corresponding to the point cloud data, the second detection model is used as the detection model to be enabled.
In some embodiments, when the sensor data to be processed is image data, the detection data is considered abnormal, and an abnormality prompt is issued or the abnormality is recorded.
In some embodiments, after the target detection result is obtained by the target detection device 810, the operation of the unmanned vehicle can be further controlled, the driving path of the unmanned vehicle can be planned, and so on, based on the target detection result.
In the embodiments of the present disclosure, the above unmanned vehicle can support two detection models: a first detection model based on point cloud data, and a second detection model based on point cloud data and image data. When the image data in the sensor data to be processed is missing or delayed, the first detection model based on point cloud data is enabled; when the sensor data to be processed includes point cloud data and the corresponding image data, the second detection model based on point cloud and image data is enabled. This approach not only guarantees the target detection effect when image data is missing, but also maintains high target detection accuracy when image data corresponding to the point cloud data is available, thereby improving the safety of autonomous driving.
The present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
Through the target detection method, device and unmanned vehicle of the above embodiments, the accuracy and efficiency of target detection results can be improved, which helps improve the safety of unmanned driving.
Thus far, the target detection method, device and unmanned vehicle according to the present disclosure have been described in detail. Some details well known in the art have not been described in order to avoid obscuring the concept of the present disclosure. Based on the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.

Claims (16)

  1. A target detection method, comprising:
    acquiring sensor data to be processed, wherein the sensor data to be processed comprises point cloud data;
    determining a detection model to be enabled according to whether the sensor data to be processed further comprises image data corresponding to the point cloud data, wherein the detection model comprises a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and
    processing the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
  2. The target detection method according to claim 1, wherein determining the detection model to be enabled according to whether the sensor data to be processed further comprises image data corresponding to the point cloud data comprises:
    when the sensor data to be processed does not comprise image data corresponding to the point cloud data, using the first detection model as the detection model to be enabled; and
    when the sensor data to be processed comprises image data corresponding to the point cloud data, using the second detection model as the detection model to be enabled.
  3. The target detection method according to claim 2, wherein, when the sensor data to be processed does not comprise image data corresponding to the point cloud data, using the first detection model as the detection model to be enabled comprises:
    when the sensor data to be processed does not comprise image data, or when the timestamps of the image data and the point cloud data comprised in the sensor data to be processed are inconsistent, using the first detection model as the detection model to be enabled.
  4. The target detection method according to claim 2, wherein, when the sensor data to be processed comprises image data corresponding to the point cloud data, using the second detection model as the detection model to be enabled comprises:
    when the timestamps of the image data and the point cloud data comprised in the sensor data to be processed are consistent, using the second detection model as the detection model to be enabled.
  5. The target detection method according to claim 1, wherein the model to be enabled is the first detection model, and processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified comprises:
    performing feature encoding on the point cloud data to obtain a first feature map; and
    inputting the first feature map into the first detection model to obtain the detection result of the target to be identified.
  6. The target detection method according to claim 5, wherein performing feature encoding on the point cloud data to obtain a point cloud feature map comprises:
    performing voxelization encoding on the point cloud data to obtain a voxel feature map;
    generating a bird's-eye-view feature map according to the voxel feature map; and
    inputting the bird's-eye-view feature map into a point cloud feature extraction network model to obtain the point cloud feature map.
  7. The target detection method according to claim 1, wherein the model to be enabled is the second detection model, and processing the sensor data to be processed based on the detection model to be enabled to obtain the detection result of the target to be identified comprises:
    performing feature encoding on the point cloud data to obtain a first feature map;
    performing feature encoding on the image data to obtain a second feature map;
    fusing the first feature map and the second feature map to obtain a fused feature map; and
    inputting the fused feature map into the second detection model to obtain the detection result of the target to be identified.
  8. The target detection method according to claim 7, wherein performing feature encoding on the image data to obtain the second feature map comprises:
    performing semantic segmentation on the image data to obtain semantic information of each pixel in the image data;
    determining, according to the semantic information of each pixel in the image data and a coordinate-system conversion relationship, the semantic information of the point cloud point corresponding to the pixel; and
    performing feature encoding on the semantic information of the point cloud points to obtain the second feature map.
  9. The target detection method according to claim 8, wherein performing feature encoding on the semantic information of the point cloud points to obtain the second feature map comprises:
    performing voxelization encoding on the semantic information of the point cloud points to obtain a voxel feature map;
    generating a bird's-eye-view feature map according to the voxel feature map; and
    downsampling the bird's-eye-view feature map to obtain the second feature map, wherein the size of the second feature map is consistent with that of the first feature map.
  10. A target detection device, comprising:
    an acquisition module configured to acquire sensor data to be processed, wherein the sensor data to be processed comprises point cloud data;
    a determination module configured to determine a detection model to be enabled according to whether the sensor data to be processed comprises image data corresponding to the point cloud data, wherein the detection model comprises a first detection model and a second detection model, the first detection model is trained based on point cloud sample data, and the second detection model is trained based on point cloud sample data and image sample data; and
    a detection module configured to process the sensor data to be processed based on the detection model to be enabled, to obtain a detection result of a target to be identified.
  11. The target detection device according to claim 10, wherein the determination module is configured to:
    when the sensor data to be processed does not comprise image data corresponding to the point cloud data, use the first detection model as the detection model to be enabled; and
    when the sensor data to be processed comprises image data corresponding to the point cloud data, use the second detection model as the detection model to be enabled.
  12. The target detection device according to claim 11, wherein the determination module is configured to:
    when the sensor data to be processed does not comprise image data, or when the timestamps of the image data and the point cloud data comprised in the sensor data to be processed are inconsistent, use the first detection model as the detection model to be enabled.
  13. A target detection device, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to execute the target detection method according to any one of claims 1 to 9 based on instructions stored in the memory.
  14. A computer-readable storage medium on which computer program instructions are stored, wherein the instructions, when executed by a processor, implement the target detection method according to any one of claims 1 to 9.
  15. An unmanned vehicle, comprising:
    the target detection device according to any one of claims 10 to 13.
  16. A computer program, comprising:
    instructions which, when executed by a processor, cause the processor to perform the target detection method according to any one of claims 1 to 9.
PCT/CN2022/140352 2022-05-05 2022-12-20 Target detection method, device and unmanned vehicle WO2023213083A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210480445.9 2022-05-05
CN202210480445.9A 2022-05-05 2022-05-05 Target detection method, device and unmanned vehicle

Publications (1)

Publication Number Publication Date
WO2023213083A1 true WO2023213083A1 (zh) 2023-11-09

Family

ID=82511990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/140352 WO2023213083A1 (zh) 2022-05-05 2022-12-20 目标检测方法、装置和无人车

Country Status (2)

Country Link
CN (1) CN114821131A (zh)
WO (1) WO2023213083A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821131A (zh) * 2022-05-05 2022-07-29 北京京东乾石科技有限公司 目标检测方法、装置和无人车

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111862101A (zh) * 2020-07-15 2020-10-30 西安交通大学 3D point cloud semantic segmentation method from a bird's-eye-view encoding perspective
CN113256740A (zh) * 2021-06-29 2021-08-13 湖北亿咖通科技有限公司 Radar and camera calibration method, electronic device and storage medium
CN113378760A (zh) * 2021-06-25 2021-09-10 北京百度网讯科技有限公司 Method and device for training a target detection model and detecting targets
CN113887349A (zh) * 2021-09-18 2022-01-04 浙江大学 Road area image recognition method based on an image and point cloud fusion network
CN114821131A (zh) * 2022-05-05 2022-07-29 北京京东乾石科技有限公司 Target detection method, device and unmanned vehicle


Also Published As

Publication number Publication date
CN114821131A (zh) 2022-07-29

Similar Documents

Publication Publication Date Title
EP3627180B1 (en) Sensor calibration method and device, computer device, medium, and vehicle
WO2020052540A1 (zh) Object labeling method, movement control method, apparatus, device and storage medium
US10317901B2 (en) Low-level sensor fusion
US10740658B2 (en) Object recognition and classification using multiple sensor modalities
CN106255899B (zh) Device for signaling objects to a navigation module of a vehicle equipped with the device
US20180067463A1 (en) Sensor event detection and fusion
Aeberhard et al. High-level sensor data fusion architecture for vehicle surround environment perception
US10369993B2 (en) Method and device for monitoring a setpoint trajectory to be traveled by a vehicle for being collision free
US11935250B2 (en) Method, device and computer-readable storage medium with instructions for processing sensor data
EP3564853A2 (en) Obstacle classification method and apparatus based on unmanned vehicle, device, and storage medium
WO2020258901A1 (zh) Sensor data processing method and apparatus, electronic device, and system
CN110873879A (zh) Device and method for deep fusion of features from multi-source heterogeneous sensors
US11443151B2 (en) Driving assistant system, electronic device, and operation method thereof
WO2023213083A1 (zh) Target detection method, device and unmanned vehicle
CN113643431A (zh) System and method for iterative optimization of vision algorithms
WO2022206414A1 (zh) Three-dimensional target detection method and device
US10974730B2 (en) Vehicle perception system on-line diangostics and prognostics
Li et al. Occupancy grid map formation and fusion in cooperative autonomous vehicle sensing
CN114049767A (zh) Edge computing method, device and readable storage medium
JP2023539643A (ja) Identification of critical scenarios for vehicle verification and validation
WO2022237210A1 (zh) Obstacle information generation
Gruyer et al. PerSEE: A central sensors fusion electronic control unit for the development of perception-based ADAS
Harshalatha et al. LiDAR-Based Advanced Driver Assistance for Vehicles
CN115482679B (zh) Blind-spot early-warning method and device for autonomous driving, and message server
US20220262136A1 (en) Method and system for estimating a drivable surface

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22940777

Country of ref document: EP

Kind code of ref document: A1