CN116844128A - Target object detection method and device, electronic equipment and storage medium - Google Patents

Target object detection method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN116844128A
CN116844128A (Application No. CN202310897182.6A)
Authority
CN
China
Prior art keywords
target
point cloud
cloud data
feature
bev
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310897182.6A
Other languages
Chinese (zh)
Inventor
李广敬
王晓东
张天雷
王超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhuxian Technology Co Ltd
Original Assignee
Beijing Zhuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhuxian Technology Co Ltd filed Critical Beijing Zhuxian Technology Co Ltd
Priority to CN202310897182.6A priority Critical patent/CN116844128A/en
Publication of CN116844128A publication Critical patent/CN116844128A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application provides a target object detection method and device, an electronic device and a storage medium, and relates to the technical field of automatic driving. The method comprises the following steps: acquiring target point cloud data of the surrounding environment of the body of a target vehicle, service information, and a computing power value of the target vehicle, wherein the service information includes a service type; determining corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, respectively extracting features of the target point cloud data under each feature type, wherein different feature types correspond to different viewing angles; performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features; and processing the fusion features according to the network model corresponding to the service type to detect the target object in the surrounding environment of the vehicle body and obtain a detection result. The method improves the accuracy of target object detection and thereby improves the safety of automatic driving of the vehicle.

Description

Target object detection method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of automatic driving technology, and in particular, to a method and apparatus for detecting a target object, an electronic device, and a storage medium.
Background
Owing to its high ranging accuracy, high stability and other characteristics, lidar is widely used in environment perception for high-level automatic driving. In recent years, with the decreasing cost of lidar sensors and the advancement of automotive-grade products, lidar sensors have also gradually been applied to consumer-grade mass-production vehicle models.
Currently, lidar detection algorithms include network algorithms based on features such as single points (Point), voxels (Voxel), bird's eye view (BEV), and depth map (RV). Each type of algorithm has its own drawbacks: algorithms based on Point and Voxel features have a high demand on computing resources because they adopt 3D convolutional networks; to solve this problem, a common approach is to project the three-dimensional point cloud data onto two-dimensional views such as BEV or RV and use a 2D convolutional network. Although BEV features better match the motion scene of a vehicle on a road plane, their sparsity limits the detection of distant targets; RV features, while dense and able to quickly query neighborhood relationships, face the problem of varying near-far scales.
Therefore, in the prior art, the detection accuracy of the target object is poor; for example, targets such as a speed limit sign or a no-right-turn sign may be missed, causing the vehicle to continue driving at excessive speed or to turn right, so that the target vehicle is prone to colliding with other vehicles and the safety of automatic driving is poor.
Disclosure of Invention
The application provides a target object detection method and device, an electronic device and a storage medium, which are used to solve the problem of poor automatic driving safety of a vehicle caused by poor detection accuracy of the target object.
According to a first aspect of the present application, there is provided a method for detecting a target object, comprising:
acquiring target point cloud data of the surrounding environment of the body of a target vehicle, service information, and a computing power value of the target vehicle; wherein the service information comprises a service type;
determining corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, respectively extracting features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles;
performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features;
and processing the fusion features according to the network model corresponding to the service type so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
Optionally, the determining the corresponding feature types according to the computing power value of the target vehicle includes any one of the following:
when the computing power value of the target vehicle is not smaller than a first preset threshold, determining that the corresponding feature types are the single-point feature type, the bird's eye view (BEV) feature type and the depth map (RV) feature type;
when the computing power value of the target vehicle is larger than a second preset threshold and smaller than the first preset threshold, determining that the corresponding feature types are at least two of the single-point feature type, the BEV feature type and the RV feature type; wherein the first preset threshold is higher than the second preset threshold;
and when the computing power value of the target vehicle is not larger than the second preset threshold, determining that the corresponding feature type is at least one of the single-point feature type, the BEV feature type and the RV feature type.
Optionally, the respectively extracting features of the target point cloud data under each feature type includes at least one of the following:
performing point feature extraction on the target point cloud data by using a preset single-point feature extraction network to obtain single-point features;
performing BEV-view projection processing on the target point cloud data to obtain first BEV features of the target point cloud data under the BEV view, and inputting the first BEV features into a preset BEV feature extraction network to obtain second BEV features of the target point cloud data under the BEV feature type;
and performing RV-view projection processing on the target point cloud data to obtain first RV features of the target point cloud data under the RV view, and inputting the first RV features into a preset RV feature extraction network to obtain second RV features of the target point cloud data under the RV feature type.
Optionally, the performing BEV-view projection processing on the target point cloud data includes:
performing projection processing on the target point cloud data under the BEV view by using a first target projection mode; wherein the first target projection mode comprises at least one of the following: a grid projection mode and a pillar projection mode.
Optionally, the performing RV-view projection processing on the target point cloud data includes:
performing projection processing on the target point cloud data under the RV view by using a second target projection mode; wherein the second target projection mode comprises at least one of the following: a cylindrical view projection mode, a spherical view projection mode, and a ring-azimuth projection mode.
Optionally, the performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features includes:
performing back-projection processing on the second BEV features and/or the second RV features to obtain BEV point cloud features and/or RV point cloud features;
and splicing the BEV point cloud features, the RV point cloud features and/or the single-point features to obtain the fusion features.
Optionally, the acquiring target point cloud data of the surrounding environment of the body of the target vehicle includes:
acquiring original point cloud data of the surrounding environment of the body of the target vehicle; wherein the original point cloud data is obtained by scanning the surrounding environment of the vehicle body with a lidar device arranged on the vehicle body;
and preprocessing the original point cloud data to obtain the target point cloud data; wherein the target point cloud data comprises coordinate data and reflection intensity values of all target points, the target points are the reflection points, contained in the original point cloud data, of the laser beams emitted by the lidar device, and each reflection point has its own reflection intensity value.
According to a second aspect of the present application, there is provided a device for detecting a target object, comprising:
an acquisition module, configured to acquire target point cloud data of the surrounding environment of the body of a target vehicle, service information, and a computing power value of the target vehicle; wherein the service information comprises a service type;
a determining and extracting module, configured to determine corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, respectively extract features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles;
a fusion processing module, configured to perform fusion processing on the features of the target point cloud data under all feature types to obtain fusion features;
and a detection module, configured to process the fusion features according to the network model corresponding to the service type so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
According to a third aspect of the present application, there is provided an electronic device, comprising: at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method for detecting a target object as described in the first aspect above.
According to a fourth aspect of the present application, there is provided a computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium and, when executed by a processor, are used to implement the method for detecting a target object as described in the first aspect above.
According to a fifth aspect of the present application, there is provided a computer program product, comprising a computer program which, when executed by a processor, implements the method for detecting a target object according to the first aspect.
The application provides a method for detecting a target object, comprising the following steps: acquiring target point cloud data of the surrounding environment of the body of a target vehicle, service information, and a computing power value of the target vehicle, wherein the service information includes a service type; determining corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, respectively extracting features of the target point cloud data under each feature type, wherein different feature types correspond to different viewing angles; performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features; and processing the fusion features according to the network model corresponding to the service type to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
According to the application, feature types corresponding to different viewing angles are provided while taking the computing power value of the target vehicle into account, and the features of the target point cloud data under each feature type are respectively extracted. Since there are multiple feature types, the fusion features obtained after fusing the features of the target point cloud data under each feature type combine the advantages of the features under each feature type, which improves the detection accuracy of the target object and thereby improves the safety of automatic driving of the vehicle.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
Fig. 1 is a schematic flow chart of a method for detecting a target object according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a BEV view provided by an embodiment of the present application;
FIG. 3 is a schematic view of an RV view according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of another method for detecting a target object according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a multi-feature fusion process provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a detection device for a target object according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application.
In the prior art, projecting three-dimensional point cloud data onto two-dimensional views such as BEV or RV and using a 2D convolutional network is a common approach. Although BEV features better match the motion scene of a vehicle on a road plane, their sparsity limits the detection of distant targets; RV features, while dense and able to quickly query neighborhood relationships, face the problem of varying near-far scales.
However, the prior art has the problems of poor detection accuracy of the target object and poor automatic driving safety of the vehicle.
In order to solve the technical problems, the application provides a target object detection method applied to the field of automatic driving and used for improving the detection accuracy of a target object and further improving the safety of automatic driving of a vehicle.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a method for detecting a target object according to an embodiment of the present application. As shown in fig. 1, the method of the present embodiment includes the following steps:
S10, acquiring target point cloud data of the surrounding environment of the body of a target vehicle, service information, and a computing power value of the target vehicle; wherein the service information includes a service type.
In the embodiment of the application, the vehicle body surrounding environment refers to the surrounding environment of the body of the target vehicle. The service information may indicate data segmentation and/or detection of the target object. The computing power value may be obtained from a processing device on the target vehicle.
S20, determining corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, respectively extracting features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles.
It should be understood that feature types include, but are not limited to: the single-point feature type, the BEV feature type, the RV feature type, and the like. Step S20 is described in detail below and will not be elaborated here.
S30, performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features.
In the embodiment of the application, the above fusion feature is also called a multi-view network feature, and the image formed by the fusion feature is the Multi-View-Feature map. The fusion feature merges the features of the target point cloud data under different viewing angles; compared with the feature under any single viewing angle, it contains richer information and is more robust, and it provides data support for the richer and more robust features extracted by the network model in the subsequent step S40, so that the overall performance of the network can be improved.
S40, processing the fusion features according to the network model corresponding to the service type so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
It should be appreciated that, after step S30 is performed, the fusion features may subsequently be fed into the corresponding network model as input information according to the specific task requirements. The service types include, but are not limited to: a target detection type, a point cloud segmentation type, and the like. When the service type is the target detection type, the network model corresponding to the target detection type may be a target detection network such as PointPillars or CenterPoint; such target objects include, but are not limited to: other vehicles, lane lines, etc. When the service type is the point cloud segmentation type, the network model corresponding to the point cloud segmentation type may be a point cloud segmentation network such as Cylinder3D or PolarNet.
Taking the above target detection network PointPillars as an example, step S40 proceeds as follows: in this embodiment, the Multi-View-Feature map formed by the fusion features is used as the network input, and accurate detection of the target object is realized through the components contained in the target detection network PointPillars, including the pillar generation structure, the point cloud feature extraction network (Pillar Feature Network, PFN), the backbone network, the detection head, and so on.
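A minimal sketch of how the fused feature map could be dispatched to a task-specific network according to the service type in step S40. The module names, channel sizes and class counts below are illustrative assumptions, not the patent's reference implementation; the patent only requires that each service type map to its own network model (e.g. a PointPillars-style detection network or a Cylinder3D/PolarNet-style segmentation network).

```python
import torch
import torch.nn as nn

class DetectionHead(nn.Module):
    """Illustrative stand-in for a detection network consuming the fused BEV map."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Conv2d(64, num_classes, 1)

    def forward(self, fused_feature_map: torch.Tensor) -> torch.Tensor:
        return self.cls_head(self.backbone(fused_feature_map))

# Service-type -> network-model dispatch, mirroring step S40.
MODELS = {
    "target_detection": DetectionHead(in_channels=96, num_classes=3),
    # "point_cloud_segmentation": SegmentationHead(...),  # analogous, omitted here
}

def run_task(service_type: str, fused_feature_map: torch.Tensor) -> torch.Tensor:
    """Select the network model by service type and process the fused features."""
    model = MODELS[service_type]
    return model(fused_feature_map)
```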
The embodiment of the application is not limited by distance and keeps the same scale for near and far targets; when targets such as a speed limit sign or a no-right-turn sign are detected in time, the automatic driving strategy can be adjusted promptly, which both guarantees the detection accuracy of the targets and satisfies the practical application requirements.
According to the embodiment of the application, feature types corresponding to different viewing angles are provided while taking the computing power value of the target vehicle into account, and the features of the target point cloud data under each feature type are respectively extracted. Since there are multiple feature types, the fusion features obtained after fusing the features of the target point cloud data under each feature type combine the advantages of the features under each feature type, which improves the detection accuracy of the target object and thereby improves the safety of automatic driving of the vehicle.
In a possible implementation manner, on the basis of fig. 1, this embodiment describes the specific implementation of step S10 in fig. 1 in more detail. Specifically, in step S10, acquiring target point cloud data of the surrounding environment of the body of the target vehicle includes the following steps:
S101, acquiring original point cloud data of the surrounding environment of the body of the target vehicle; wherein the original point cloud data is obtained by scanning the surrounding environment of the vehicle body with a lidar device arranged on the body of the target vehicle. It should be understood that the above lidar device may be simply referred to as a lidar.
S102, preprocessing the original point cloud data to obtain the target point cloud data; wherein the target point cloud data comprises coordinate data and reflection intensity values of all target points, the target points are the reflection points, contained in the original point cloud data, of the laser beams emitted by the lidar device, and each reflection point has its own reflection intensity value.
It should be understood that preprocessing may refer to format unification. The type of preprocessing is not specifically limited in the embodiment of the application.
In the embodiment of the present application, step S102 may obtain the target point cloud data by establishing a three-dimensional coordinate system and projecting the original point cloud data into the three-dimensional coordinate system.
In the embodiment of the application, the original point cloud data of the lidar is read directly, without relying on the specific data characteristics of mechanical, semi-solid-state, solid-state or other forms of lidar. The point cloud data is represented by the three-dimensional coordinates (x, y, z) of each point and its reflection intensity i in an o-xyz coordinate system. The target point cloud data can be denoted as P ∈ R^(N×(3+1)), where N is the number of points, 3 denotes the three coordinate dimensions, and 1 denotes the dimension of the reflection intensity i.
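As a minimal illustration of this N×(3+1) layout (assuming the raw lidar points already provide x, y, z and intensity columns; the column order and function name are illustrative, not taken from the patent):

```python
import numpy as np

def to_target_point_cloud(raw_points: np.ndarray) -> np.ndarray:
    """Unify raw lidar points into the N x (3+1) layout [x, y, z, i].

    `raw_points` is assumed to contain at least x, y, z and reflection
    intensity columns; any extra columns are dropped.
    """
    xyz = raw_points[:, :3].astype(np.float32)          # coordinates in the o-xyz frame
    intensity = raw_points[:, 3:4].astype(np.float32)   # reflection intensity i
    return np.concatenate([xyz, intensity], axis=1)     # shape (N, 4)

# Example: 5 random points standing in for one lidar sweep.
cloud = to_target_point_cloud(np.random.rand(5, 5))
assert cloud.shape == (5, 4)
```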
The embodiment of the application describes the acquisition process of the target point cloud data in detail, providing data support for the subsequent operations.
In a possible implementation manner, on the basis of fig. 1, this embodiment describes the specific implementation of step S20 in fig. 1 in more detail. Specifically, in step S20, determining the corresponding feature types according to the computing power value of the target vehicle includes any one of the following:
when the computing power value of the target vehicle is not smaller than a first preset threshold, determining that the corresponding feature types are the single-point feature type, the bird's eye view (BEV) feature type and the depth map (RV) feature type;
when the computing power value of the target vehicle is larger than a second preset threshold and smaller than the first preset threshold, determining that the corresponding feature types are at least two of the single-point feature type, the BEV feature type and the RV feature type; wherein the first preset threshold is higher than the second preset threshold;
and when the computing power value of the target vehicle is not larger than the second preset threshold, determining that the corresponding feature type is at least one of the single-point feature type, the BEV feature type and the RV feature type.
It should be appreciated that determining different feature types ensures that embodiments of the present application can freely combine features under different viewing angles. Owing to this free combination of feature types, the subsequent fusion of different view projection modes can be guaranteed, thereby improving the detection accuracy of the target object.
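A minimal sketch of this threshold-based selection. The threshold values are illustrative assumptions, and choosing the BEV + RV pair for the mid-range case is likewise an assumption: the patent only requires "at least two" / "at least one" of the three feature types in those ranges.

```python
def select_feature_types(compute_power: float,
                         first_threshold: float = 100.0,
                         second_threshold: float = 50.0) -> list[str]:
    """Map the vehicle's computing power value to the feature types to extract."""
    if compute_power >= first_threshold:     # enough compute: use all three views
        return ["single_point", "bev", "rv"]
    if compute_power > second_threshold:     # mid-range: at least two feature types
        return ["bev", "rv"]
    return ["bev"]                           # low compute: at least one feature type

print(select_feature_types(120.0))  # ['single_point', 'bev', 'rv']
print(select_feature_types(75.0))   # ['bev', 'rv']
print(select_feature_types(30.0))   # ['bev']
```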
In a possible implementation manner, on the basis of fig. 1, this embodiment describes the specific implementation of step S20 in fig. 1 in more detail. Specifically, in step S20, respectively extracting features of the target point cloud data under each feature type includes at least one of the following:
First, point feature extraction is performed on the target point cloud data by using a preset single-point feature extraction network to obtain single-point features. It should be appreciated that the preset single-point feature extraction network is also referred to as Point-NET. The embodiment of the application can use Point-NET to extract a single-point feature P_c' for each reflection point in the target point cloud data, where c' denotes the feature channel dimension.
Second, BEV-view projection processing is performed on the target point cloud data to obtain first BEV features of the target point cloud data under the BEV view, and the first BEV features are input into a preset BEV feature extraction network to obtain second BEV features of the target point cloud data under the BEV feature type. It should be appreciated that the preset BEV feature extraction network, also known as BEV-NET, may employ a two-dimensional convolutional network such as ResNet for feature extraction.
Specifically, the target point cloud data is projected onto the BEV view, and the obtained first BEV feature is denoted F_BEV ∈ R^(c×h×w), where c, h and w represent the three dimensions of channel number, height and width; at the same time, the coordinate position Coords_BEV of each reflection point on the BEV view is recorded. The first BEV feature F_BEV is input as a network feature to a two-dimensional convolutional network, and the network extracts the second BEV feature F'_BEV.
Third, RV-view projection processing is performed on the target point cloud data to obtain first RV features of the target point cloud data under the RV view, and the first RV features are input into a preset RV feature extraction network to obtain second RV features of the target point cloud data under the RV feature type. It should be appreciated that the preset RV feature extraction network, also referred to as RV-NET, may employ a two-dimensional convolutional network for feature extraction.
Specifically, the target point cloud data is projected onto the RV view, and the obtained first RV feature is denoted F_RV ∈ R^(c'×h×w), where c', h and w represent the three dimensions of channel number, height and width; at the same time, the coordinate position Coords_RV of each reflection point on the RV view is recorded. The first RV feature F_RV is input as a network feature to a two-dimensional convolutional network, and the network extracts the second RV feature F'_RV.
For the same application scenario, the BEV view formed by the first BEV features is shown in fig. 2, which includes the road, the vehicle, a height-limiting device, a vehicle in front of the vehicle, etc.; in this view these objects are small enough to be regarded as points, or as lines if they have a certain length. The RV view formed by the first RV features is shown in fig. 3, in which the road, the vehicle, the height-limiting device, the vehicle ahead, etc. can be observed by the human eye.
The features extracted under different views in this embodiment provide data support for the subsequent fusion processing, thereby improving the accuracy of target object detection.
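A minimal sketch of such a 2D convolutional view-feature extractor playing the role of BEV-NET or RV-NET. The channel sizes and layer count are illustrative assumptions; the patent only requires a two-dimensional convolutional network (e.g. ResNet-style) operating on the first BEV/RV feature map.

```python
import torch
import torch.nn as nn

class ViewFeatureNet(nn.Module):
    """2D convolutional extractor standing in for BEV-NET or RV-NET."""

    def __init__(self, in_channels: int, out_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True),
        )

    def forward(self, first_view_feature: torch.Tensor) -> torch.Tensor:
        # Input:  first BEV/RV feature map, shape (B, c, h, w)
        # Output: second BEV/RV feature map, shape (B, out_channels, h, w)
        return self.net(first_view_feature)

bev_net = ViewFeatureNet(in_channels=4)            # e.g. x, y, z, intensity channels
second_bev = bev_net(torch.zeros(1, 4, 200, 200))  # toy-sized BEV map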
In a possible implementation manner, in the second item above, performing BEV-view projection processing on the target point cloud data includes:
performing projection processing on the target point cloud data under the BEV view by using a first target projection mode; wherein the first target projection mode includes at least one of the following: a grid projection mode and a pillar projection mode.
It should be appreciated that the first target projection mode includes, but is not limited to: the grid projection mode, the pillar projection mode, and the like. In the grid projection mode, only the point with the highest z value within each pixel of the BEV view is retained.
The embodiment of the application refines the projection processing modes of the BEV view, thereby providing data support for the subsequent fusion processing.
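A minimal sketch of the grid projection mode under stated assumptions: the grid range, resolution and the two output channels (height and intensity) are illustrative, and out-of-range points are simply clamped to the border cells; each BEV pixel keeps only the point with the highest z value, as described above.

```python
import numpy as np

def bev_grid_project(points: np.ndarray,
                     x_range=(-50.0, 50.0), y_range=(-50.0, 50.0),
                     resolution: float = 0.5):
    """Project an N x 4 point cloud [x, y, z, i] onto a BEV grid.

    Returns the first BEV feature map and each point's pixel coordinates
    Coords_BEV, which are kept for the later back-projection step.
    """
    w = int((x_range[1] - x_range[0]) / resolution)
    h = int((y_range[1] - y_range[0]) / resolution)
    best_z = np.full((h, w), -np.inf, dtype=np.float32)
    bev = np.zeros((2, h, w), dtype=np.float32)        # channels: max z, intensity

    cols = np.clip(((points[:, 0] - x_range[0]) / resolution).astype(int), 0, w - 1)
    rows = np.clip(((points[:, 1] - y_range[0]) / resolution).astype(int), 0, h - 1)
    coords_bev = np.stack([rows, cols], axis=1)        # per-point pixel position

    for n in range(points.shape[0]):
        r, c = rows[n], cols[n]
        if points[n, 2] > best_z[r, c]:                # keep only the highest-z point
            best_z[r, c] = points[n, 2]
            bev[0, r, c] = points[n, 2]
            bev[1, r, c] = points[n, 3]
    return bev, coords_bev
```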
In a possible implementation manner, in the third item above, performing RV-view projection processing on the target point cloud data includes:
performing projection processing on the target point cloud data under the RV view by using a second target projection mode; wherein the second target projection mode includes at least one of the following: a cylindrical view projection mode, a spherical view projection mode, and a ring-azimuth projection mode.
It should be appreciated that the second target projection mode includes, but is not limited to: the Cylindrical-View projection mode, the Spherical-View projection mode, the Ring-Azimuth-View projection mode, and the like.
In the RV view obtained by the Cylindrical-View projection mode, the pixel coordinates of a point are determined by its azimuth angle θ_i = atan2(y_i, x_i) and its height Z_i = z_i, where (x_i, y_i, z_i) ∈ (x, y, z).
In the RV view obtained by the Spherical-View projection mode, the pixel coordinates of a point are determined by its azimuth angle θ_i = atan2(y_i, x_i) and its elevation angle φ_i = arctan(z_i / √(x_i² + y_i²)).
In the RV view obtained by the Ring-Azimuth-View projection mode, the pixel coordinates of a point are determined by its azimuth angle θ_i = atan2(y_i, x_i) and the ring-id r_i to which the point belongs.
The embodiment of the application refines the projection processing modes of the RV view, thereby providing data support for the subsequent fusion processing.
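A minimal sketch of computing per-point RV pixel coordinates for the cylindrical and spherical modes. The image size, height range and vertical field of view are illustrative assumptions, and the ring-azimuth mode is omitted because it needs the per-point ring id from the sensor.

```python
import numpy as np

def rv_pixel_coords(points: np.ndarray, mode: str = "cylindrical",
                    width: int = 1024, height: int = 64,
                    z_range=(-3.0, 1.0), fov=(-0.4363, 0.0873)):
    """Return (row, col) RV pixel coordinates for an N x 4 cloud [x, y, z, i]."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    azimuth = np.arctan2(y, x)                                # theta_i in [-pi, pi]
    col = ((azimuth + np.pi) / (2 * np.pi) * width).astype(int) % width

    if mode == "cylindrical":                                 # vertical axis: Z_i = z_i
        v = (z - z_range[0]) / (z_range[1] - z_range[0])
    elif mode == "spherical":                                 # vertical axis: elevation
        elevation = np.arctan2(z, np.sqrt(x**2 + y**2))
        v = (elevation - fov[0]) / (fov[1] - fov[0])
    else:
        raise ValueError("ring-azimuth mode needs the per-point ring id")

    row = np.clip((v * (height - 1)).astype(int), 0, height - 1)
    return np.stack([row, col], axis=1)                       # Coords_RV per point
```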
In a possible implementation manner, on the basis of fig. 1, this embodiment describes the specific implementation of step S30 in fig. 1 in more detail. Specifically, in step S30, performing fusion processing on the features of the target point cloud data under all feature types to obtain fusion features includes the following steps:
Step S301, performing back-projection processing on the second BEV features and/or the second RV features to obtain BEV point cloud features and/or RV point cloud features.
During back-projection, the second BEV feature F'_BEV is interpolated back onto the point cloud sequence corresponding to the target point cloud data to obtain the BEV point cloud feature P_BEV under the BEV view; similarly, the second RV feature F'_RV is interpolated back onto the point cloud sequence corresponding to the target point cloud data to obtain the RV point cloud feature P_RV.
In the embodiment of the application, the BEV point cloud view formed by the BEV point cloud features is similar in representation to the BEV view in fig. 2, except that the BEV point cloud view is more concise. Similarly, the RV point cloud view formed by the RV point cloud features is similar in representation to the RV view in fig. 3, except that the RV point cloud view is more concise.
Step S302, splicing the BEV point cloud features, the RV point cloud features and/or the single-point features to obtain the fusion features.
It should be appreciated that the BEV point cloud features, the RV point cloud features and the single-point features are all per-point view features. In this embodiment, in order to realize the fusion of multi-view features, the features under each view are interpolated back onto the point cloud sequence corresponding to the target point cloud data and then spliced and fused. Specifically, the BEV point cloud feature P_BEV, the RV point cloud feature P_RV and the single-point feature P_c' are fused in point order to obtain the fusion feature F_fuse. According to the task requirements, the fusion feature F_fuse can then be input into the network model corresponding to the service type.
BEV point cloud features are better suited to representing the motion of the vehicle on the road plane, and when combined with RV point cloud features, the fusion features can meet the application requirements while also achieving the accuracy required for automatic detection. Therefore, by fusing point cloud features under multiple different views, the fusion features are not constrained by any single view projection mode, which improves the detection accuracy of the target object and thereby improves the safety of automatic driving.
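A minimal sketch of the back-projection and splicing steps using nearest-pixel gathering (bilinear interpolation would also fit the description above). All shapes and helper names are illustrative assumptions.

```python
import numpy as np

def back_project(view_feature: np.ndarray, coords: np.ndarray) -> np.ndarray:
    """Gather per-point features from a (C, H, W) view feature map.

    `coords` holds each point's (row, col) pixel position recorded during
    projection, so the result has shape (N, C): one feature vector per point.
    """
    rows, cols = coords[:, 0], coords[:, 1]
    return view_feature[:, rows, cols].T

def fuse_point_features(bev_feat, coords_bev, rv_feat, coords_rv, point_feat):
    """Splice BEV, RV and single-point features in point order -> F_fuse."""
    p_bev = back_project(bev_feat, coords_bev)    # (N, C_bev)
    p_rv = back_project(rv_feat, coords_rv)       # (N, C_rv)
    return np.concatenate([p_bev, p_rv, point_feat], axis=1)

# Toy shapes: 100 points, 64-channel BEV/RV maps, 32-channel single-point features.
fused = fuse_point_features(np.zeros((64, 200, 200)), np.zeros((100, 2), int),
                            np.zeros((64, 64, 1024)), np.zeros((100, 2), int),
                            np.zeros((100, 32)))
assert fused.shape == (100, 160)
```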
On the basis of the above embodiments, the technical scheme of the present application will be described in more detail below in conjunction with specific embodiments.
Fig. 4 is a flow chart of another method for detecting a target object according to an embodiment of the present application. As can be seen from fig. 4, the method of the present embodiment includes the following steps:
S1, acquiring point cloud data and inputting the point cloud data into a target feature fusion network, so that the target feature fusion network performs preprocessing, projection, feature extraction, back-projection and fusion processing on the point cloud data to obtain fusion features.
This embodiment mainly relates to a target feature fusion network, which includes multiple view feature extraction networks and a fusion network, where the multiple view feature extraction networks include: the BEV feature extraction network, the RV feature extraction network, and the single-point feature extraction network. After the multiple view feature extraction networks respectively perform the corresponding feature extraction, the fusion network fuses the features of the point cloud data under each view, providing better feature input for the network model corresponding to the subsequent service type.
S2, inputting the fusion features into the network model corresponding to the service type, so that the network model corresponding to the service type performs various kinds of processing on the fusion features to obtain a detection result.
In the embodiment of the present application, the network model corresponding to the service type can be understood as a deep network, and the deep network may include a network backbone.
It should be noted that different service types correspond to different network models, for example: a data segmentation service corresponds to a segmentation network, and a target object detection task corresponds to a detection network.
Step S1 in the above target object detection method serves as a multi-view network feature fusion method for lidar. By freely combining the features of the point cloud data under different feature types according to the computing power value, it ensures the speed and efficiency of the automatic driving system; on this basis, the advantages of the point cloud features under different views are combined, providing better features for the deep network.
In the embodiment of the present application, the target feature fusion network in step S1 implements a multi-feature fusion processing flow, which is shown in fig. 5 and includes operations such as preprocessing, projection under different views, extraction of different features through different feature extraction networks, and fusion. As can be seen from fig. 5, the multi-feature fusion processing method includes the following steps:
S11, acquiring point cloud data, and preprocessing the point cloud data to obtain preprocessed point cloud data.
And S12, respectively carrying out projection under the BEV view and the RV view on the preprocessed point cloud data to obtain a first BEV feature and a first RV feature.
And S13, performing feature extraction by using the BEV feature extraction network, the RV feature extraction network and the single-point feature extraction network respectively to obtain a second BEV feature, a second RV feature and a single-point feature.
As described in the foregoing embodiments, the BEV feature extraction network is also called BEV-NET, the RV feature extraction network is also called RV-NET, and the single-point feature extraction network is also called Point-NET.
And S14, performing BEV back projection on the second BEV characteristic, and performing RV back projection on the second RV characteristic to obtain a BEV point cloud characteristic and an RV point cloud characteristic.
And S15, fusing the single-point feature in the step S13 with the BEV point cloud feature and the RV point cloud feature in the step S14 to obtain a fused feature.
As described in the foregoing embodiments, the above fusion feature, also referred to as a multi-view network feature, forms the Multi-View-Feature map.
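Putting steps S11 through S15 together, a minimal orchestration sketch that reuses the illustrative helpers sketched earlier in this description (all of them are assumptions for illustration, not the patent's reference implementation; the RV map construction is stubbed and the three networks are passed in as generic callables):

```python
import numpy as np

def multi_view_fusion(raw_points: np.ndarray,
                      bev_net, rv_net, point_net) -> np.ndarray:
    """S11-S15: preprocess, project, extract, back-project and fuse."""
    cloud = to_target_point_cloud(raw_points)                  # S11 preprocessing

    first_bev, coords_bev = bev_grid_project(cloud)            # S12 BEV projection
    coords_rv = rv_pixel_coords(cloud, mode="cylindrical")     # S12 RV projection
    first_rv = np.zeros((4, 64, 1024), dtype=np.float32)       # RV map filled analogously

    second_bev = bev_net(first_bev)                            # S13 BEV-NET
    second_rv = rv_net(first_rv)                               # S13 RV-NET
    point_feat = point_net(cloud)                              # S13 Point-NET

    return fuse_point_features(second_bev, coords_bev,         # S14 back-projection
                               second_rv, coords_rv,           # S15 splicing/fusion
                               point_feat)
```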
According to the embodiment of the application, feature types corresponding to different viewing angles are provided while taking the computing power value of the target vehicle into account, and the features of the target point cloud data under each feature type are respectively extracted. Since there are multiple feature types, the fusion features obtained after fusing the features of the target point cloud data under each feature type combine the advantages of the features under each feature type, which improves the detection accuracy of the target object and thereby improves the safety of automatic driving of the vehicle.
Fig. 6 is a schematic structural diagram of a target object detection device according to an embodiment of the present application. The device of this embodiment may be implemented in the form of software and/or hardware. As shown in fig. 6, the target object detection device provided in this embodiment includes: an acquisition module 61, a determining and extracting module 62, a fusion processing module 63 and a detection module 64. Wherein:
the acquisition module 61 is configured to acquire target point cloud data of the surrounding environment of the body of the target vehicle, service information, and a computing power value of the target vehicle; wherein the service information includes a service type.
The determining and extracting module 62 is configured to determine corresponding feature types according to the computing power value of the target vehicle and, when there are multiple feature types, extract features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles.
The fusion processing module 63 is configured to perform fusion processing on the features of the target point cloud data under all feature types to obtain fusion features.
The detection module 64 is configured to process the fusion features according to the network model corresponding to the service type so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
In a possible implementation manner, the target object detection device is further configured to perform any one of the following:
when the computing power value of the target vehicle is not smaller than a first preset threshold, determining that the corresponding feature types are the single-point feature type, the BEV feature type and the RV feature type;
when the computing power value of the target vehicle is larger than a second preset threshold and smaller than the first preset threshold, determining that the corresponding feature types are at least two of the single-point feature type, the BEV feature type and the RV feature type; wherein the first preset threshold is higher than the second preset threshold;
and when the computing power value of the target vehicle is not larger than the second preset threshold, determining that the corresponding feature type is at least one of the single-point feature type, the BEV feature type and the RV feature type.
In a possible implementation manner, the target object detection device is further configured to perform any one of the following:
first, performing point feature extraction on the target point cloud data by using a preset single-point feature extraction network to obtain single-point features;
second, performing BEV-view projection processing on the target point cloud data to obtain first BEV features of the target point cloud data under the BEV view, and inputting the first BEV features into a preset BEV feature extraction network to obtain second BEV features of the target point cloud data under the BEV feature type;
and third, performing RV-view projection processing on the target point cloud data to obtain first RV features of the target point cloud data under the RV view, and inputting the first RV features into a preset RV feature extraction network to obtain second RV features of the target point cloud data under the RV feature type.
In a possible implementation manner, the target object detection device is further configured to:
perform projection processing on the target point cloud data under the BEV view by using a first target projection mode; wherein the first target projection mode includes at least one of the following: a grid projection mode and a pillar projection mode.
In a possible implementation manner, the target object detection device is further configured to:
perform projection processing on the target point cloud data under the RV view by using a second target projection mode; wherein the second target projection mode includes at least one of the following: a cylindrical view projection mode, a spherical view projection mode, and a ring-azimuth projection mode.
In a possible implementation manner, the fusion processing module 63 is further configured to:
perform back-projection processing on the second BEV features and/or the second RV features to obtain BEV point cloud features and/or RV point cloud features;
and splice the BEV point cloud features, the RV point cloud features and/or the single-point features to obtain the fusion features.
In a possible implementation manner, the acquisition module 61 is further configured to:
acquire original point cloud data of the surrounding environment of the body of the target vehicle; wherein the original point cloud data is obtained by scanning the surrounding environment of the vehicle body with a lidar device arranged on the body of the target vehicle;
and preprocess the original point cloud data to obtain the target point cloud data; wherein the target point cloud data includes coordinate data and reflection intensity values of all target points, the target points are the reflection points, contained in the original point cloud data, of the laser beams emitted by the lidar device, and each reflection point has its own reflection intensity value.
The target object detection device provided in this embodiment can be used to execute the target object detection method provided in any of the above method embodiments; its implementation principle and technical effects are similar and will not be repeated here.
It should be noted that, the user information and data related to the present application (including but not limited to data for analysis, stored data, displayed data, etc.) are all information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
That is, in the technical scheme of the application, the related processes of collecting, storing, using, processing, transmitting, providing, disclosing and the like of the personal information of the user accord with the regulations of the related laws and regulations, and the public welfare is not violated.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device includes a receiver 70, a transmitter 71, at least one processor 72 and a memory 73; the electronic device formed by the above components can be used to implement the specific embodiments of the present application described above, which will not be repeated here.
The embodiment of the application also provides a computer readable storage medium, wherein computer executable instructions are stored in the computer readable storage medium, and when the processor executes the computer executable instructions, the steps of the method in the embodiment are realized.
The embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of the above embodiments.
Various implementations of the above-described systems and techniques of the application may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor; the programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present application may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or electronic device.
In the context of the present application, a computer-readable storage medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may be a machine readable signal medium or a machine readable storage medium. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a computer-readable storage medium would include one or more wire-based electrical connections, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data electronic device), or that includes a middleware component (e.g., an application electronic device), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method for detecting an object, comprising:
acquiring target point cloud data of the surrounding environment of a body of a target vehicle, service information, and a computing power value of the target vehicle; wherein the service information comprises a service type;
determining corresponding feature types according to the computing power value of the target vehicle and, when there are a plurality of feature types, respectively extracting features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles;
performing fusion processing on the features of the target point cloud data under all feature types to obtain fused features;
and processing the fused features according to the network model corresponding to the service type, so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
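For illustration only, and not as part of the claims, the following Python sketch mirrors the flow of claim 1 with simple stand-ins: the function names (extract_view_feature, detect_objects) are hypothetical, the per-view extractors are toy projections rather than trained networks, and the per-service network model is replaced by a dummy score.

```python
import numpy as np

def extract_view_feature(points, feature_type):
    """Hypothetical stand-in for per-view feature extraction (one vector per point)."""
    if feature_type == "point":
        return points[:, :3]                                   # raw coordinates
    if feature_type == "bev":
        return points[:, :2]                                   # top-down (x, y) proxy
    if feature_type == "rv":
        return np.linalg.norm(points[:, :3], axis=1, keepdims=True)  # range proxy
    raise ValueError(f"unknown feature type: {feature_type}")

def detect_objects(points, service_type, feature_types):
    # 1) extract features of the point cloud under every selected feature type
    per_view = [extract_view_feature(points, t) for t in feature_types]
    # 2) fuse the per-view features (here: per-point concatenation)
    fused = np.concatenate(per_view, axis=1)
    # 3) a real system would feed `fused` to the network model matched to
    #    `service_type`; this dummy just thresholds a per-point score
    scores = fused.sum(axis=1)
    return {"service": service_type,
            "num_candidates": int((scores > scores.mean()).sum())}

cloud = np.random.rand(2048, 4).astype(np.float32)             # x, y, z, intensity
print(detect_objects(cloud, "obstacle_detection", ["point", "bev", "rv"]))
```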
2. The method of claim 1, wherein the determining corresponding feature types according to the computing power value of the target vehicle comprises any one of the following:
when the computing power value of the target vehicle is not smaller than a first preset threshold, determining that the corresponding feature types are a single-point feature type, a bird's-eye view (BEV) feature type and a depth map (RV) feature type;
when the computing power value of the target vehicle is larger than a second preset threshold and smaller than the first preset threshold, determining that the corresponding feature types are at least two of the single-point feature type, the BEV feature type and the RV feature type; wherein the first preset threshold is higher than the second preset threshold;
and when the computing power value of the target vehicle is not greater than the second preset threshold, determining that the corresponding feature type is at least one of the single-point feature type, the BEV feature type and the RV feature type.
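A minimal sketch of the selection logic in claim 2, assuming illustrative threshold values; the concrete subsets chosen in the middle and low branches are one possible choice among those the claim permits.

```python
def select_feature_types(compute_value, first_threshold=100.0, second_threshold=50.0):
    """Map the vehicle's computing power value to the feature types to extract.

    The threshold values and the subsets picked in the middle and low branches
    are illustrative; the claim only fixes the subset sizes.
    """
    assert first_threshold > second_threshold
    if compute_value >= first_threshold:      # not smaller than the first threshold
        return ["single_point", "bev", "rv"]
    if compute_value > second_threshold:      # between the two thresholds
        return ["bev", "rv"]                  # at least two of the three
    return ["bev"]                            # at least one of the three

print(select_feature_types(120.0))  # ['single_point', 'bev', 'rv']
print(select_feature_types(75.0))   # ['bev', 'rv']
print(select_feature_types(30.0))   # ['bev']
```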
3. The method according to claim 1 or 2, wherein the respectively extracting features of the target point cloud data under each feature type comprises at least one of the following:
performing point feature extraction on the target point cloud data by using a preset single-point feature extraction network to obtain single-point features;
performing projection processing at a BEV view angle on the target point cloud data to obtain a first BEV feature of the target point cloud data under the BEV view angle, and inputting the first BEV feature into a preset BEV feature extraction network to obtain a second BEV feature of the target point cloud data under the BEV feature type;
and performing projection processing at an RV view angle on the target point cloud data to obtain a first RV feature of the target point cloud data under the RV view angle, and inputting the first RV feature into a preset RV feature extraction network to obtain a second RV feature of the target point cloud data under the RV feature type.
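Claim 3 does not fix the form of the preset single-point feature extraction network; a common assumption is a small shared per-point MLP, sketched below with random (untrained) weights purely to show the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)

def single_point_features(points, dims=(4, 32, 64)):
    """Assumed single-point feature network: the same small MLP applied to every
    point independently; weights are random here, only the shapes matter."""
    x = points
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        w = rng.standard_normal((d_in, d_out)) * (2.0 / d_in) ** 0.5
        x = np.maximum(x @ w, 0.0)            # linear layer followed by ReLU
    return x                                  # (N, dims[-1]) per-point features

cloud = rng.random((1024, 4))                 # x, y, z, intensity
print(single_point_features(cloud).shape)     # (1024, 64)
```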
4. The method according to claim 3, wherein the performing projection processing at the BEV view angle on the target point cloud data comprises:
performing projection processing on the target point cloud data under the BEV view angle by using a first target projection mode; wherein the first target projection mode comprises at least one of the following: a grid projection mode and a pillar projection mode.
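A reduced sketch of a pillar-style BEV projection as named in claim 4: points are binned into an x-y grid and each occupied cell keeps simple statistics. The grid extent, cell size, and the choice of statistics are assumptions, not values taken from the application.

```python
import numpy as np

def bev_pillar_projection(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0), cell=0.4):
    """Bin points into an x-y ('pillar') grid; each cell keeps the highest z seen
    (0 for empty cells) and the point count, as a stand-in first BEV feature."""
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    bev = np.zeros((nx, ny, 2), dtype=np.float32)             # [max z, count]

    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    for i, j, z in zip(ix[keep], iy[keep], points[keep, 2]):  # per-point scatter
        bev[i, j, 0] = max(bev[i, j, 0], z)
        bev[i, j, 1] += 1.0
    return bev

cloud = np.random.rand(5000, 4) * [70.0, 80.0, 3.0, 1.0] - [0.0, 40.0, 1.0, 0.0]
print(bev_pillar_projection(cloud).shape)                     # (176, 200, 2)
```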
5. The method according to claim 3, wherein the performing projection processing at the RV view angle on the target point cloud data comprises:
performing projection processing on the target point cloud data under the RV view angle by using a second target projection mode; wherein the second target projection mode comprises at least one of the following: a cylindrical view projection mode, a spherical view projection mode, and an annular azimuth projection mode.
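A sketch of the spherical-view variant named in claim 5: each point's azimuth and elevation select a pixel of a range image that stores the nearest return. The image resolution and vertical field of view are assumed values.

```python
import numpy as np

def rv_spherical_projection(points, width=512, height=64,
                            fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map each point's azimuth/elevation to a pixel of a range image; every
    pixel stores the nearest range (0 where no point falls)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x * x + y * y + z * z) + 1e-9
    azimuth = np.arctan2(y, x)                                # [-pi, pi]
    elevation = np.arcsin(z / r)

    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    u = ((azimuth + np.pi) / (2.0 * np.pi) * width).astype(int) % width
    v = ((fov_up - elevation) / (fov_up - fov_down) * height).astype(int)
    keep = (v >= 0) & (v < height)

    image = np.zeros((height, width), dtype=np.float32)
    for col, row, rng in zip(u[keep], v[keep], r[keep]):
        if image[row, col] == 0.0 or rng < image[row, col]:
            image[row, col] = rng                             # keep the nearest return
    return image

cloud = np.random.randn(5000, 4) * [20.0, 20.0, 1.0, 1.0]
print(rv_spherical_projection(cloud).shape)                   # (64, 512)
```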
6. The method according to claim 3, wherein the performing fusion processing on the features of the target point cloud data under all feature types to obtain fused features comprises:
performing back-projection processing on the second BEV feature and/or the second RV feature to obtain a BEV point cloud feature and/or an RV point cloud feature;
and concatenating the BEV point cloud feature, the RV point cloud feature and/or the single-point features to obtain the fused features.
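The back-projection and concatenation of claim 6 can be sketched as a gather-and-concatenate: per-point BEV and RV features are read back from the grid cells the points were projected into and stitched to the single-point features. The shapes and index layout below are assumptions chosen to match the earlier sketches.

```python
import numpy as np

def back_project(grid_features, rows, cols):
    """Read back, for every point, the feature of the grid cell it fell into,
    turning a 2D feature map into per-point ('point cloud') features again."""
    return grid_features[rows, cols]                          # (N, C)

def fuse_features(single_point_feats, bev_grid, bev_idx, rv_grid, rv_idx):
    bev_point_feats = back_project(bev_grid, *bev_idx)        # BEV point cloud features
    rv_point_feats = back_project(rv_grid, *rv_idx)           # RV point cloud features
    # concatenation along the channel axis gives the fused features
    return np.concatenate([single_point_feats, bev_point_feats, rv_point_feats], axis=1)

n = 1000
single_point_feats = np.random.rand(n, 64).astype(np.float32)   # from the point branch
bev_grid = np.random.rand(176, 200, 32).astype(np.float32)      # second BEV feature map
rv_grid = np.random.rand(64, 512, 16).astype(np.float32)        # second RV feature map
bev_idx = (np.random.randint(0, 176, n), np.random.randint(0, 200, n))
rv_idx = (np.random.randint(0, 64, n), np.random.randint(0, 512, n))
print(fuse_features(single_point_feats, bev_grid, bev_idx, rv_grid, rv_idx).shape)  # (1000, 112)
```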
7. The method of claim 1, wherein the acquiring the target point cloud data of the surrounding environment of the body of the target vehicle comprises:
acquiring original point cloud data of the surrounding environment of the body of the target vehicle; wherein the original point cloud data is obtained by scanning the surrounding environment of the vehicle body with a lidar device arranged on the vehicle body;
and preprocessing the original point cloud data to obtain the target point cloud data; wherein the target point cloud data comprises coordinate data and reflection intensity values of all target points; the target points are reflection points, contained in the original point cloud data, of a plurality of laser beams emitted by the lidar device, and each reflection point has a respective reflection intensity value.
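A minimal sketch of the preprocessing in claim 7, assuming it amounts to discarding invalid returns and cropping to a region of interest, with each target point carrying coordinates and a reflection intensity value; the ranges below are illustrative.

```python
import numpy as np

def preprocess(raw_points, x_range=(-70.0, 70.0), y_range=(-40.0, 40.0),
               z_range=(-3.0, 3.0)):
    """Turn raw lidar returns into target point cloud data: keep only finite
    values, crop to a region of interest, and retain (x, y, z, intensity)."""
    pts = raw_points[np.isfinite(raw_points).all(axis=1)]
    keep = ((pts[:, 0] >= x_range[0]) & (pts[:, 0] <= x_range[1]) &
            (pts[:, 1] >= y_range[0]) & (pts[:, 1] <= y_range[1]) &
            (pts[:, 2] >= z_range[0]) & (pts[:, 2] <= z_range[1]))
    return pts[keep, :4]

raw = np.random.randn(8000, 4) * [30.0, 30.0, 2.0, 0.5]      # simulated raw returns
print(preprocess(raw).shape)
```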
8. An apparatus for detecting an object, comprising:
an acquisition module, configured to acquire target point cloud data of the surrounding environment of a body of a target vehicle, service information, and a computing power value of the target vehicle; wherein the service information comprises a service type;
a determining and extracting module, configured to determine corresponding feature types according to the computing power value of the target vehicle and, when there are a plurality of feature types, respectively extract features of the target point cloud data under each feature type; wherein different feature types correspond to different viewing angles;
a fusion processing module, configured to perform fusion processing on the features of the target point cloud data under all feature types to obtain fused features;
and a detection module, configured to process the fused features according to the network model corresponding to the service type, so as to detect the target object in the surrounding environment of the vehicle body and obtain a detection result.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
and the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method for detecting an object according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions, when executed by a processor, are used to implement the method for detecting an object according to any one of claims 1 to 7.
CN202310897182.6A 2023-07-20 2023-07-20 Target object detection method and device, electronic equipment and storage medium Pending CN116844128A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310897182.6A CN116844128A (en) 2023-07-20 2023-07-20 Target object detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310897182.6A CN116844128A (en) 2023-07-20 2023-07-20 Target object detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116844128A true CN116844128A (en) 2023-10-03

Family

ID=88167019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310897182.6A Pending CN116844128A (en) 2023-07-20 2023-07-20 Target object detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116844128A (en)

Similar Documents

Publication Publication Date Title
US11632536B2 (en) Method and apparatus for generating three-dimensional (3D) road model
CN111401208B (en) Obstacle detection method and device, electronic equipment and storage medium
CN111192295B (en) Target detection and tracking method, apparatus, and computer-readable storage medium
CN107636680B (en) Obstacle detection method and device
CN113362444B (en) Point cloud data generation method and device, electronic equipment and storage medium
CN110738121A (en) front vehicle detection method and detection system
WO2021072710A1 (en) Point cloud fusion method and system for moving object, and computer storage medium
CN109214348A (en) A kind of obstacle detection method, device, equipment and storage medium
CN113378760A (en) Training target detection model and method and device for detecting target
EP3926360A1 (en) Neural network based methods and systems for object detection using concatenated lidar, radar and camera data sets
JPH07129898A (en) Obstacle detecting device
CN112270272B (en) Method and system for extracting road intersections in high-precision map making
CN111190199B (en) Positioning method, positioning device, computer equipment and readable storage medium
CN109583416B (en) Pseudo lane line identification method and system
CN113366341B (en) Point cloud data processing method and device, storage medium and laser radar system
EP3324359B1 (en) Image processing device and image processing method
US11933884B2 (en) Radar image processing device, radar image processing method, and storage medium
CN113256709A (en) Target detection method, target detection device, computer equipment and storage medium
CN116547562A (en) Point cloud noise filtering method, system and movable platform
CN107563333A (en) A kind of binocular vision gesture identification method and device based on ranging auxiliary
CN116844128A (en) Target object detection method and device, electronic equipment and storage medium
CN116168384A (en) Point cloud target detection method and device, electronic equipment and storage medium
Godfrey et al. Evaluation of Flash LiDAR in Adverse Weather Conditions towards Active Road Vehicle Safety
CN113014899B (en) Binocular image parallax determination method, device and system
CN115236672A (en) Obstacle information generation method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination