CN116012806B - Vehicle detection method, device, detector, system and model training method - Google Patents

Vehicle detection method, device, detector, system and model training method

Info

Publication number
CN116012806B
CN116012806B
Authority
CN
China
Prior art keywords
map
image
road
feature
vehicle
Prior art date
Legal status
Active
Application number
CN202310318133.2A
Other languages
Chinese (zh)
Other versions
CN116012806A (en)
Inventor
赵云
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310318133.2A
Publication of CN116012806A
Application granted
Publication of CN116012806B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The application relates to the technical field of automatic driving and discloses a vehicle detection method, a device, a detector, a system and a model training method, comprising the steps of: obtaining a feature depth map corresponding to an image feature map, the image feature map corresponding to an external environment image of the vehicle; acquiring a road layout aerial view of the current position of the automatic driving vehicle, the road layout aerial view comprising road elements; estimating the weight of each image feature in the image feature map belonging to a road element; generating a corrected image of the road elements for the feature depth map according to the road layout aerial view; correcting the feature depth map by using the corrected image and the weights to obtain a corrected feature depth map; and performing vehicle detection outside the automatic driving vehicle according to the corrected feature depth map and the image feature map, and outputting a 3D vehicle detection result. Because the road layout aerial view containing road elements is used to assist the detection process, the feature conversion accuracy of non-vehicle areas in the feature depth map is improved, which in turn improves the accuracy of the vehicle detection result.

Description

Vehicle detection method, device, detector, system and model training method
Technical Field
The present application relates to the field of autopilot technology, and in particular, to a vehicle detection method, apparatus, detector, system, model training method, and computer readable storage medium.
Background
The automatic driving technology can sense the environment and navigate without manual operation, thereby realizing automatic driving of the vehicle. An autonomous vehicle needs to accurately identify vehicles in the surrounding environment, locate them, and predict their speed. Image data captured by vehicle-mounted cameras, which is low-cost and information-rich, is widely used for detecting the surroundings of autonomous vehicles.
When the automatic driving system detects vehicles outside the autonomous vehicle, it reads image information of the external environment and outputs the position, length, width, height and speed of vehicles in three-dimensional space. The existing detection process first obtains image features of the external environment image through a deep neural network, maps the image features into three-dimensional space through a neural network or camera parameters to form 3D spatial features or bird's-eye-view features, and then completes detection of vehicles in 3D space or under a bird's-eye-view coordinate system. However, accurate depth estimation is difficult to accomplish based on image information alone, and the feature mapping process often causes misalignment and confusion between 3D spatial features and real objects, which affects the detection result.
Therefore, how to solve the above technical problems should be of great interest to those skilled in the art.
Disclosure of Invention
The purpose of the application is to provide a vehicle detection method, a device, a detector, a system, a model training method and a computer readable storage medium, so as to improve the accuracy of mapping image features to a 3D space, and further enable a vehicle detection result to be more accurate.
In order to solve the above technical problems, the present application provides a vehicle detection method, including:
acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle;
acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
estimating the weight of the image feature belonging to the road element in the image feature map;
generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map;
and detecting the exterior of the automatic driving vehicle according to the corrected feature depth map and the image feature map, and outputting a 3D vehicle detection result.
Optionally, generating a corrected image of the road element for the feature depth map according to the road layout aerial view includes:
generating a road element depth feature map and a road element mask map according to the road layout aerial view map;
correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
and correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
Optionally, using the road element depth feature map, the road element mask map, and the weights, correcting the feature depth map through weighted summation and masking, and obtaining a corrected feature depth map includes:
correcting the characteristic depth map according to a preset formula, wherein the preset formula is as follows:

D_m(u,v) = (1 - W_r(u,v)·M_r(u,v))·D_C(u,v) + W_r(u,v)·M_r(u,v)·D_r(u,v)

where D_m(u,v) is the depth feature of point (u,v) in the corrected feature depth map, W_r(u,v) is the value corresponding to point (u,v) in the road element weight map corresponding to the image feature map, M_r(u,v) is the value corresponding to point (u,v) in the weight map M_r corresponding to the road element mask map, D_r(u,v) is the feature corresponding to point (u,v) of the road element depth feature map, and D_C(u,v) is the feature corresponding to point (u,v) of the feature depth map.
Optionally, generating the road element depth feature map and the road element mask map according to the road layout aerial view includes:
quantifying the road layout aerial view to obtain a road information map;
determining coordinates of points corresponding to all the road elements in the road information graph in a world coordinate system to form a road element point set;
converting each coordinate in the road element point set into an image coordinate system to obtain a road element depth map and the road element mask map;
and obtaining the road element depth feature map according to the road element depth map.
Optionally, converting each of the coordinates in the set of road element points into an image coordinate system includes:
and converting each coordinate in the road element point set into an image coordinate system through an image acquisition device parameter, a rotation translation relation between the image acquisition device and the automatic driving vehicle, and a rotation translation relation between the pose of the automatic driving vehicle and the world coordinate system.
Optionally, converting each of the coordinates in the set of road element points into an image coordinate system includes:
and converting each coordinate in the road element point set into an image coordinate system by matrix operation.
Optionally, obtaining the road element depth feature map according to the road element depth map includes:
expanding each element value in the road element depth map into a D-dimensional vector; the element values are depth values corresponding to points in the road element depth map in an image acquisition equipment coordinate system;
the D-dimensional vector is normalized such that the sum of all values in the D-dimensional vector is equal to 1.
Optionally, quantifying the road layout aerial view, and obtaining the road information map includes:
screening points in the road layout aerial view, which are located in the road element range, and points which are not located in the road element range;
setting the value of a point in the road element range to be 1, and setting the value of a point which is not in the road element range to be 0, thereby obtaining a road information graph.
Optionally, when the road element is a road surface, determining coordinates of points corresponding to all the road elements in the road information map in a world coordinate system includes:
And determining X values, Y values and Z values of points corresponding to all road surfaces in the road information map in a world coordinate system, wherein the Z value is equal to 0.
Optionally, the method further comprises:
enhancing the image features in the image feature map to obtain an enhanced image feature map;
correspondingly, the vehicle detection according to the corrected feature depth map and the image feature map comprises the following steps:
and detecting the vehicle according to the corrected characteristic depth map and the enhanced image characteristic map.
Optionally, the detecting the vehicle according to the corrected feature depth map and the image feature map includes:
generating a 3D feature map under an image coordinate system by the corrected feature depth map and the image feature map;
converting the 3D feature map to a BEV feature map;
and detecting the vehicle by using the BEV characteristic diagram.
Optionally, converting the 3D feature map to a BEV feature map includes:
converting the characteristics of each point in the 3D characteristic map into a vehicle coordinate system through the parameters of the image acquisition equipment and the rotation translation relation between the image acquisition equipment and the vehicle;
carrying out voxelization on points in the vehicle coordinate system, accumulating the characteristic points falling into the same voxel grid, setting the characteristic of the voxel grid without the characteristic points falling into 0, and forming a 3D characteristic map in the vehicle coordinate system;
And accumulating the features corresponding to the voxel grids at all heights in the height dimension to obtain the BEV feature map.
Optionally, the method further comprises:
acquiring the vehicle external environment image acquired by vehicle-mounted image acquisition equipment;
and extracting image features of the vehicle external environment image to obtain the image feature map, wherein the image feature map comprises the image features.
Optionally, acquiring the feature depth map corresponding to the image feature map includes:
and inputting the image feature map into a first preset neural network model to perform depth estimation, and obtaining the feature depth map.
Optionally, estimating the weights of the image features belonging to the road elements in the image feature map includes:
and inputting the image features into a second preset neural network model to obtain the weights of the image features belonging to the road elements.
Optionally, the method further comprises:
acquiring the external environment image of the vehicle; the number of the external environment images of the vehicle is at least two, and each external environment image of the vehicle is acquired by vehicle-mounted image acquisition equipment at different positions;
correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
Correcting each characteristic depth map by using the corrected image and the weight to obtain a plurality of corrected characteristic depth maps;
according to the corrected feature depth map and the image feature map, vehicle detection is carried out on the outside of the automatic driving vehicle, and the output of the 3D vehicle detection result comprises:
according to each corrected feature depth map and the image feature map corresponding to the corrected feature depth map, vehicle detection is carried out on the exterior of the automatic driving vehicle, and a 3D vehicle detection result to be combined is obtained;
and combining all the 3D vehicle detection results to be combined, and outputting a 3D vehicle detection result.
The application also provides a vehicle detection device, including:
the first acquisition module is used for acquiring a characteristic depth map corresponding to the image characteristic map; the image feature map corresponds to an external environment image of the vehicle;
the second acquisition module is used for acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
the estimating module is used for estimating the weight of the image feature belonging to the road element in the image feature map;
the first generation module is used for generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
The correction module is used for correcting the characteristic depth map by utilizing the corrected image and the weight to obtain a corrected characteristic depth map;
and the detection module is used for detecting the vehicle outside the automatic driving vehicle according to the corrected characteristic depth map and the image characteristic map and outputting a 3D vehicle detection result.
The application also provides a vehicle detection model training method, wherein the vehicle detection model comprises a first preset neural network model and a second preset neural network model, and the method comprises the following steps:
inputting an image feature map into the first preset neural network model for depth estimation to obtain a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle;
acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
inputting the image features in the image feature map into the second preset neural network model, and estimating the weights of the image features belonging to the road elements;
generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map;
According to the corrected feature depth map and the image feature map, detecting the exterior of the automatic driving vehicle, and outputting a 3D vehicle detection result;
and training the vehicle detection model based on the loss function to obtain a trained vehicle detection model.
Optionally, the method further comprises:
setting the region with the road element labeling error in the road element mask map to 0, and generating a corrected mask map;
training the second preset neural network model by using the corrected mask map.
Optionally, training the vehicle detection model based on the loss function includes:
based on
Figure SMS_8
Training the vehicle detection model;
in the formula ,
Figure SMS_10
estimating the loss for the depth of the image feature,/->
Figure SMS_13
Loss for vehicle classification->
Figure SMS_15
Estimating losses for vehicle position, length, width, speed,/->
Figure SMS_9
Training loss of road weight corresponding to image feature map, < ->
Figure SMS_12
Estimating the lost weight for the depth of image feature, +.>
Figure SMS_14
Weight lost for vehicle classification, +.>
Figure SMS_16
Weight lost for vehicle position, length, width, speed estimation, < >>
Figure SMS_11
And training the lost weight for the road weight corresponding to the image feature map.
The present application also provides a detector comprising:
a memory for storing a computer program;
And the processor is used for realizing the steps of any one of the vehicle detection methods and the steps of any one of the vehicle detection model training methods when executing the computer program.
The present application also provides a vehicle detection system, comprising:
a detector as described above;
and the image acquisition device is connected with the detector.
The present application also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of any one of the above-mentioned vehicle detection methods, or the steps of any one of the above-mentioned vehicle detection model training methods.
The vehicle detection method provided by the application comprises the following steps: acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle; acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements; estimating the weight of the image feature belonging to the road element in the image feature map; generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view; correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map; and detecting the exterior of the automatic driving vehicle according to the corrected feature depth map and the image feature map, and outputting a 3D vehicle detection result.
Therefore, when the vehicle is detected, a road layout aerial view of the current position of the automatic driving vehicle and the weights of the image features in the image feature map belonging to road elements are obtained, a corrected image for the feature depth map is generated from the road layout aerial view, and the feature depth map is then corrected using the corrected image and the weights. Since the road layout aerial view includes road elements, the corrected image improves the feature conversion accuracy of non-vehicle areas in the feature depth map and reduces the influence of background information on vehicle detection, thereby improving the accuracy of the feature depth map, the accuracy of mapping image features to 3D space, and ultimately the accuracy of the vehicle detection result.
In addition, the application also provides a device, a detector, a system, a model training method and a computer readable storage medium having the above advantages.
Drawings
For a clearer description of embodiments of the present application or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description that follow are only some embodiments of the present application, and that other drawings may be obtained from these drawings by a person of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of a vehicle detection method according to an embodiment of the present application;
FIG. 2 is a flow chart of another vehicle detection method according to an embodiment of the present application;
FIG. 3 is a flow chart of another vehicle detection method according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a vehicle detection method according to an embodiment of the present application;
fig. 5 is a block diagram of a vehicle detection device according to an embodiment of the present application;
FIG. 6 is a block diagram of a detector provided in an embodiment of the present application;
fig. 7 is a block diagram of a vehicle detection system according to an embodiment of the present application.
Detailed Description
In order to provide a better understanding of the present application, the present application is described in further detail below with reference to the drawings and specific embodiments. It is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the scope of protection of the present application.
As described in the background section, when the automatic driving system detects the surrounding vehicle, depth estimation is completed based on the external environment image information, and in the process of mapping the features into the three-dimensional space, dislocation and confusion between the 3D space features and the real objects are often caused, so that the detection result is affected.
In view of this, the present application provides a vehicle detection method, please refer to fig. 1, including:
step S101: acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an image of the environment outside the vehicle.
As an implementation manner, acquiring a feature depth map corresponding to an image feature map includes: acquiring an image feature map corresponding to an external environment image of a vehicle; and acquiring a feature depth map corresponding to the image feature map. Alternatively, the feature depth map that has been processed is obtained directly.
The acquiring of the image feature map corresponding to the vehicle external environment image includes:
and inputting the external environment image of the vehicle into a third preset neural network model for image feature extraction to obtain an image feature map. The third preset neural network model is a multi-layer convolutional neural network, and the number of layers and the number of channels are set in a feasible manner.
Optionally, before acquiring the feature depth map corresponding to the image feature map, the method may further include:
and acquiring the vehicle external environment image.
The vehicle external environment image may be acquired by an in-vehicle image acquisition device. In the present application, the number of the external environment images of the vehicle is not limited, and may be one or two or more, as the case may be.
Optionally, acquiring the feature depth map corresponding to the image feature map includes:
and inputting the image feature map into a first preset neural network model to perform depth estimation, and obtaining the feature depth map.
The image features in the vehicle external environment image are obtained by locally extracting the features of the vehicle external environment image.
The process of obtaining the feature depth map D_C from the image features F_0 through the first preset neural network model may refer to the related art and is not described in detail here. The first preset neural network model is a multi-layer convolutional neural network whose number of layers and channels can be set as needed.

The feature depth map D_C ∈ R^(H_F×W_F×D), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and D is the number of depth quantization bins: the designated depth range [depth_min, depth_max] is divided into D bins, and the value of the i-th bin (1 ≤ i ≤ D) represents the probability that the depth of the current feature point lies within the depth range of the i-th bin.
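As an illustration of this representation, the following is a minimal PyTorch sketch of a depth head that outputs such a per-location depth distribution; the layer sizes and bin count are illustrative assumptions rather than values specified by this application:

import torch
import torch.nn as nn

class DepthHead(nn.Module):
    # Sketch of the first preset neural network model: predicts a D-bin
    # depth distribution for every feature location. Layer sizes and
    # num_bins are illustrative assumptions.
    def __init__(self, in_channels: int = 256, num_bins: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, num_bins, 1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H_F, W_F) image feature map F_0
        logits = self.net(feat)       # (B, D, H_F, W_F)
        # softmax over the depth dimension: each (u, v) location holds D
        # probabilities that sum to 1, one per bin of [depth_min, depth_max]
        return logits.softmax(dim=1)  # feature depth map D_C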
Step S102: acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout bird's eye view includes road elements.
Road elements refer to objects that are present or appear on a road.
It should be noted that, the road elements are not limited in this application, and include static road elements and dynamic road elements. Wherein, the static road elements can be pavements, crosswalks, signal lamps and the like; the dynamic road element may be a pedestrian, a bicycle, a traveling vehicle, etc.
In the present application, the method for obtaining the road layout bird's eye view is not limited, and may be used as appropriate. For example, the road map can be obtained by means of aerial photography, mapping, labeling of static road elements and the like.
The dynamic road elements may be obtained by a dynamic road element self-positioning system or by a dynamic object detection instrument external to the autonomous vehicle, the detected dynamic road elements including the location and size of the dynamic road elements.
Step S103: and estimating the weight of the image feature belonging to the road element in the image feature map.
The image feature map is an image formed by extracting features of an external environment image of the vehicle.
As an implementation manner, estimating the weights of the image features belonging to the road elements in the image feature map includes:
and inputting the image features into a second preset neural network model to obtain the weights of the image features belonging to the road elements.
The weights W_r ∈ R^(H_F×W_F), where R denotes the real numbers and H_F and W_F are the dimensions of the image features. The weight W_r represents the probability that an image feature belongs to a road element, and the weight of each image feature in the image feature map is between 0 and 1.
The second preset neural network model is a multi-layer convolutional neural network whose number of layers and channels can be set as needed.
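A minimal sketch of such a weight-estimating head, again assuming a PyTorch-style implementation with illustrative layer sizes; the sigmoid keeps each weight between 0 and 1:

import torch
import torch.nn as nn

class RoadWeightHead(nn.Module):
    # Sketch of the second preset neural network model: estimates, for
    # every feature location, the probability W_r that the feature
    # belongs to a road element. Layer sizes are assumptions.
    def __init__(self, in_channels: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, in_channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // 2, 1, 1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H_F, W_F) -> W_r: (B, 1, H_F, W_F), values in (0, 1)
        return torch.sigmoid(self.net(feat))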
Step S104: and generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view.
Step S105: and correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map.
The corrected feature depth map D_m ∈ R^(H_F×W_F×D) is obtained after correcting the feature depth map D_C, where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and D is the number of depth quantization bins.
Step S106: and detecting the exterior of the automatic driving vehicle according to the corrected feature depth map and the image feature map, and outputting a 3D vehicle detection result.
The 3D vehicle detection results include a 3D position, a length, a width, a speed, and a classification score of a target vehicle outside the autonomous vehicle.
It should be noted that the image feature map utilized in this step may be a feature map obtained by directly extracting features from the external environment image of the vehicle, or an enhanced image feature map obtained by further enhancing such a feature map; both fall within the protection scope of the present application.
In this embodiment, when detecting vehicles, a road layout aerial view of the current position of the autonomous vehicle and the weights of the image features in the image feature map belonging to road elements are obtained, and a corrected image for the feature depth map is generated from the road layout aerial view. The feature depth map is then corrected using the corrected image and the weights. Since the road layout aerial view includes road elements, the corrected image improves the feature conversion accuracy of non-vehicle areas in the feature depth map and reduces the influence of background information on vehicle detection, thereby improving the accuracy of the feature depth map, the accuracy of mapping image features to 3D space, and ultimately the accuracy of the vehicle detection result.
On the basis of the above embodiments, in one embodiment of the present application, a vehicle detection method includes:
step S201: acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an image of the environment outside the vehicle.
Step S202: acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout bird's eye view includes road elements.
Step S203: and estimating the weight of the image feature belonging to the road element in the image feature map.
Step S204: and generating a road element depth feature map and a road element mask map according to the road layout aerial view.
The corrected image in this embodiment includes a road element depth feature map and a road element mask map.
The road element depth feature map is beneficial to improving feature conversion accuracy of non-vehicle regions in image features, reducing influence of background information on vehicle detection, and further improving accuracy of vehicle detection.
As one embodiment, generating the road element depth feature map and the road element mask map from the road layout bird's eye view map includes:
step S2041: and quantifying the road layout aerial view to obtain a road information map.
The road information map is obtained based on the coordinate system of the automatic driving vehicle.
Optionally, quantifying the road layout aerial view, and obtaining the road information map includes:
screening points in the road layout aerial view, which are located in the road element range, and points which are not located in the road element range;
setting the value of a point in the road element range to be 1, and setting the value of a point which is not in the road element range to be 0, thereby obtaining a road information graph.
A rectangular area of preset length and width under the bird's-eye view can be quantized into a road information map of size H_r × W_r, where H_r and W_r are the height and width of the road information map, respectively. The value of each point in the road information map is 0 or 1: the value is 1 when the point is located within a road element and 0 when it is not.
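A minimal NumPy sketch of this quantization; inside_road(x, y) is a hypothetical helper that tests whether a bird's-eye-view point lies within a road element of the layout, and the ranges and resolution are illustrative:

import numpy as np

def build_road_info_map(inside_road, x_min, x_max, y_min, y_max, res=0.5):
    # Quantize a rectangular BEV area into a 0/1 road information map.
    # inside_road(x, y) -> bool is an assumed membership test.
    xs = np.arange(x_min, x_max, res)
    ys = np.arange(y_min, y_max, res)
    info = np.zeros((len(ys), len(xs)), dtype=np.uint8)  # H_r x W_r
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            info[i, j] = 1 if inside_road(x, y) else 0
    return info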
Step S2042: and determining coordinates of points corresponding to all the road elements in the road information graph in a world coordinate system to form a road element point set.
As an implementation manner, when the road element is a road surface, determining coordinates of points corresponding to all the road elements in the road information map in a world coordinate system includes:
and determining X values, Y values and Z values of points corresponding to all road surfaces in the road information map in a world coordinate system, wherein the Z value is equal to 0.
Since the road surface is horizontal, Z = 0 in the world coordinate system. Correspondingly, the set of road surface points is {(x, y, 0)}, where (x, y, 0) are the coordinates of a road surface point in the world coordinate system, taken over all points whose corresponding value in the road surface information map is 1.
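Continuing the sketch above, the road element point set can be collected from the road information map as follows (grid origin and resolution are assumed parameters):

import numpy as np

def road_points_from_info_map(info, x_min, y_min, res=0.5):
    # Collect world coordinates (X, Y, Z=0) of every point whose value
    # in the road information map is 1 (horizontal-road assumption).
    ii, jj = np.nonzero(info)
    xs = x_min + jj * res
    ys = y_min + ii * res
    zs = np.zeros_like(xs, dtype=float)
    return np.stack([xs, ys, zs], axis=1)  # (N, 3) road element point set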
Step S2043: and converting each coordinate in the road element point set into an image coordinate system to obtain a road element depth map and the road element mask map.
The road element depth map is of size H_F × W_F, where H_F and W_F are the dimensions of the image features; each of its values represents the depth value of a road element point in the image coordinate system.

The road element mask map M_r is of size H_F × W_F. Each value of the road element mask map is 0 or 1, where 0 represents that no road element point exists at the current location and 1 represents that a road element point exists at the current location.
Optionally, converting each of the coordinates in the set of road element points into an image coordinate system includes:
and converting each coordinate in the road element point set into an image coordinate system through an image acquisition device parameter, a rotation translation relation between the image acquisition device and the automatic driving vehicle, and a rotation translation relation between the pose of the automatic driving vehicle and the world coordinate system.
Optionally, converting each of the coordinates in the set of road element points into an image coordinate system includes:
and converting each coordinate in the road element point set into an image coordinate system by matrix operation.
The specific process of converting each coordinate in the road element point set to the image coordinate system may refer to the related art, and will not be described in detail herein.
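A hedged sketch of this conversion chain: K is the 3x3 camera intrinsic matrix, T_cam_ego and T_ego_world are 4x4 rotation-translation matrices, and stride is an assumed pixel-to-feature downsampling factor; all names are illustrative:

import numpy as np

def project_road_points(points_w, K, T_cam_ego, T_ego_world, hf, wf, stride=8):
    # Convert world-frame road element points into the image coordinate
    # system by matrix operations, producing a road element depth map and
    # mask map at feature resolution (H_F x W_F).
    n = points_w.shape[0]
    pts = np.concatenate([points_w, np.ones((n, 1))], axis=1).T  # (4, N)
    cam = (T_cam_ego @ (T_ego_world @ pts))[:3]                  # camera frame
    z = cam[2]
    front = z > 0.1                      # keep points in front of the camera
    uv = (K @ cam[:, front]) / z[front]  # homogeneous -> pixel coordinates
    u = np.floor(uv[0] / stride).astype(int)
    v = np.floor(uv[1] / stride).astype(int)
    ok = (u >= 0) & (u < wf) & (v >= 0) & (v < hf)
    depth_map = np.zeros((hf, wf))
    mask = np.zeros((hf, wf), dtype=np.uint8)
    depth_map[v[ok], u[ok]] = z[front][ok]  # last write wins; a min-depth
    mask[v[ok], u[ok]] = 1                  # z-buffer would also be reasonable
    return depth_map, mask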
Step S2044: and obtaining the road element depth feature map according to the road element depth map.
Optionally, obtaining the road element depth feature map according to the road element depth map includes:
step S2044a: expanding each element value in the road element depth map into a D-dimensional vector; the element values are depth values corresponding to points in the road element depth map in an image acquisition equipment coordinate system.
The dimension D in the D-dimensional vector is the same value as the depth quantization number D.
The values of the D-dimensional vector follow a Gaussian function whose mean is the element value (the depth of the corresponding road element point) and whose variance is a preset value.
Step S2044b: the D-dimensional vector is normalized such that the sum of all values in the D-dimensional vector is equal to 1.
The obtained road element depth feature map D_r ∈ R^(H_F×W_F×D), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and D is the dimension of the vector into which each element value of the road element depth map is expanded.
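A minimal NumPy sketch of this expansion and normalization; sigma is an assumed hyperparameter, and gating by the mask map is an assumption for tidiness (formula (1) only consults D_r where the mask is 1):

import numpy as np

def expand_depth_to_bins(depth_map, mask, depth_min, depth_max, num_bins, sigma=1.0):
    # Expand each road depth value into a normalized D-dimensional vector
    # whose entries follow a Gaussian centered on the depth value.
    centers = np.linspace(depth_min, depth_max, num_bins)    # bin centers
    d = depth_map[..., None]                                 # (H_F, W_F, 1)
    feat = np.exp(-((centers - d) ** 2) / (2 * sigma ** 2))  # Gaussian per bin
    feat /= feat.sum(axis=-1, keepdims=True) + 1e-9          # bins sum to 1
    feat *= mask[..., None]                                  # zero non-road cells
    return feat                                              # D_r: (H_F, W_F, D)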
Step S205: and correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
Optionally, using the road element depth feature map, the road element mask map, and the weights, correcting the feature depth map through weighted summation and masking, and obtaining a corrected feature depth map includes:
correcting the characteristic depth map according to a preset formula, wherein the preset formula is as follows:

D_m(u,v) = (1 - W_r(u,v)·M_r(u,v))·D_C(u,v) + W_r(u,v)·M_r(u,v)·D_r(u,v)  (1);

where D_m(u,v) is the depth feature of point (u,v) in the corrected feature depth map, W_r(u,v) is the value corresponding to point (u,v) in the road element weight map corresponding to the image feature map, M_r(u,v) is the value corresponding to point (u,v) in the weight map M_r corresponding to the road element mask map, D_r(u,v) is the feature corresponding to point (u,v) of the road element depth feature map, and D_C(u,v) is the feature corresponding to point (u,v) of the feature depth map. Since every point of the road element mask map participates in the operation and takes only the value 0 or 1, these values act as weights and form a weight map; the weight map M_r corresponding to the road element mask map is therefore essentially the road element mask map itself.

In the corrected feature depth map, positions without road elements (M_r(u,v) = 0) are dominated by the depth estimated from the image features, while positions with road elements (M_r(u,v) = 1) are a weighted combination of the image feature depth estimate and the road element depth feature map.
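In code, formula (1) reduces to a per-location weighted sum, as in the following minimal NumPy sketch (shapes follow the definitions above):

import numpy as np

def correct_feature_depth(d_c, d_r, w_r, m_r):
    # Weighted-sum-and-mask correction of the feature depth map.
    # d_c, d_r: (H_F, W_F, D); w_r, m_r: (H_F, W_F).
    w = (w_r * m_r)[..., None]        # combined road weight per location
    return (1.0 - w) * d_c + w * d_r  # corrected feature depth map D_m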
Step S206: and detecting the exterior of the automatic driving vehicle according to the corrected feature depth map and the image feature map, and outputting a 3D vehicle detection result.
On the basis of any one of the foregoing embodiments, in one embodiment of the present application, the vehicle detection method further includes:
acquiring the vehicle external environment image acquired by vehicle-mounted image acquisition equipment;
and extracting image features of the vehicle external environment image to obtain the image feature map, wherein the image feature map comprises the image features.
Image feature extraction can be performed through a deep neural network to obtain image features F_0 ∈ R^(H_F×W_F×C), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and C is the number of channels of the image features. The specific extraction process may refer to the related art and is not described in detail here.
On the basis of any one of the foregoing embodiments, in one embodiment of the present application, the vehicle detection method further includes:
enhancing the image features in the image feature map to obtain an enhanced image feature map;
correspondingly, the vehicle detection according to the corrected feature depth map and the image feature map comprises the following steps:
and detecting the vehicle according to the corrected characteristic depth map and the enhanced image characteristic map.
The image features in the image feature map are input into a fourth preset neural network model to obtain the enhanced image feature map. The fourth preset neural network model is a multi-layer convolutional neural network whose number of layers and channels can be set as needed.

The enhanced image feature map F_c ∈ R^(H_F×W_F×C), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and C is the number of channels of the image features.
In the embodiment, the image characteristics are enhanced, so that the expression capability of the image characteristics can be further improved, and the accuracy of vehicle detection is further improved.
Referring to fig. 3, in one embodiment of the present application, the vehicle detection method includes:
step S301: acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an image of the environment outside the vehicle.
Step S302: acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout bird's eye view includes road elements.
Step S303: and estimating the weight of the image feature belonging to the road element in the image feature map.
Step S304: and generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view.
Step S305: and correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map.
Step S306: and generating a 3D feature map under an image coordinate system by the corrected feature depth map and the image feature map.
The process of generating the 3D feature map from the corrected feature depth map D_m and the image feature map may refer to the related art and is not described in detail here. The 3D feature map G_m ∈ R^(H_F×W_F×D×C) in the image coordinate system has, at each point, the feature G_m(u,v,i) = D_m(u,v,i) × F_C(u,v), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, D is the number of depth quantization bins, and C is the number of channels of the image features.
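This per-point product is an outer product between the depth distribution and the feature vector at each location; a minimal PyTorch sketch:

import torch

def lift_to_3d(d_m: torch.Tensor, f_c: torch.Tensor) -> torch.Tensor:
    # G_m(u, v, i) = D_m(u, v, i) * F_c(u, v).
    # d_m: (H_F, W_F, D) corrected depth distribution,
    # f_c: (H_F, W_F, C) image features -> G_m: (H_F, W_F, D, C).
    return d_m.unsqueeze(-1) * f_c.unsqueeze(-2)  # broadcast outer product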
Step S307: the 3D feature map is converted to a BEV feature map.
As an implementation manner, converting the 3D feature map into the BEV feature map includes:
converting the characteristics of each point in the 3D characteristic map into a vehicle coordinate system through the parameters of the image acquisition equipment and the rotation translation relation between the image acquisition equipment and the vehicle;
Carrying out voxelization on points in the vehicle coordinate system, accumulating the characteristic points falling into the same voxel grid, setting the characteristic of the voxel grid without the characteristic points falling into 0, and forming a 3D characteristic map in the vehicle coordinate system;
and accumulating the features corresponding to the voxel grids at all heights in the height dimension to obtain the BEV feature map.
The specific process of converting the 3D feature map into the BEV feature map may refer to the related art, and will not be described in detail herein.
The BEV (Bird's-Eye-View) feature map is of size X × Y × C, where X and Y represent the dimensions of the BEV feature map and C is the number of channels of the image features.
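A minimal PyTorch sketch of the voxelization and height accumulation described above; the ranges and voxel size are illustrative assumptions:

import torch

def splat_to_bev(points_ego, feats, x_range, y_range, z_range, voxel=0.5):
    # Voxelize 3D feature points in the vehicle frame, sum the features
    # falling into the same voxel grid (empty voxels stay 0), then
    # accumulate over the height dimension to obtain the BEV feature map.
    # points_ego: (N, 3) xyz in the vehicle frame; feats: (N, C).
    nx = int((x_range[1] - x_range[0]) / voxel)
    ny = int((y_range[1] - y_range[0]) / voxel)
    nz = int((z_range[1] - z_range[0]) / voxel)
    ix = torch.floor((points_ego[:, 0] - x_range[0]) / voxel).long()
    iy = torch.floor((points_ego[:, 1] - y_range[0]) / voxel).long()
    iz = torch.floor((points_ego[:, 2] - z_range[0]) / voxel).long()
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny) & (iz >= 0) & (iz < nz)
    ix, iy, iz, feats = ix[keep], iy[keep], iz[keep], feats[keep]
    grid = torch.zeros(nx, ny, nz, feats.shape[1])
    flat = grid.view(-1, feats.shape[1])
    flat.index_add_(0, (ix * ny + iy) * nz + iz, feats)  # sum per voxel
    return grid.sum(dim=2)                               # (X, Y, C) BEV map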
Step S308: and detecting the vehicle by using the BEV characteristic diagram.
And (3) detecting the vehicle through a neural network by utilizing the BEV characteristic diagram, and outputting the 3D position, the length, the width, the height, the speed and the classification score of the target vehicle outside the automatic driving vehicle. The detection process may refer to the related art, and will not be described in detail herein.
On the basis of any one of the foregoing embodiments, in one embodiment of the present application, the vehicle detection method further includes:
acquiring the external environment image of the vehicle; the number of the external environment images of the vehicle is at least two, and each external environment image of the vehicle is acquired by vehicle-mounted image acquisition equipment at different positions;
Correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
correcting each characteristic depth map by using the corrected image and the weight to obtain a plurality of corrected characteristic depth maps;
according to the corrected feature depth map and the image feature map, vehicle detection is carried out on the outside of the automatic driving vehicle, and the output of the 3D vehicle detection result comprises:
according to each corrected feature depth map and the image feature map corresponding to the corrected feature depth map, vehicle detection is carried out on the exterior of the automatic driving vehicle, and a 3D vehicle detection result to be combined is obtained;
and combining all the 3D vehicle detection results to be combined, and outputting a 3D vehicle detection result.
It can be understood that the number of the feature depth maps is equal to the number of the external environment images of the vehicle, and accordingly, how many external environment images of the vehicle exist can obtain how many corrected feature depth maps. And carrying out one-time vehicle detection by utilizing the corrected feature depth map and the image feature map corresponding to the same vehicle external environment image, correspondingly obtaining a 3D vehicle detection result to be combined, and finally combining all the 3D vehicle detection results to be combined and outputting.
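The application does not prescribe a specific merging rule; as one hedged illustration, the per-camera detection lists can be concatenated and de-duplicated by score-ordered suppression of boxes whose bird's-eye-view centers nearly coincide:

import numpy as np

def merge_detections(det_lists, dist_thresh=2.0):
    # Merge per-camera 3D detections: keep the highest-scoring detection
    # among those whose BEV centers are closer than dist_thresh meters.
    # Each detection is a dict with at least 'center' (x, y) and 'score';
    # this greedy rule is an assumption, not mandated by the application.
    dets = sorted((d for lst in det_lists for d in lst),
                  key=lambda d: d["score"], reverse=True)
    merged = []
    for d in dets:
        c = np.asarray(d["center"], dtype=float)
        if all(np.linalg.norm(c - np.asarray(m["center"], dtype=float)) > dist_thresh
               for m in merged):
            merged.append(d)
    return merged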
The following describes a vehicle detection method in the present application, taking a road element as an example of a road surface. The corresponding architecture diagram of the vehicle detection method is shown in fig. 4.
Step 1: acquire the vehicle external environment image captured by a camera, and extract image features from it through a third preset deep neural network to obtain an image feature map F_0 ∈ R^(H_F×W_F×C), where R denotes the real numbers, H_F and W_F are the dimensions of the image features, and C is the number of channels of the image features;
step 2, image feature map F 0 Inputting the image characteristics into a fourth preset neural network model, and carrying out image characteristic enhancement on the image characteristics to obtain an enhanced image characteristic diagram
Figure SMS_47
Wherein R is a real number, H F and WF C is the channel number of the image feature;
step 3, image feature F 0 Inputting the image feature map into a first preset neural network model to obtain a feature depth map corresponding to the image feature map
Figure SMS_48
Wherein R is a real number, H F and WF The dimension of the image features is D, namely the number of depth quantization;
step 4, image feature map F 0 Inputting a second preset neural network model to obtain a road weight corresponding to the image feature map
Figure SMS_49
The weight of each image feature is between 0 and 1, and the road weight W r Representing the probability that the image features belong to the road surface;
step 5, obtaining road surface images through aerial photography, mapping or marking, and quantifying rectangular areas with specified length and width under bird views into road surface information maps based on an automatic driving vehicle coordinate system
Figure SMS_50
Wherein R is a real number, H r and Wr The height and the width of the pavement information map are respectively;
step 6, calculating the world coordinate system of all the points of the road surface in the road surface information mapThe X and Y values below and the Z value is set to 0 (representing a horizontal road surface) to form a set of road surface points in the world coordinate system
Figure SMS_51
;/>
Step 7: using the available camera intrinsic parameters, the rotation-translation relation between the camera and the vehicle, and the rotation-translation relation between the vehicle pose and the world coordinate system, convert the road surface points into the image coordinate system through matrix operations to form a road surface depth map and a road surface mask map, both of size H_F × W_F, where H_F and W_F are the dimensions of the image features;
step 8, mapping the road surface depth
Figure SMS_54
Expanding each element value into a D-dimensional vector, and normalizing to obtain a pavement depth feature map ++>
Figure SMS_55
Wherein R is a real number, H F and WF For the scale of the image features, D is the road element depth map +. >
Figure SMS_56
The dimension of the vector of each element value expansion;
step 9, inputting a characteristic depth map D C Road surface depth feature map D r Road weight W corresponding to image feature map r With road mask pattern
Figure SMS_57
Obtaining a corrected depth map +.>
Figure SMS_58
R is a real number, H F and WF For image featuresD is the number of quantized depths, and each value in the modified depth map is obtained by using formula (1):
Figure SMS_59
(1);
step 10, correcting the depth map D by the prior art m And enhanced image feature map F c Generating 3D feature maps in an image coordinate system
Figure SMS_60
Wherein R is a real number, H F and WF The dimension of the image features is that D is the number of depth quantization and C is the number of channels of the image features;
step 11, 3D characteristic diagram G m Conversion to BEV feature maps
Figure SMS_61
Wherein R is a real number, X, Y represents the dimension of the BEV feature map, and C is the number of channels of the image feature;
and 12, using the BEV characteristic diagram, performing vehicle detection through a neural network, and outputting the 3D position, the length, the width, the height, the speed and the classification score of the target vehicles around the automatic driving vehicle.
The application also provides a vehicle detection model training method, wherein the vehicle detection model comprises a first preset neural network model and a second preset neural network model, and the method comprises the following steps:
Step S401: inputting an image feature map into the first preset neural network model for depth estimation to obtain a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle;
step S402: acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
step S403: inputting the image features in the image feature map into the second preset neural network model, and estimating the weights of the image features belonging to the road elements;
step S404: generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
step S405: correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map;
step S406: according to the corrected feature depth map and the image feature map, detecting the exterior of the automatic driving vehicle, and outputting a 3D vehicle detection result;
step S407: and training the vehicle detection model based on the loss function to obtain a trained vehicle detection model.
On the basis of the above embodiment, in one embodiment of the present application, the vehicle detection model training method further includes:
Setting the region with the road element labeling error in the road element mask map to 0, and generating a corrected mask map;
and training a second preset neural network model by using the corrected mask map.
Each value of the road surface mask map M_r is 0 or 1, where 0 represents that no road element point exists at the current location and 1 represents that a road element point exists at the current location. A region with a road element labeling error is a region that is not actually a road element but is labeled as one in the road element mask map.
In this embodiment, the second preset neural network model is trained through the corrected mask map, so that the weight obtained through the trained second preset neural network model is more accurate, and further the vehicle detection accuracy is improved.
On the basis of any one of the foregoing embodiments, in one embodiment of the present application, training the vehicle detection model based on a loss function includes:
training the vehicle detection model based on a preset formula; the preset formula is:
L = w_1·L_d + w_2·L_cls + w_3·L_reg + w_4·L_r  (2);

where L_d is the depth estimation loss of the image features, L_cls is the vehicle classification loss, L_reg is the vehicle position, length, width and speed estimation loss, L_r is the training loss of the road weight corresponding to the image feature map, w_1 is the weight of the depth estimation loss of the image features, w_2 is the weight of the vehicle classification loss, w_3 is the weight of the vehicle position, length, width and speed estimation loss, and w_4 is the weight of the training loss of the road weight corresponding to the image feature map.
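A one-line sketch of formula (2); the individual loss terms come from the respective heads, and the weight values here are illustrative assumptions:

import torch

def total_loss(l_d, l_cls, l_reg, l_r, w=(1.0, 1.0, 1.0, 1.0)):
    # Formula (2): weighted sum of the image feature depth estimation
    # loss, vehicle classification loss, vehicle position/size/speed
    # estimation loss, and road weight training loss.
    return w[0] * l_d + w[1] * l_cls + w[2] * l_reg + w[3] * l_r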
The following describes a vehicle detection device provided in an embodiment of the present application, and the vehicle detection device described below and the vehicle detection method described above may be referred to correspondingly to each other.
Fig. 5 is a block diagram of a vehicle detection device according to an embodiment of the present application, and referring to fig. 5, the vehicle detection device may include:
a first obtaining module 100, configured to obtain a feature depth map corresponding to an image feature map; the image feature map corresponds to an external environment image of the vehicle;
a second obtaining module 200, configured to obtain a road layout aerial view of a current position of the autopilot vehicle; the road layout aerial view comprises road elements;
an estimating module 300, configured to estimate a weight of the road element to which the image feature belongs in the image feature map;
a first generation module 400, configured to generate a corrected image of the road element for the feature depth map according to the road layout aerial view;
the correction module 500 is configured to correct the feature depth map by using the corrected image and the weight, so as to obtain a corrected feature depth map;
And the detection module 600 is configured to perform vehicle detection on the exterior of the autonomous vehicle according to the corrected feature depth map and the image feature map, and output a 3D vehicle detection result.
The vehicle detection apparatus of this embodiment is used to implement the foregoing vehicle detection method, so its specific implementation can be found in the corresponding parts of the vehicle detection method examples above. For example, the first acquisition module 100, the second acquisition module 200, the estimation module 300, the first generation module 400, the correction module 500, and the detection module 600 are respectively used to implement steps S101, S102, S103, S104, S105, and S106 of the vehicle detection method, so their specific implementations may refer to the descriptions of the corresponding examples and are not repeated here.
When the device of this embodiment detects vehicles, a road layout aerial view of the current position of the autonomous vehicle and the weights of the image features in the image feature map belonging to road elements are obtained, and a corrected image for the feature depth map is generated from the road layout aerial view. The feature depth map is then corrected using the corrected image and the weights. Since the road layout aerial view includes road elements, the corrected image improves the feature conversion accuracy of non-vehicle areas in the feature depth map and reduces the influence of background information on vehicle detection, thereby improving the accuracy of the feature depth map, the accuracy of mapping image features to 3D space, and ultimately the accuracy of the vehicle detection result.
Optionally, the first generating module 400 is specifically configured to: generating a road element depth feature map and a road element mask map according to the road layout aerial view map;
accordingly, the correction module 500 is specifically configured to: and correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
Optionally, the correction module 500 is specifically configured to:
correcting the characteristic depth map according to a preset formula, wherein the preset formula is as follows:

D_m(u,v) = (1 - W_r(u,v)·M_r(u,v))·D_C(u,v) + W_r(u,v)·M_r(u,v)·D_r(u,v)

where D_m(u,v) is the depth feature of point (u,v) in the corrected feature depth map, W_r(u,v) is the value corresponding to point (u,v) in the road element weight map corresponding to the image feature map, M_r(u,v) is the value corresponding to point (u,v) in the weight map M_r corresponding to the road element mask map, D_r(u,v) is the feature corresponding to point (u,v) of the road element depth feature map, and D_C(u,v) is the feature corresponding to point (u,v) of the feature depth map.
Optionally, the first generating module 400 includes:
the quantization sub-module is used for quantizing the road layout aerial view to obtain a road information map;
the determining submodule is used for determining coordinates of points corresponding to all the road elements in the road information graph in a world coordinate system to form a road element point set;
The first conversion sub-module is used for converting each coordinate in the road element point set into an image coordinate system to obtain a road element depth map and the road element mask map;
and the obtaining submodule is used for obtaining the road element depth feature map according to the road element depth map.
Optionally, the first conversion sub-module is specifically configured to convert each coordinate in the road element point set into the image coordinate system through the image acquisition device parameters, the rotation-translation relation between the image acquisition device and the autonomous vehicle, and the rotation-translation relation between the pose of the autonomous vehicle and the world coordinate system.
Optionally, when converting each coordinate in the road element point set into the image coordinate system, the first conversion sub-module specifically performs the conversion by matrix operation.
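As an illustration of such a matrix-based conversion, the following sketch chains the calibration matrices into a standard pinhole projection; the matrix names, shapes, and coordinate conventions are assumptions for the example.

```python
import numpy as np

def world_to_image(points_w, K, T_cam_from_veh, T_veh_from_world):
    """Project world-frame road element points into the image plane.

    points_w: (N, 3) points in the world coordinate system.
    K: (3, 3) camera intrinsic matrix (image acquisition device parameters).
    T_cam_from_veh, T_veh_from_world: (4, 4) homogeneous rotation-translation
    matrices (camera<-vehicle and vehicle<-world), assumed known from
    calibration and the vehicle pose.
    Returns (N, 2) pixel coordinates (u, v) and (N,) camera-frame depths.
    """
    N = points_w.shape[0]
    pts_h = np.hstack([points_w, np.ones((N, 1))])               # homogeneous coords
    pts_cam = (T_cam_from_veh @ T_veh_from_world @ pts_h.T)[:3]  # camera frame
    depth = pts_cam[2]                                           # Z in camera frame
    uv = (K @ pts_cam)[:2] / depth                               # perspective divide
    return uv.T, depth
```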
Optionally, the obtaining submodule includes:
the expansion unit is used for expanding each element value in the road element depth map into a D-dimensional vector; the element values are depth values corresponding to points in the road element depth map in an image acquisition equipment coordinate system;
And the normalization unit is used for normalizing the D-dimensional vector so that the sum of all values in the D-dimensional vector is equal to 1.
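A sketch of this expansion follows, assuming a one-hot encoding over uniformly spaced depth bins, which is one simple scheme that satisfies the sum-to-1 normalization; the bin range and bin count are illustrative assumptions.

```python
import numpy as np

def depth_to_distribution(depth_map, d_min=1.0, d_max=60.0, D=48):
    """Expand each depth value in the road element depth map into a
    D-dimensional vector over discrete depth bins, normalized to sum to 1.
    depth_map: (H, W) depths in the image acquisition device coordinate system.
    """
    bins = np.linspace(d_min, d_max, D)
    idx = np.abs(depth_map[..., None] - bins).argmin(axis=-1)  # nearest bin index
    dist = np.zeros(depth_map.shape + (D,))
    np.put_along_axis(dist, idx[..., None], 1.0, axis=-1)      # one-hot: sums to 1
    return dist
```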
Optionally, the quantization submodule includes:
a screening unit, configured to screen points in the road layout aerial view that are located in the road element range and points that are not located in the road element range;
and the setting unit is used for setting the value of a point in the road element range to be 1 and setting the value of a point which is not in the road element range to be 0, so as to obtain a road information graph.
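A minimal sketch of this quantization, assuming the aerial view has been rasterized into a grid whose cell values indicate road membership (the threshold is an illustrative assumption):

```python
import numpy as np

def quantize_road_layout(bev_map, road_threshold=0.5):
    """Quantize the road layout bird's eye view into a binary road information
    map: points within a road element range become 1, all other points 0."""
    return (bev_map >= road_threshold).astype(np.uint8)

road_info = quantize_road_layout(np.array([[0.9, 0.1], [0.7, 0.0]]))  # [[1 0] [1 0]]
```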
Optionally, when the road element is a road surface, the determining submodule is specifically configured to determine the X, Y, and Z values, in the world coordinate system, of the points corresponding to all road surfaces in the road information map, where the Z value is equal to 0.
Optionally, the vehicle detection device further includes:
the feature enhancement module is used for enhancing the image features in the image feature map to obtain an enhanced image feature map;
accordingly, the detection module 600 is specifically configured to perform vehicle detection according to the corrected feature depth map and the enhanced image feature map.
Optionally, the detection module 600 includes:
the generation submodule is used for generating a 3D feature map under an image coordinate system from the corrected feature depth map and the image feature map;
A second conversion sub-module for converting the 3D feature map to a BEV feature map;
and the detection sub-module is used for detecting the vehicle by utilizing the BEV characteristic diagram.
Optionally, the second conversion submodule includes:
the conversion unit is used for converting the characteristics of each point in the 3D characteristic map into a vehicle coordinate system through the parameters of the image acquisition equipment and the rotation translation relation between the image acquisition equipment and the vehicle;
the voxelization unit is used for voxelization of points in the vehicle coordinate system, accumulating the characteristic points falling into the same voxel grid, setting the characteristic of the voxel grid without the characteristic point to be 0, and forming a 3D characteristic map in the vehicle coordinate system;
and the accumulation unit is used for accumulating the characteristics corresponding to the voxel grids at all heights in the height dimension to obtain the BEV characteristic diagram.
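The following sketch illustrates this voxelization and height-dimension accumulation with numpy; the grid definition and feature layout are assumptions for the example.

```python
import numpy as np

def features_to_bev(points_veh, feats, grid_min, voxel_size, grid_shape):
    """Voxelize vehicle-frame feature points and accumulate over height.

    points_veh: (N, 3) positions of 3D feature points in the vehicle frame.
    feats: (N, C) features attached to those points.
    grid_min, voxel_size, grid_shape: the (x, y, z) voxel grid definition.
    Features falling into the same voxel grid are accumulated, voxels with
    no feature point stay 0, and summing over the height dimension yields
    the BEV feature map.
    """
    idx = np.floor((points_veh - grid_min) / voxel_size).astype(int)
    X, Y, Z = grid_shape
    valid = np.all((idx >= 0) & (idx < np.array([X, Y, Z])), axis=1)
    idx, feats = idx[valid], feats[valid]
    voxels = np.zeros((X, Y, Z, feats.shape[1]))
    np.add.at(voxels, (idx[:, 0], idx[:, 1], idx[:, 2]), feats)  # per-voxel sums
    return voxels.sum(axis=2)                                    # (X, Y, C) BEV map
```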
Optionally, the vehicle detection device further includes:
the third acquisition module is used for acquiring the vehicle external environment image acquired by the vehicle-mounted image acquisition equipment;
the feature extraction module is used for extracting image features of the vehicle external environment image to obtain the image feature map, wherein the image feature map comprises the image features.
Optionally, the first obtaining module 100 is specifically configured to input the image feature map into a first preset neural network model for depth estimation, so as to obtain the feature depth map.
Optionally, the estimation module 300 is specifically configured to input the image feature into a second preset neural network model, so as to obtain a weight of the road element to which the image feature belongs.
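As one illustrative form such a second preset neural network model could take, the sketch below uses a small convolutional head with a sigmoid output so each pixel's road element weight lies in [0, 1]; the layer sizes are assumptions, not the disclosed design.

```python
import torch.nn as nn

class RoadWeightHead(nn.Module):
    """Maps an image feature map to a per-pixel weight of belonging to a
    road element. Channel counts are illustrative assumptions."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
            nn.Sigmoid())                 # keeps weights in [0, 1]

    def forward(self, feat):              # feat: (B, C, H, W) image feature map
        return self.net(feat)             # (B, 1, H, W) road element weight map
```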
Optionally, the vehicle detection device further includes:
a fourth acquisition module configured to acquire the vehicle external environment image; the number of the external environment images of the vehicle is at least two, and each external environment image of the vehicle is acquired by vehicle-mounted image acquisition equipment at different positions;
accordingly, the correction module 500 is specifically configured to correct each feature depth map by using the corrected image and the weights, to obtain a plurality of corrected feature depth maps;
the detection module 600 includes:
the detection sub-module is used for performing vehicle detection on the exterior of the autonomous vehicle according to each corrected feature depth map and its corresponding image feature map, to obtain 3D vehicle detection results to be combined;
and the combination sub-module is used for combining all the 3D vehicle detection results to be combined and outputting the 3D vehicle detection results.
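A minimal sketch of such a combination step, assuming per-camera detections arrive as lists of records and that an optional cross-camera non-maximum suppression hook handles vehicles seen by more than one camera:

```python
def merge_3d_detections(per_camera_results, nms_fn=None):
    """Combine 3D vehicle detection results from all cameras into one output.

    per_camera_results: a list with one list of detections per camera (the
    detection record format is an assumption). Results are concatenated; the
    optional NMS hook can drop duplicates where fields of view overlap.
    """
    merged = [det for cam_dets in per_camera_results for det in cam_dets]
    return nms_fn(merged) if nms_fn is not None else merged
```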
The following describes a detector provided in an embodiment of the present application, and the detector described below and the vehicle detection method described above may be referred to correspondingly to each other.
Fig. 6 is a block diagram of a detector according to an embodiment of the present application, where the detector may include:
a memory 11 for storing a computer program;
the processor 12 is configured to implement the steps of the vehicle detection method according to any one of the above embodiments and the steps of the vehicle detection model training method according to any one of the above embodiments when executing the computer program.
When the detector of this embodiment performs vehicle detection, it obtains a road layout aerial view of the current position of the autonomous vehicle and the weights with which the image features in the image feature map belong to road elements, and uses the road layout aerial view to generate a corrected image for the feature depth map. The feature depth map is then corrected with the corrected image and the weights. Because the road layout aerial view contains road elements, the corrected image improves the feature conversion precision of non-vehicle areas in the feature depth map and reduces the influence of background information on vehicle detection. This improves the accuracy of the feature depth map and, in turn, the accuracy of mapping image features into 3D space, so the vehicle detection result is more accurate.
The following describes a vehicle detection system provided in an embodiment of the present application, and the vehicle detection system described below and the vehicle detection method described above may be referred to correspondingly.
Referring to fig. 7, the present application further provides a vehicle detection system, including:
the detector 1 described in the above embodiment;
an image acquisition device 2 connected to the detector.
The image acquisition device is mounted on the autonomous vehicle for acquiring an image of a vehicle exterior environment outside the autonomous vehicle. The image capturing device may be a camera, a video camera, or the like.
It should be noted that the number of image capturing devices is not limited in this application; there may be one, or two or more. When there are two or more image capturing devices, they are set at different positions of the autonomous vehicle.
When the vehicle detection system of this embodiment performs vehicle detection, it obtains a road layout aerial view of the current position of the autonomous vehicle and the weights with which the image features in the image feature map belong to road elements, and uses the road layout aerial view to generate a corrected image for the feature depth map. The feature depth map is then corrected with the corrected image and the weights. Because the road layout aerial view contains road elements, the corrected image improves the feature conversion precision of non-vehicle areas in the feature depth map and reduces the influence of background information on vehicle detection. This improves the accuracy of the feature depth map and, in turn, the accuracy of mapping image features into 3D space, so the vehicle detection result is more accurate.
The present application further provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the steps of the vehicle detection method according to any one of the above embodiments, and the steps of the vehicle detection model training method according to any one of the above embodiments.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and reference may be made to the description of the method section for the relevant points.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The vehicle detection method, apparatus, detector, system, model training method, and computer readable storage medium provided by the present application are described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application; the description of the above examples is only intended to assist in understanding the method of the present application and its core ideas. It should be noted that those skilled in the art can make various improvements and modifications to the present application without departing from the principles of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.

Claims (22)

1. A vehicle detection method, characterized by comprising:
acquiring a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle;
Acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
estimating the weight of the image feature belonging to the road element in the image feature map;
generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map;
according to the corrected feature depth map and the image feature map, detecting the exterior of the automatic driving vehicle, and outputting a 3D vehicle detection result;
generating a corrected image of the road element for the feature depth map according to the road layout aerial view comprises:
generating a road element depth feature map and a road element mask map according to the road layout aerial view map;
correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
and correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
2. The vehicle detection method of claim 1, wherein correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map, and the weights to obtain a corrected feature depth map comprises:
correcting the characteristic depth map according to a preset formula, wherein the preset formula is as follows:
$$D'(u,v) = W(u,v)\,M_r(u,v)\,D_r(u,v) + \bigl(1 - W(u,v)\,M_r(u,v)\bigr)\,D(u,v)$$

wherein $D'(u,v)$ is the depth feature of point $(u,v)$ in the corrected feature depth map; $W(u,v)$ is the value at point $(u,v)$ of the road element weight map corresponding to the image feature map; $M_r(u,v)$ is the value at point $(u,v)$ of the weight map $M_r$ corresponding to the road element mask map; $D_r(u,v)$ is the feature of the road element depth feature map at point $(u,v)$; and $D(u,v)$ is the feature of the feature depth map at point $(u,v)$.
3. The vehicle detection method according to claim 1, wherein generating a road element depth feature map and a road element mask map from the road layout bird's eye view map includes:
quantifying the road layout aerial view to obtain a road information map;
determining coordinates of points corresponding to all the road elements in the road information graph in a world coordinate system to form a road element point set;
Converting each coordinate in the road element point set into an image coordinate system to obtain a road element depth map and the road element mask map;
and obtaining the road element depth feature map according to the road element depth map.
4. The vehicle detection method of claim 3, wherein converting each of the coordinates in the set of road element points into an image coordinate system comprises:
and converting each coordinate in the road element point set into an image coordinate system through an image acquisition device parameter, a rotation translation relation between the image acquisition device and the automatic driving vehicle, and a rotation translation relation between the pose of the automatic driving vehicle and the world coordinate system.
5. The vehicle detection method of claim 4, wherein converting each of the coordinates in the set of road element points into an image coordinate system comprises:
and converting each coordinate in the road element point set into an image coordinate system by matrix operation.
6. The vehicle detection method of claim 3, wherein obtaining the road element depth feature map from the road element depth map comprises:
Expanding each element value in the road element depth map into a D-dimensional vector; the element values are depth values corresponding to points in the road element depth map in an image acquisition equipment coordinate system;
the D-dimensional vector is normalized such that the sum of all values in the D-dimensional vector is equal to 1.
7. The vehicle detection method according to claim 3, wherein quantifying the road layout bird's eye view map includes:
screening points in the road layout aerial view, which are located in the road element range, and points which are not located in the road element range;
setting the value of a point in the road element range to be 1, and setting the value of a point which is not in the road element range to be 0, thereby obtaining a road information graph.
8. The vehicle detection method according to claim 3, wherein when the road element is a road surface, determining coordinates in a world coordinate system of points corresponding to all the road elements in the road information map includes:
and determining X values, Y values and Z values of points corresponding to all road surfaces in the road information map in a world coordinate system, wherein the Z value is equal to 0.
9. The vehicle detection method according to claim 1, characterized by further comprising:
enhancing the image features in the image feature map to obtain an enhanced image feature map;
correspondingly, the vehicle detection according to the corrected feature depth map and the image feature map comprises the following steps:
and detecting the vehicle according to the corrected characteristic depth map and the enhanced image characteristic map.
10. The vehicle detection method according to claim 1, wherein performing vehicle detection based on the corrected feature depth map and the image feature map includes:
generating a 3D feature map under an image coordinate system by the corrected feature depth map and the image feature map;
converting the 3D feature map to a BEV feature map;
and detecting the vehicle by using the BEV characteristic diagram.
11. The vehicle detection method of claim 10, wherein converting the 3D signature to a BEV signature comprises:
converting the characteristics of each point in the 3D characteristic map into a vehicle coordinate system through the parameters of the image acquisition equipment and the rotation translation relation between the image acquisition equipment and the vehicle;
performing voxelization on the points in the vehicle coordinate system, accumulating the features of the feature points falling into the same voxel grid, setting the feature of any voxel grid into which no feature point falls to 0, and forming a 3D feature map in the vehicle coordinate system;
And accumulating the features corresponding to the voxel grids at all heights in the height dimension to obtain the BEV feature map.
12. The vehicle detection method according to claim 1, characterized by further comprising:
acquiring the vehicle external environment image acquired by vehicle-mounted image acquisition equipment;
and extracting image features of the vehicle external environment image to obtain the image feature map, wherein the image feature map comprises the image features.
13. The vehicle detection method according to claim 12, wherein acquiring a feature depth map corresponding to the image feature map includes:
and inputting the image feature map into a first preset neural network model to perform depth estimation, and obtaining the feature depth map.
14. The vehicle detection method according to claim 1, wherein estimating weights of image features belonging to the road elements in the image feature map includes:
and inputting the image features into a second preset neural network model to obtain the weights of the image features belonging to the road elements.
15. The vehicle detection method according to any one of claims 1 to 14, characterized by further comprising:
acquiring the external environment image of the vehicle; the number of the external environment images of the vehicle is at least two, and each external environment image of the vehicle is acquired by vehicle-mounted image acquisition equipment at different positions;
Correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
correcting each characteristic depth map by using the corrected image and the weight to obtain a plurality of corrected characteristic depth maps;
according to the corrected feature depth map and the image feature map, vehicle detection is carried out on the outside of the automatic driving vehicle, and the output of the 3D vehicle detection result comprises:
according to each corrected feature depth map and the image feature map corresponding to the corrected feature depth map, vehicle detection is carried out on the exterior of the automatic driving vehicle, and a 3D vehicle detection result to be combined is obtained;
and combining all the 3D vehicle detection results to be combined, and outputting a 3D vehicle detection result.
16. A vehicle detection apparatus, characterized by comprising:
the first acquisition module is used for acquiring a characteristic depth map corresponding to the image characteristic map; the image feature map corresponds to an external environment image of the vehicle;
the second acquisition module is used for acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
the estimating module is used for estimating the weight of the image feature belonging to the road element in the image feature map;
The first generation module is used for generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
the correction module is used for correcting the characteristic depth map by utilizing the corrected image and the weight to obtain a corrected characteristic depth map;
the detection module is used for detecting the vehicle outside the automatic driving vehicle according to the corrected feature depth map and the image feature map and outputting a 3D vehicle detection result;
the first generation module is specifically configured to: generating a road element depth feature map and a road element mask map according to the road layout aerial view map;
correspondingly, the correction module is specifically configured to: and correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
17. A vehicle detection model training method, wherein the vehicle detection model comprises a first preset neural network model and a second preset neural network model, the method comprising:
inputting an image feature map into the first preset neural network model for depth estimation to obtain a feature depth map corresponding to the image feature map; the image feature map corresponds to an external environment image of the vehicle;
Acquiring a road layout aerial view of the current position of the automatic driving vehicle; the road layout aerial view comprises road elements;
inputting the image features in the image feature map into the second preset neural network model, and estimating the weights of the image features belonging to the road elements;
generating a corrected image of the road element for the characteristic depth map according to the road layout aerial view;
correcting the characteristic depth map by using the corrected image and the weight to obtain a corrected characteristic depth map;
according to the corrected feature depth map and the image feature map, detecting the exterior of the automatic driving vehicle, and outputting a 3D vehicle detection result;
training the vehicle detection model based on a loss function to obtain a trained vehicle detection model;
generating a corrected image of the road element for the feature depth map according to the road layout aerial view comprises:
generating a road element depth feature map and a road element mask map according to the road layout aerial view map;
correspondingly, correcting the feature depth map by using the corrected image and the weight, and obtaining the corrected feature depth map includes:
And correcting the feature depth map through weighted summation and masking by using the road element depth feature map, the road element mask map and the weights to obtain a corrected feature depth map.
18. The vehicle detection model training method of claim 17, further comprising:
setting the region with the road element labeling error in the road element mask map to 0, and generating a corrected mask map;
training the second preset neural network model by using the corrected mask map.
19. The vehicle detection model training method according to claim 17 or 18, characterized in that training the vehicle detection model based on a loss function includes:
based on

$$L = \lambda_1 L_{\mathrm{depth}} + \lambda_2 L_{\mathrm{cls}} + \lambda_3 L_{\mathrm{box}} + \lambda_4 L_{\mathrm{road}}$$

training the vehicle detection model;

in the formula, $L_{\mathrm{depth}}$ is the image feature depth estimation loss; $L_{\mathrm{cls}}$ is the vehicle classification loss; $L_{\mathrm{box}}$ is the vehicle position, length-width-height, and speed estimation loss; $L_{\mathrm{road}}$ is the training loss of the road weight corresponding to the image feature map; $\lambda_1$ is the weight of the image feature depth estimation loss; $\lambda_2$ is the weight of the vehicle classification loss; $\lambda_3$ is the weight of the vehicle position, length-width-height, and speed estimation loss; and $\lambda_4$ is the weight of the road weight training loss.
20. A detector, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the vehicle detection method according to any one of claims 1 to 15 and the steps of the vehicle detection model training method according to any one of claims 17 to 19 when executing the computer program.
21. A vehicle detection system, characterized by comprising:
the detector of claim 20;
and the image acquisition device is connected with the detector.
22. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the vehicle detection method according to any one of claims 1 to 15 and the steps of the vehicle detection model training method according to any one of claims 17 to 19.
CN202310318133.2A 2023-03-29 2023-03-29 Vehicle detection method, device, detector, system and model training method Active CN116012806B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310318133.2A CN116012806B (en) 2023-03-29 2023-03-29 Vehicle detection method, device, detector, system and model training method

Publications (2)

Publication Number Publication Date
CN116012806A CN116012806A (en) 2023-04-25
CN116012806B true CN116012806B (en) 2023-06-13

Family

ID=86035830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310318133.2A Active CN116012806B (en) 2023-03-29 2023-03-29 Vehicle detection method, device, detector, system and model training method

Country Status (1)

Country Link
CN (1) CN116012806B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180038475A (en) * 2015-08-03 2018-04-16 톰톰 글로벌 콘텐트 비.브이. METHODS AND SYSTEMS FOR GENERATING AND USING POSITIONING REFERENCE DATA
DE102019106625A1 (en) * 2019-03-15 2020-09-17 HELLA GmbH & Co. KGaA Method and device for determining a source of danger on a roadway
CN113936139A (en) * 2021-10-29 2022-01-14 江苏大学 Scene aerial view reconstruction method and system combining visual depth information and semantic segmentation
CN114998856B (en) * 2022-06-17 2023-08-08 苏州浪潮智能科技有限公司 3D target detection method, device, equipment and medium for multi-camera image
CN115578702B (en) * 2022-09-26 2023-12-05 北京百度网讯科技有限公司 Road element extraction method and device, electronic equipment, storage medium and vehicle
CN115797454B (en) * 2023-02-08 2023-06-02 深圳佑驾创新科技有限公司 Multi-camera fusion sensing method and device under bird's eye view angle

Also Published As

Publication number Publication date
CN116012806A (en) 2023-04-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant