CN116052121A - Multi-sensing target detection fusion method and device based on distance estimation - Google Patents
Multi-sensing target detection fusion method and device based on distance estimation Download PDFInfo
- Publication number
- CN116052121A CN116052121A CN202310042622.XA CN202310042622A CN116052121A CN 116052121 A CN116052121 A CN 116052121A CN 202310042622 A CN202310042622 A CN 202310042622A CN 116052121 A CN116052121 A CN 116052121A
- Authority
- CN
- China
- Prior art keywords
- target
- sensor
- point cloud
- cloud data
- coordinate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 74
- 238000007500 overflow downdraw method Methods 0.000 title claims abstract description 24
- 230000000007 visual effect Effects 0.000 claims abstract description 33
- 238000013528 artificial neural network Methods 0.000 claims abstract description 24
- 238000013507 mapping Methods 0.000 claims abstract description 9
- 238000012216 screening Methods 0.000 claims abstract description 9
- 230000004927 fusion Effects 0.000 claims description 16
- 239000011159 matrix material Substances 0.000 claims description 11
- 230000009466 transformation Effects 0.000 claims description 11
- 238000000034 method Methods 0.000 claims description 8
- 101100001674 Emericella variicolor andI gene Proteins 0.000 claims description 6
- 238000004364 calculation method Methods 0.000 claims description 6
- 238000010586 diagram Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 2
- 238000003062 neural network model Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/86—Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Electromagnetism (AREA)
- Length Measuring Devices By Optical Means (AREA)
Abstract
The invention provides a multi-sensing target detection fusion method and device based on distance estimation, which are used for calibrating a first sensor and a second sensor respectively; predefining a target information summary table, wherein the target information summary table comprises the common physical sizes of all targets detected by the visual detection neural network; performing target detection on the visual image acquired by the first sensor to acquire two-dimensional information of a target, and estimating a first coordinate of the target; acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time; and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data. According to the multi-sensor target detection fusion method and device based on distance estimation, the target information is detected more accurately by fusing the detection results of the first sensor and the second sensor.
Description
Technical Field
The embodiment of the invention relates to the technical field of automatic driving, in particular to a multi-sensing target detection fusion method and device based on distance estimation.
Background
In the field of autopilot, it is often necessary to use a variety of sensors for sensing to ensure the safety and reliability of autopilot. For example, at least one monocular camera and one radar are used, the neural network is used for carrying out target recognition on the images, the Lei Dadian cloud is clustered, and finally, the targets detected by various sensors are fused and tracked.
However, since the monocular vision image has no depth information, a common fusion processing method may be to add depth information to the neural network, but this requires retraining the neural network, which is time-consuming. The method can also be used for projecting the radar point cloud clustering result into a two-dimensional image space to find the coincidence region with the image detection target, but the method must ensure that the image and the radar point cloud detect the target simultaneously, and the common condition that partial shielding exists between the front and rear of the two targets is difficult to process.
Therefore, a new multi-sensing target detection fusion method and device are needed to effectively solve the above problems.
Disclosure of Invention
The invention provides a multi-sensor target detection fusion method and device based on distance estimation, which can be used for more accurately detecting target information by fusing detection results of a first sensor and a second sensor.
The embodiment of the invention provides a multi-sensing target detection fusion method based on distance estimation, which comprises the following steps:
calibrating a first sensor and a second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor;
predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
Preferably, the first sensor is at least one monocular camera, and the second sensor is at least one lidar.
Preferably, the target information summary table includes a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the estimating the distance between the targets according to the size of the bounding box is specifically calculated by the following formula:
wherein , andIs an internal reference of the first sensor,/-A>For the first width of the target in the corresponding target information sub-table,/for the target>For the second width of the target in the corresponding target information sub-table,/for the target>Storing the first width, the second width and the height in the target information sub-table for the height of the target in the corresponding target information sub-table,/a> andThe pixel width and height of the target respectively, obtaining the pixel width and height of the object according to the size of the boundary box of the object,/->Is a preliminary estimated distance of the target based on the first width,/i>Is a preliminary estimated distance of the target based on the second width,/i>Is a preliminary estimated distance of the target based on the height.
Preferably, will respectively、The value of (2) is compared with 1 if +.>The final estimated distance of the target is +.>If->The final estimated distance of the target isThe first ratio->Said second ratio->The third ratio->Specifically, the calculation is performed by the following formula:
Preferably, the estimating the first coordinate of the target through the estimated distance of the target is specifically calculated by the following formula:
wherein ,for the three-dimensional Cartesian coordinates of the object relative to the first sensor +.>Indicating the front-back direction, +.>Indicating the left and right direction>Indicating up-down direction, ++> andIs the center point coordinates of the target.
Preferably, willConverting into coordinates of the target relative to the second sensor>Coordinates of the second sensor +.>Specifically, the calculation is performed by the following formula:
Preferably, the predicting the point cloud data acquired by the second sensor is specifically calculated by the following formula:
wherein ,for point cloud data of the target time, +.> andFor the second sensor's two consecutive point cloud data before the target time,/o>For the point cloud data acquired at the moment on the second sensor, +.>Point cloud data acquired at a later time for the second sensor, +.>For the current moment +.>For the last moment,/->To be the last moment.
Preferably, the point cloud data conforming to the first coordinate includes a tolerance, specifically determined by the following formula:
Preferably, the obtaining the second coordinate of the target based on the second sensor includes calculating an average value of the point cloud data conforming to the first coordinate, obtaining a three-dimensional center point coordinate of the target, and calculating a real physical size of the target; if the point number of the point cloud data conforming to the first coordinate is less than a first preset value, usingThe true physical dimensions of the target are calculated.
The embodiment of the invention also provides a multi-sensing target detection fusion device based on distance estimation, which comprises the following steps:
the sensor calibration module is used for calibrating a first sensor and a second sensor respectively, obtaining a transformation matrix between the first sensor and the second sensor and obtaining an internal reference of the first sensor;
a target information summary predefined module for predefining a target information summary comprising a common physical size of all targets detected by the visual detection neural network;
the first sensor data acquisition module is used for carrying out target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated according to the estimated distance of the target;
the second sensor data acquisition module is used for acquiring continuous twice point cloud data of the second sensor before the target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
the fusion module is used for mapping the point cloud data of the target time to the two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which are mapped into the boundary frame and accord with the first coordinates.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
according to the multi-sensing target detection fusion method and device based on distance estimation, the target information summary table is predefined, and the target information summary table comprises the common physical dimensions of all targets detected by the visual detection neural network; performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target; acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time; mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate; by fusing the data acquired by the first sensor and the second sensor, the target information is detected more accurately, and particularly under the condition that partial shielding exists between targets, an accurate detection result can be obtained;
further, the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, the first width, the second width and the height of the target in the corresponding target information sub-table are obtained, and the first width, the second width and the height are stored in the target information sub-table, so that the size of the target is updated in real time, and the target information is detected more accurately;
further, the obtaining the target based on the second coordinate of the second sensor includes calculating an average value of the point cloud data conforming to the first coordinate, obtaining a three-dimensional center point coordinate of the target, and calculating a real physical size of the target; if the point number of the point cloud data conforming to the first coordinate is less than a first preset value, usingThe true physical size of the target is calculated, thereby reducing the effect of noise.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the prior art, a brief description of the drawings is provided below, wherein it is apparent that the drawings in the following description are some, but not all, embodiments of the present invention. Other figures may be derived from these figures without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the present invention;
FIG. 2 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to another embodiment of the present invention;
FIG. 3 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to an embodiment of the present invention;
fig. 4 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to another embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Based on the problems existing in the prior art, the embodiment of the invention provides a multi-sensor target detection fusion method and device based on distance estimation, and the target information is detected more accurately by fusing the detection results of a first sensor and a second sensor.
Fig. 1 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the invention. Referring now to fig. 1, an embodiment of the present invention provides a multi-sensor target detection fusion method based on distance estimation, including:
step S101: and calibrating the first sensor and the second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor.
In some embodiments, the first sensor is at least one monocular camera and the second sensor is at least one lidar.
Step S102: a summary of target information is predefined, the summary of target information comprising the usual physical dimensions of all targets detected by the visual detection neural network.
In practice, all targets differ according to neural network model, some neural network models being able to detect 20 objects, some 80 or more. These targets are all trained according to different needs, and in autopilot targets typically include: people, cars, trucks, buses, bicycles, electric vehicles, traffic lights, traffic signs, and the like.
The usual physical dimensions of all objects are predefined differently depending on the object type, e.g. the object is a person, the predefined height is 1.75 meters, the first width is 0.6 meters, the second width is 0.3 meters, the first width represents the front width of the person and the second width represents the side width of the person.
Step S103: performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target.
In some embodiments, a target is obtained, for example, from a visual image, the target having a pixel size of 100 x 30 in the image. If the object type is a person, the possible distance is 3 meters, if the object type is an automobile, the distance may be 9 meters, so that the exact estimated distance of the object needs to be obtained in combination with the size of the object type and the bounding box.
In some embodiments, the target information summary table includes a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the estimating the distance between the targets according to the size of the bounding box is specifically calculated by the following formula:
wherein , andIs an internal reference of the first sensor,/-A>For the first width of the target in the corresponding target information sub-table,/for the target>For a second width of the target in the corresponding target information summary table,/for the target>Storing the first width, the second width and the height in the target information sub-table for the height of the target in the corresponding target information summary table, andThe pixel width and the height of the target are respectively obtained according to the size of the boundary frame of the targetHeight (I) of (II)>Is a preliminary estimated distance of the target based on the first width,/i>Is a preliminary estimated distance of the target based on the second width,/i>Is a preliminary estimated distance of the target based on the height.
The predefined target information summary is a relatively coarse table defining the coarse size of all targets. For example, the target type is a person, and the height in the predefined target information summary table is defined as 1.7 meters.
The predefined target information summary table comprises target information sub-tables, which are more accurate tables, and each target corresponds to one target information sub-table. The initial information of the target information sub-table is obtained from a predefined target information summary table. For example, the target type is a person, the initial information of the target information sub-table is obtained from a predefined target information summary table, the height of the target is 1.7 m, the height of the target obtained through the second sensor data fusion is 1.9 m, and the height of the target is updated to 1.9 m in the target information sub-table, so that the target is more accurate in the subsequent detection and tracking processes.
After a period of detection, for example, the predefined target information summary and target information sub-table may be information comprising: in a predefined target information summary table, the target type is a person, and the height is defined as 1.7 meters; the target type is an automobile and the height is defined as 1.6 meters. In the first target information sub-table, the target type is a person, and the height is updated to be 1.9 meters. In the second target information sub-table, the target type is a person, and the height is updated to be 1.75 meters. In the third target information sub-table, the target type is an automobile, and the height is updated to be 1.7 meters. In some embodiments, each will、The value of (2) is compared with 1 if +.>The final estimated distance of the target is +.>If->The final estimated distance of the target is +.>The first ratio->Said second ratio->The third ratio->Specifically, the calculation is performed by the following formula:
In some embodiments, the estimating the first coordinate of the target through the estimated distance of the target is specifically calculated by the following formula:
wherein ,for the three-dimensional Cartesian coordinates of the object relative to the first sensor +.>Indicating the front-back direction, +.>Indicating the left and right direction>Indicating up-down direction, ++> andIs the center point coordinates of the target.
In some embodiments, it willTranslating into coordinates of the target relative to the second sensorCoordinates of the second sensor +.>Specifically, the calculation is performed by the following formula:
Step S104: and acquiring continuous twice point cloud data of the second sensor before the target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time. In some embodiments, the predicting the point cloud data of the target time is specifically calculated by the following formula:
wherein ,for point cloud data of the target time, +.> andFor the second sensor's two consecutive point cloud data before the target time,/o>For the point cloud data acquired at the moment on the second sensor, +.>Point cloud data acquired at a later time for the second sensor, +.>For the current moment +.>For the last moment,/->To be the last moment.
In a specific implementation, the first sensor and the second sensor acquire data not in synchronization, for example, the moments when the first sensor and the second sensor acquire image data are as follows: at 100 milliseconds, the first sensor collects image data; at 120 milliseconds, the second sensor collects image data; at 150 milliseconds, the first sensor collects image data; at 170 ms, the second sensor collects image data; at 200 milliseconds, the first sensor acquires image data.
After detecting the image data result of the first sensor at the 200 ms moment, it is desirable to obtain the point cloud data of the second sensor at the 200 ms moment, but the 200 ms moment does not have real point cloud data, and the point cloud data of the second sensor at the 200 ms moment can be predicted according to the point cloud data of the second sensor at the 120 ms moment and the 170 ms moment.
In some embodiments, the point cloud data conforming to the first coordinate includes a tolerance, specifically determined by the following formula:
Step S105: and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
In some embodiments, the obtaining the second coordinates of the target based on the second sensor includes calculating an average of the point cloud data conforming to the first coordinates, obtaining three-dimensional center point coordinates of the target, and calculating a true physical size of the target; if the point number of the point cloud data conforming to the first coordinate is less than a first preset value, usingThe true physical dimensions of the target are calculated.
Fig. 2 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the invention. Referring now to fig. 2, an embodiment of the present invention provides a multi-sensor target detection fusion method based on distance estimation, including:
step S201: calibrating a first sensor and a second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor;
step S202: predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
step S203: performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
step S204: the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the first width, the second width and the height of each target are acquired in real time and stored in the target information sub-tables;
step S205: acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
step S206: and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
Fig. 3 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to an embodiment of the invention. Referring now to FIG. 3, one embodiment of the present invention provides a multi-sensor target detection fusion apparatus based on distance estimation, comprising:
the sensor calibration module 31 is configured to calibrate a first sensor and a second sensor respectively, obtain a transformation matrix between the first sensor and the second sensor, and obtain an internal reference of the first sensor;
a target information summary table predefined module 32 for predefining a target information summary table comprising the usual physical dimensions of all targets detected by the visual detection neural network;
a first sensor data acquisition module 33, configured to perform target detection on a visual image acquired by the first sensor through the visual detection neural network, to obtain two-dimensional information of the target, where the two-dimensional information of the target includes a target type and a bounding box, obtain an estimated distance of the target according to the target type and the size of the bounding box, and estimate a first coordinate of the target through the estimated distance of the target;
a second sensor data acquisition module 34, configured to acquire, according to the data acquisition time of the first sensor, two consecutive point cloud data of the second sensor before a target time, and predict the point cloud data of the target time;
the fusion module 35 is configured to map the point cloud data of the target time to a two-dimensional space of the target, screen out target point cloud data, and obtain a real physical size of the target according to the target point cloud data, where the target point cloud data includes the point cloud data mapped into the bounding box and conforming to the first coordinate.
Fig. 4 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to another embodiment of the present invention. Referring now to FIG. 4, one embodiment of the present invention provides a multi-sensor target detection fusion apparatus based on distance estimation, comprising:
a sensor calibration module 41, configured to calibrate a first sensor and a second sensor, respectively, obtain a transformation matrix between the first sensor and the second sensor, and obtain an internal reference of the first sensor;
a target information summary table predefining module 42 for predefining a target information summary table including a usual physical size of all targets detected by the visual detection neural network;
the target information sub-table module 43 is configured to acquire, in real time, a first width, a second width, and a height of each target, and store the first width, the second width, and the height in the target information sub-table, where each target corresponds to one target information sub-table;
a first sensor data acquisition module 44, configured to perform target detection on a visual image acquired by the first sensor through the visual detection neural network, to obtain two-dimensional information of the target, where the two-dimensional information of the target includes a target type and a bounding box, obtain an estimated distance of the target according to the target type and the size of the bounding box, and estimate a first coordinate of the target through the estimated distance of the target;
a second sensor data acquisition module 45, configured to acquire, according to the data acquisition time of the first sensor, two consecutive point cloud data of the second sensor before a target time, and predict point cloud data of the target time;
the fusion module 46 is configured to map the point cloud data of the target time to a two-dimensional space of the target, screen out target point cloud data, and obtain a real physical size of the target according to the target point cloud data, where the target point cloud data includes the point cloud data mapped into the bounding box and conforming to the first coordinate. After obtaining the real three-dimensional coordinates of the target, the fusion module 46 can feed back the coordinate information to the target information sub-table module 43, so that the target information sub-table can acquire more accurate target information, and the target information can be detected more accurately.
In summary, according to the multi-sensor target detection fusion method and device based on distance estimation provided by the embodiment of the invention, the target information summary table is predefined, and the target information summary table comprises the common physical dimensions of all targets detected by the visual detection neural network; performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target; acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time; mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate; the data acquired by the first sensor and the second sensor are fused, so that the target information is detected more accurately, and an accurate detection result can be obtained particularly under the condition that partial shielding exists between targets;
further, the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, the first width, the second width and the height of the target in the corresponding target information sub-table are obtained, and the first width, the second width and the height are stored in the target information sub-table, so that the size of the target is updated in real time, and the target information is detected more accurately;
further, the obtaining the target based on the second coordinate of the second sensor includes calculating an average value of the point cloud data conforming to the first coordinate, obtaining a three-dimensional center point coordinate of the target, and calculating a real physical size of the target; if the point number of the point cloud data conforming to the first coordinate is less than a first preset value, usingThe true physical size of the target is calculated, thereby reducing the effect of noise. />
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (10)
1. A multi-sensing target detection fusion method based on distance estimation is characterized by comprising the following steps:
calibrating a first sensor and a second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor;
predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
2. The distance estimation-based multi-sensor target detection fusion method of claim 1, wherein the first sensor is at least one monocular camera and the second sensor is at least one lidar.
3. The multi-sensor target detection fusion method based on distance estimation according to claim 1, wherein the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the distance of the target estimated according to the size of the bounding box is specifically calculated by the following formula:
wherein , andIs an internal reference of the first sensor,/-A>For the first width of the target in the corresponding target information sub-table,/for the target>For the second width of the target in the corresponding target information sub-table,/for the target>Feeding back the first width, the second width and the height of the target in the corresponding target information sub-table to a predefined target information summary table for the height of the target in the corresponding target information sub-table, and +_> andThe pixel width and height of the target respectively, obtaining the pixel width and height of the object according to the size of the boundary box of the object,/->Is a preliminary estimated distance of the target based on the first width,/i>Is a preliminary estimated distance of the target based on the second width,/i>Is a preliminary estimated distance of the target based on the height.
4. The multi-sensor target detection fusion method based on distance estimation according to claim 3, wherein the following steps are respectively performed、The value of (2) is compared with 1 if +.>The final estimated distance of the target isIf->The final estimated distance of the target is +.>The first ratio->Said second ratio->The third ratio->Specifically, the calculation is performed by the following formula:
5. The method for multi-sensor target detection fusion based on distance estimation according to claim 4, wherein,
the first coordinate of the target is estimated through the estimated distance of the target, specifically calculated through the following formula:
6. The distance estimation-based multi-sensor target detection fusion method according to claim 5, whereinConverting into coordinates of the target relative to the second sensor>Coordinates of the second sensor +.>Specifically, the calculation is performed by the following formula:
7. The method for multi-sensor target detection fusion based on distance estimation according to claim 1, wherein,
the point cloud data of the target time is predicted, and is calculated specifically by the following formula:
wherein ,for the point of the target timeCloud data-> andFor the second sensor's two consecutive point cloud data before the target time,/o>For the point cloud data acquired at the moment on the second sensor, +.>Point cloud data acquired at a later time for the second sensor, +.>For the current moment +.>For the last moment,/->To be the last moment. />
9. The distance estimation based multisensor of claim 8The target detection fusion method is characterized in that the obtaining of the target based on the second coordinate of the second sensor comprises calculating an average value of the point cloud data conforming to the first coordinate, obtaining a three-dimensional center point coordinate of the target and calculating the real physical size of the target; if the point number of the point cloud data conforming to the first coordinate is less than a first preset value, usingThe true physical dimensions of the target are calculated.
10. A multi-sensor target detection fusion device based on distance estimation, comprising:
the sensor calibration module is used for calibrating a first sensor and a second sensor respectively, obtaining a transformation matrix between the first sensor and the second sensor and obtaining an internal reference of the first sensor;
a target information summary predefined module for predefining a target information summary comprising a common physical size of all targets detected by the visual detection neural network;
the first sensor data acquisition module is used for carrying out target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated according to the estimated distance of the target;
the second sensor data acquisition module is used for acquiring continuous twice point cloud data of the second sensor before the target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
the fusion module is used for mapping the point cloud data of the target time to the two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which are mapped into the boundary frame and accord with the first coordinates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310042622.XA CN116052121B (en) | 2023-01-28 | 2023-01-28 | Multi-sensing target detection fusion method and device based on distance estimation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310042622.XA CN116052121B (en) | 2023-01-28 | 2023-01-28 | Multi-sensing target detection fusion method and device based on distance estimation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116052121A true CN116052121A (en) | 2023-05-02 |
CN116052121B CN116052121B (en) | 2023-06-27 |
Family
ID=86123678
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310042622.XA Active CN116052121B (en) | 2023-01-28 | 2023-01-28 | Multi-sensing target detection fusion method and device based on distance estimation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116052121B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110244322A (en) * | 2019-06-28 | 2019-09-17 | 东南大学 | Pavement construction robot environment sensory perceptual system and method based on Multiple Source Sensor |
CN111464978A (en) * | 2019-01-22 | 2020-07-28 | 岳秀兰 | Vehicle remote driving system established by connection of primary wireless equipment and secondary wireless equipment through Internet of things |
CN112396650A (en) * | 2020-03-30 | 2021-02-23 | 青岛慧拓智能机器有限公司 | Target ranging system and method based on fusion of image and laser radar |
CN112652016A (en) * | 2020-12-30 | 2021-04-13 | 北京百度网讯科技有限公司 | Point cloud prediction model generation method, pose estimation method and device |
CN112733678A (en) * | 2020-12-31 | 2021-04-30 | 深兰人工智能(深圳)有限公司 | Ranging method, ranging device, computer equipment and storage medium |
CN113436258A (en) * | 2021-06-17 | 2021-09-24 | 中国船舶重工集团公司第七0七研究所九江分部 | Offshore pontoon detection method and system based on fusion of vision and laser radar |
US20210302534A1 (en) * | 2017-06-13 | 2021-09-30 | Veoneer Sweden Ab | Error estimation for a vehicle environment detection system |
CN114708585A (en) * | 2022-04-15 | 2022-07-05 | 电子科技大学 | Three-dimensional target detection method based on attention mechanism and integrating millimeter wave radar with vision |
CN115542312A (en) * | 2022-11-30 | 2022-12-30 | 苏州挚途科技有限公司 | Multi-sensor association method and device |
US20230014874A1 (en) * | 2020-10-22 | 2023-01-19 | Tencent Technology (Shenzhen) Company Limited | Obstacle detection method and apparatus, computer device, and storage medium |
-
2023
- 2023-01-28 CN CN202310042622.XA patent/CN116052121B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210302534A1 (en) * | 2017-06-13 | 2021-09-30 | Veoneer Sweden Ab | Error estimation for a vehicle environment detection system |
CN111464978A (en) * | 2019-01-22 | 2020-07-28 | 岳秀兰 | Vehicle remote driving system established by connection of primary wireless equipment and secondary wireless equipment through Internet of things |
CN110244322A (en) * | 2019-06-28 | 2019-09-17 | 东南大学 | Pavement construction robot environment sensory perceptual system and method based on Multiple Source Sensor |
CN112396650A (en) * | 2020-03-30 | 2021-02-23 | 青岛慧拓智能机器有限公司 | Target ranging system and method based on fusion of image and laser radar |
US20230014874A1 (en) * | 2020-10-22 | 2023-01-19 | Tencent Technology (Shenzhen) Company Limited | Obstacle detection method and apparatus, computer device, and storage medium |
CN112652016A (en) * | 2020-12-30 | 2021-04-13 | 北京百度网讯科技有限公司 | Point cloud prediction model generation method, pose estimation method and device |
CN112733678A (en) * | 2020-12-31 | 2021-04-30 | 深兰人工智能(深圳)有限公司 | Ranging method, ranging device, computer equipment and storage medium |
CN113436258A (en) * | 2021-06-17 | 2021-09-24 | 中国船舶重工集团公司第七0七研究所九江分部 | Offshore pontoon detection method and system based on fusion of vision and laser radar |
CN114708585A (en) * | 2022-04-15 | 2022-07-05 | 电子科技大学 | Three-dimensional target detection method based on attention mechanism and integrating millimeter wave radar with vision |
CN115542312A (en) * | 2022-11-30 | 2022-12-30 | 苏州挚途科技有限公司 | Multi-sensor association method and device |
Non-Patent Citations (2)
Title |
---|
BENEDIKT MERSCH等: "Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks", 《5TH CONFERENCE ON ROBOT LEARNING (CORL 2021)》, pages 1 - 11 * |
徐泽楷: "多模态传感器融合技术研究", 《中国优秀硕士论文电子期刊》, pages 27 - 47 * |
Also Published As
Publication number | Publication date |
---|---|
CN116052121B (en) | 2023-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10620000B2 (en) | Calibration apparatus, calibration method, and calibration program | |
CN110942449A (en) | Vehicle detection method based on laser and vision fusion | |
CN107038723B (en) | Method and system for estimating rod-shaped pixels | |
JP4943034B2 (en) | Stereo image processing device | |
WO2016117200A1 (en) | Outside environment recognition device for vehicles and vehicle behavior control device using same | |
KR20190102665A (en) | Calibration system and method using real-world object information | |
CN110738121A (en) | front vehicle detection method and detection system | |
CN114359181B (en) | Intelligent traffic target fusion detection method and system based on image and point cloud | |
CN112562093B (en) | Object detection method, electronic medium, and computer storage medium | |
CN113205604A (en) | Feasible region detection method based on camera and laser radar | |
CN111723778B (en) | Vehicle distance measuring system and method based on MobileNet-SSD | |
CN114495064A (en) | Monocular depth estimation-based vehicle surrounding obstacle early warning method | |
CN114463303B (en) | Road target detection method based on fusion of binocular camera and laser radar | |
US20190297314A1 (en) | Method and Apparatus for the Autocalibration of a Vehicle Camera System | |
US20240192316A1 (en) | Method for calibrating sensor information from a vehicle, and vehicle assistance system | |
Petrovai et al. | A stereovision based approach for detecting and tracking lane and forward obstacles on mobile devices | |
CN112130153A (en) | Method for realizing edge detection of unmanned vehicle based on millimeter wave radar and camera | |
CN113029185A (en) | Road marking change detection method and system in crowdsourcing type high-precision map updating | |
CN116486351A (en) | Driving early warning method, device, equipment and storage medium | |
CN111538008B (en) | Transformation matrix determining method, system and device | |
CN114359865A (en) | Obstacle detection method and related device | |
CN113869440A (en) | Image processing method, apparatus, device, medium, and program product | |
CN116052121B (en) | Multi-sensing target detection fusion method and device based on distance estimation | |
CN114152942B (en) | Millimeter wave radar and vision second-order fusion multi-classification target detection method | |
CN106874837B (en) | Vehicle detection method based on video image processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |