CN116052121A - Multi-sensing target detection fusion method and device based on distance estimation - Google Patents

Info

Publication number
CN116052121A
CN116052121A (Application CN202310042622.XA)
Authority
CN
China
Prior art keywords
target
sensor
point cloud
cloud data
coordinate
Prior art date
Legal status
Granted
Application number
CN202310042622.XA
Other languages
Chinese (zh)
Other versions
CN116052121B (en)
Inventor
孙坚伟
胡力
Current Assignee
Shanghai Core Computing Technology Co ltd
Original Assignee
Shanghai Core Computing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Core Computing Technology Co ltd filed Critical Shanghai Core Computing Technology Co ltd
Priority to CN202310042622.XA priority Critical patent/CN116052121B/en
Publication of CN116052121A publication Critical patent/CN116052121A/en
Application granted granted Critical
Publication of CN116052121B publication Critical patent/CN116052121B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G01S17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V10/803 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of input or preprocessed data
    • G06V10/82 Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T2207/10028 Range image; depth image; 3D point clouds
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; image merging
    • G06T2207/30252 Vehicle exterior; vicinity of vehicle
    • G06V2201/07 Target detection

Abstract

The invention provides a multi-sensing target detection fusion method and device based on distance estimation. A first sensor and a second sensor are calibrated respectively; a target information summary table is predefined, containing the usual physical sizes of all targets detectable by the visual detection neural network; target detection is performed on the visual image acquired by the first sensor to obtain two-dimensional information of a target, and a first coordinate of the target is estimated; according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time are obtained, and the point cloud data at the target time are predicted; the point cloud data at the target time are mapped into the two-dimensional space of the target, the target point cloud data are screened out, and the real physical size of the target is obtained from the target point cloud data. By fusing the detection results of the first sensor and the second sensor, the method and device detect target information more accurately.

Description

Multi-sensing target detection fusion method and device based on distance estimation
Technical Field
The embodiment of the invention relates to the technical field of automatic driving, in particular to a multi-sensing target detection fusion method and device based on distance estimation.
Background
In the field of automatic driving, multiple sensors are usually required for perception to ensure the safety and reliability of automatic driving. For example, at least one monocular camera and one radar are used: a neural network performs target recognition on the images, the radar point cloud is clustered, and finally the targets detected by the various sensors are fused and tracked.
However, since a monocular vision image has no depth information, one common fusion approach is to add depth estimation to the neural network, but this requires retraining the neural network, which is time-consuming. Another approach is to project the radar point cloud clustering result into the two-dimensional image space and find the region that coincides with the image detection target, but this requires the image and the radar point cloud to detect the target simultaneously, and it is difficult to handle the common situation in which one target partially occludes another in front of or behind it.
Therefore, a new multi-sensing target detection fusion method and device are needed to effectively solve the above problems.
Disclosure of Invention
The invention provides a multi-sensor target detection fusion method and device based on distance estimation, which can be used for more accurately detecting target information by fusing detection results of a first sensor and a second sensor.
The embodiment of the invention provides a multi-sensing target detection fusion method based on distance estimation, which comprises the following steps:
calibrating a first sensor and a second sensor respectively, to obtain a transformation matrix between the first sensor and the second sensor and to obtain the internal reference (intrinsic parameters) of the first sensor;
predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
acquiring, according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time, and predicting the point cloud data of the target time;
and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
Preferably, the first sensor is at least one monocular camera, and the second sensor is at least one lidar.
Preferably, the target information summary table includes a plurality of target information sub-tables, each target corresponding to one target information sub-table, and the distance of the target is estimated from the size of the bounding box according to the pinhole projection relation:

d1 = fx · W1 / w,  d2 = fx · W2 / w,  d3 = fy · H / h

wherein fx and fy are internal references of the first sensor; W1, W2 and H are the first width, the second width and the height of the target in the corresponding target information sub-table (the first width, the second width and the height are stored in the target information sub-table); w and h are the pixel width and height of the target, obtained from the size of the target's bounding box; and d1, d2 and d3 are the preliminary estimated distances of the target based on the first width, the second width and the height, respectively.
Preferably, the ratios are compared with 1, and the final estimated distance of the target is selected from the preliminary estimated distances according to the result of the comparison; the first ratio r1, the second ratio r2 and the third ratio r3 are calculated from the preliminary estimated distances and from the pixel width w and height h of the target.
Preferably, the first coordinate of the target is estimated from the estimated distance of the target: the three-dimensional Cartesian coordinates (x1, y1, z1) of the target relative to the first sensor are calculated from the final estimated distance, the center point coordinates (u, v) of the target's bounding box and the internal reference of the first sensor, wherein x1 indicates the front-back direction, y1 indicates the left-right direction and z1 indicates the up-down direction.
Preferably, the first coordinate (x1, y1, z1) is converted into the coordinate (x2, y2, z2) of the target relative to the second sensor by means of the transformation matrix between the first sensor and the second sensor obtained during calibration.
Preferably, the point cloud data at the target time is predicted from the two consecutive frames of point cloud data acquired by the second sensor before the target time: the point cloud acquired at the last acquisition moment t1 and the point cloud acquired at the acquisition moment t2 before that are extrapolated to the current (target) moment t.
Preferably, whether a point of the point cloud data conforms to the first coordinate is judged with an allowable error: a point is regarded as conforming to the first coordinate if its deviation from the first coordinate (converted into the second sensor's coordinate system) does not exceed the allowable error.
Preferably, obtaining the second coordinate of the target based on the second sensor includes calculating the average value of the point cloud data conforming to the first coordinate to obtain the three-dimensional center point coordinate of the target, and calculating the real physical size of the target; if the number of points in the point cloud data conforming to the first coordinate is less than a first preset value, the coordinate estimated from the first sensor is used to calculate the real physical size of the target.
The embodiment of the invention also provides a multi-sensing target detection fusion device based on distance estimation, which comprises the following steps:
the sensor calibration module is used for calibrating a first sensor and a second sensor respectively, obtaining a transformation matrix between the first sensor and the second sensor and obtaining an internal reference of the first sensor;
a target information summary predefined module for predefining a target information summary comprising a common physical size of all targets detected by the visual detection neural network;
the first sensor data acquisition module is used for carrying out target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated according to the estimated distance of the target;
the second sensor data acquisition module is used for acquiring, according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time, and predicting the point cloud data of the target time;
the fusion module is used for mapping the point cloud data of the target time to the two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which are mapped into the boundary frame and accord with the first coordinates.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
according to the multi-sensing target detection fusion method and device based on distance estimation, the target information summary table is predefined, and the target information summary table comprises the common physical dimensions of all targets detected by the visual detection neural network; performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target; acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time; mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate; by fusing the data acquired by the first sensor and the second sensor, the target information is detected more accurately, and particularly under the condition that partial shielding exists between targets, an accurate detection result can be obtained;
further, the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, the first width, the second width and the height of the target in the corresponding target information sub-table are obtained, and the first width, the second width and the height are stored in the target information sub-table, so that the size of the target is updated in real time, and the target information is detected more accurately;
further, obtaining the second coordinate of the target based on the second sensor includes calculating the average value of the point cloud data conforming to the first coordinate to obtain the three-dimensional center point coordinate of the target, and calculating the real physical size of the target; if the number of points in the point cloud data conforming to the first coordinate is less than a first preset value, the coordinate estimated from the first sensor is used to calculate the real physical size of the target, thereby reducing the effect of noise.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the prior art, a brief description of the drawings is provided below, wherein it is apparent that the drawings in the following description are some, but not all, embodiments of the present invention. Other figures may be derived from these figures without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the present invention;
FIG. 2 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to another embodiment of the present invention;
FIG. 3 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to an embodiment of the present invention;
fig. 4 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to another embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Based on the problems existing in the prior art, the embodiment of the invention provides a multi-sensor target detection fusion method and device based on distance estimation, and the target information is detected more accurately by fusing the detection results of a first sensor and a second sensor.
Fig. 1 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the invention. Referring now to fig. 1, an embodiment of the present invention provides a multi-sensor target detection fusion method based on distance estimation, including:
step S101: and calibrating the first sensor and the second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor.
In some embodiments, the first sensor is at least one monocular camera and the second sensor is at least one lidar.
Step S102: a summary of target information is predefined, the summary of target information comprising the usual physical dimensions of all targets detected by the visual detection neural network.
In practice, the set of detectable targets depends on the neural network model: some models can detect 20 object classes, others 80 or more. These targets are trained according to different needs; in automatic driving, the targets typically include people, cars, trucks, buses, bicycles, electric vehicles, traffic lights, traffic signs, and the like.
The usual physical dimensions are predefined per target type. For example, for a person the predefined height is 1.75 meters, the first width is 0.6 meters and the second width is 0.3 meters, where the first width represents the front width of the person and the second width represents the side width.
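For illustration only, the following Python sketch shows what such a predefined target information summary table and a per-target information sub-table could look like; apart from the person dimensions given above, all values and names are assumed placeholders rather than figures from this disclosure:

```python
# Hypothetical illustration of a predefined target information summary table.
# The person entry uses the dimensions given above; the remaining values are
# placeholder assumptions, not figures from the patent.
TARGET_INFO_SUMMARY = {
    # type: (first_width_m, second_width_m, height_m)
    "person":  (0.60, 0.30, 1.75),   # front width, side width, height
    "car":     (1.80, 4.50, 1.60),   # assumed typical values
    "truck":   (2.50, 8.00, 3.00),   # assumed typical values
    "bicycle": (0.60, 1.70, 1.10),   # assumed typical values
}

def make_sub_table(target_type: str) -> dict:
    """Initialise a per-target information sub-table from the summary table."""
    w1, w2, h = TARGET_INFO_SUMMARY[target_type]
    return {"type": target_type, "first_width": w1, "second_width": w2, "height": h}
```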
Step S103: performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target.
In some embodiments, a target is detected in the visual image with a pixel size of, for example, 100 × 30. If the target type is a person, its distance may be about 3 meters; if the target type is an automobile, the distance may be about 9 meters. The exact estimated distance of the target therefore has to be obtained by combining the target type with the size of the bounding box.
In some embodiments, the target information summary table includes a plurality of target information sub-tables, each target corresponding to one target information sub-table, and the distance of the target is estimated from the size of the bounding box according to the pinhole projection relation:

d1 = fx · W1 / w,  d2 = fx · W2 / w,  d3 = fy · H / h

wherein fx and fy are internal references of the first sensor; W1 is the first width of the target in the corresponding target information sub-table; W2 is the second width of the target in the corresponding target information sub-table; H is the height of the target in the corresponding target information sub-table (the first width, the second width and the height are stored in the target information sub-table); w and h are the pixel width and height of the target, obtained from the size of the target's bounding box; d1 is the preliminary estimated distance of the target based on the first width; d2 is the preliminary estimated distance based on the second width; and d3 is the preliminary estimated distance based on the height.
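The following Python sketch illustrates the preliminary distance computation under the pinhole-model reading described above; the function and argument names are illustrative, and the exact formula of the disclosure is given only in its figures:

```python
def estimate_distances(fx, fy, w1, w2, h_real, w_px, h_px):
    """Preliminary distance estimates from the pinhole projection model
    (a sketch of the relation described above)."""
    d1 = fx * w1 / w_px      # estimate based on the first (front) width
    d2 = fx * w2 / w_px      # estimate based on the second (side) width
    d3 = fy * h_real / h_px  # estimate based on the height
    return d1, d2, d3
```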
The predefined target information summary is a relatively coarse table defining the coarse size of all targets. For example, the target type is a person, and the height in the predefined target information summary table is defined as 1.7 meters.
The predefined target information summary table comprises target information sub-tables, which are more accurate tables, and each target corresponds to one target information sub-table. The initial information of the target information sub-table is obtained from a predefined target information summary table. For example, the target type is a person, the initial information of the target information sub-table is obtained from a predefined target information summary table, the height of the target is 1.7 m, the height of the target obtained through the second sensor data fusion is 1.9 m, and the height of the target is updated to 1.9 m in the target information sub-table, so that the target is more accurate in the subsequent detection and tracking processes.
After a period of detection, for example, the predefined target information summary table and the target information sub-tables may contain the following information: in the predefined target information summary table, the target type person has a defined height of 1.7 meters and the target type automobile has a defined height of 1.6 meters; in the first target information sub-table the target type is a person and the height has been updated to 1.9 meters; in the second target information sub-table the target type is a person and the height has been updated to 1.75 meters; in the third target information sub-table the target type is an automobile and the height has been updated to 1.7 meters.
In some embodiments, the ratios are compared with 1, and the final estimated distance of the target is selected from the preliminary estimated distances according to the result of the comparison; the first ratio r1, the second ratio r2 and the third ratio r3 are calculated from the preliminary estimated distances and from the pixel width w and height h of the target. In this way the method decides whether the target presents its front (first width) or its side (second width) to the camera, and the corresponding width-based distance estimate is used, as sketched below.
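The following Python sketch illustrates one plausible way to select between the width-based estimates; since the exact ratio formulas are given only in the figures of the disclosure, the selection rule shown here (comparing the pixel aspect ratio with the stored front and side aspect ratios) is an assumption for illustration:

```python
def select_distance(sub_table, w_px, h_px, d1, d2):
    """Choose between the width-based distance estimates.

    Assumed interpretation: decide whether the target shows its front or its
    side by comparing the pixel aspect ratio with the stored front/side
    aspect ratios, then pick the matching preliminary distance.
    """
    aspect_px = w_px / h_px
    aspect_front = sub_table["first_width"] / sub_table["height"]
    aspect_side = sub_table["second_width"] / sub_table["height"]
    if abs(aspect_px - aspect_front) <= abs(aspect_px - aspect_side):
        return d1  # target roughly facing the camera: use the first width
    return d2      # target seen from the side: use the second width
```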
In some embodiments, the first coordinate of the target is estimated from the estimated distance of the target: the three-dimensional Cartesian coordinates (x1, y1, z1) of the target relative to the first sensor are calculated from the final estimated distance, the center point coordinates (u, v) of the target's bounding box and the internal reference of the first sensor, wherein x1 indicates the front-back direction, y1 indicates the left-right direction and z1 indicates the up-down direction.
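The following Python sketch illustrates the back-projection under common pinhole conventions; the principal point (cx, cy) and the axis sign choices are assumptions, as the exact formula of the disclosure is given only in its figures:

```python
import numpy as np

def estimate_first_coordinate(d, u, v, fx, fy, cx, cy):
    """Back-project the bounding-box centre (u, v) at estimated distance d into
    camera-relative Cartesian coordinates (x: front-back, y: left-right,
    z: up-down). Principal point and sign conventions are assumed."""
    x = d                        # forward distance
    y = -(u - cx) * d / fx       # left-right (image right mapped to negative y)
    z = -(v - cy) * d / fy       # up-down (image down mapped to negative z)
    return np.array([x, y, z])
```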
In some embodiments, the first coordinate (x1, y1, z1) is converted into the coordinate (x2, y2, z2) of the target relative to the second sensor by means of the transformation matrix between the first sensor and the second sensor obtained during calibration.
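The following Python sketch illustrates the conversion, assuming the transformation matrix T is a 4×4 homogeneous transform from the first sensor's frame to the second sensor's frame:

```python
import numpy as np

def to_second_sensor(p_cam, T_cam_to_lidar):
    """Convert a camera-relative point to the second sensor's frame using the
    4x4 homogeneous transformation matrix obtained during calibration
    (assuming T maps first-sensor coordinates to second-sensor coordinates)."""
    p_h = np.append(p_cam, 1.0)            # homogeneous coordinates
    return (T_cam_to_lidar @ p_h)[:3]
```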
Step S104: according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time are obtained, and the point cloud data at the target time are predicted. In some embodiments, the point cloud of the target time is predicted from the point cloud acquired by the second sensor at the last acquisition moment t1 and the point cloud acquired at the acquisition moment t2 before that, by extrapolating them to the current (target) moment t.
In a specific implementation, the first sensor and the second sensor do not acquire data synchronously. For example, the acquisition moments may be: at 100 milliseconds the first sensor acquires image data; at 120 milliseconds the second sensor acquires point cloud data; at 150 milliseconds the first sensor acquires image data; at 170 milliseconds the second sensor acquires point cloud data; at 200 milliseconds the first sensor acquires image data.
After the image data of the first sensor at the 200 ms moment has been processed, the point cloud data of the second sensor at the 200 ms moment is needed; however, no real point cloud exists at 200 ms, so the point cloud at 200 ms is predicted from the point clouds acquired by the second sensor at 120 ms and 170 ms.
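The following Python sketch illustrates the prediction step under a linear-extrapolation reading of the above timing example; it assumes point-wise (or per-cluster) correspondence between the two earlier frames, which is an assumption for illustration:

```python
import numpy as np

def predict_point_cloud(p_prev, p_prev2, t_prev, t_prev2, t_target):
    """Linearly extrapolate two earlier point-cloud frames to the target time.
    Assumes the two frames have point-wise (or per-cluster) correspondence."""
    alpha = (t_target - t_prev) / (t_prev - t_prev2)
    return p_prev + (p_prev - p_prev2) * alpha

# Example matching the timing above: frames at 120 ms and 170 ms, image at 200 ms.
cloud_120 = np.array([[10.0, 0.5, 0.2]])
cloud_170 = np.array([[10.6, 0.5, 0.2]])
cloud_200 = predict_point_cloud(cloud_170, cloud_120, 170.0, 120.0, 200.0)
# -> [[10.96, 0.5, 0.2]]: the point keeps moving at its observed rate.
```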
In some embodiments, whether a point of the point cloud data conforms to the first coordinate is judged with an allowable error: a point is regarded as conforming to the first coordinate if its deviation from the first coordinate (converted into the second sensor's coordinate system) does not exceed the allowable error.
Step S105: and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
In some embodiments, obtaining the second coordinate of the target based on the second sensor includes calculating the average value of the point cloud data conforming to the first coordinate to obtain the three-dimensional center point coordinate of the target, and calculating the real physical size of the target; if the number of points in the point cloud data conforming to the first coordinate is less than a first preset value, the coordinate estimated from the first sensor is used to calculate the real physical size of the target.
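The following Python sketch illustrates the screening and fusion step of step S105; the projection helper, the distance test against the allowable error and the minimum point count are assumptions for illustration rather than the exact procedure of the disclosure:

```python
import numpy as np

def fuse_target(points_lidar, project_to_image, bbox, p_est, tol, min_points=5):
    """Sketch of the screening and fusion step (names, the projection helper and
    the thresholds are illustrative assumptions).

    points_lidar:     predicted point cloud at the target time, shape (N, 3)
    project_to_image: callable mapping lidar points to pixel coordinates (N, 2)
    bbox:             (u_min, v_min, u_max, v_max) of the detected target
    p_est:            first coordinate converted to the lidar frame, shape (3,)
    tol:              allowable error for conforming to the first coordinate
    """
    uv = project_to_image(points_lidar)
    in_box = ((uv[:, 0] >= bbox[0]) & (uv[:, 0] <= bbox[2]) &
              (uv[:, 1] >= bbox[1]) & (uv[:, 1] <= bbox[3]))
    near = np.linalg.norm(points_lidar - p_est, axis=1) <= tol
    target_points = points_lidar[in_box & near]

    if len(target_points) < min_points:
        # Too few lidar points (e.g. partial occlusion): fall back to the
        # camera-based estimate for the centre; size stays as predefined.
        return p_est, None

    centre = target_points.mean(axis=0)                           # 3D centre point
    size = target_points.max(axis=0) - target_points.min(axis=0)  # real extent
    return centre, size
```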
Fig. 2 is a flow chart of a multi-sensor target detection fusion method based on distance estimation according to an embodiment of the invention. Referring now to fig. 2, an embodiment of the present invention provides a multi-sensor target detection fusion method based on distance estimation, including:
step S201: calibrating a first sensor and a second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor;
step S202: predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
step S203: performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
step S204: the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the first width, the second width and the height of each target are acquired in real time and stored in the target information sub-tables;
step S205: acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time;
step S206: and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
Fig. 3 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to an embodiment of the invention. Referring now to FIG. 3, one embodiment of the present invention provides a multi-sensor target detection fusion apparatus based on distance estimation, comprising:
the sensor calibration module 31 is configured to calibrate a first sensor and a second sensor respectively, obtain a transformation matrix between the first sensor and the second sensor, and obtain an internal reference of the first sensor;
a target information summary table predefined module 32 for predefining a target information summary table comprising the usual physical dimensions of all targets detected by the visual detection neural network;
a first sensor data acquisition module 33, configured to perform target detection on a visual image acquired by the first sensor through the visual detection neural network, to obtain two-dimensional information of the target, where the two-dimensional information of the target includes a target type and a bounding box, obtain an estimated distance of the target according to the target type and the size of the bounding box, and estimate a first coordinate of the target through the estimated distance of the target;
a second sensor data acquisition module 34, configured to acquire, according to the data acquisition time of the first sensor, two consecutive point cloud data of the second sensor before a target time, and predict the point cloud data of the target time;
the fusion module 35 is configured to map the point cloud data of the target time to a two-dimensional space of the target, screen out target point cloud data, and obtain a real physical size of the target according to the target point cloud data, where the target point cloud data includes the point cloud data mapped into the bounding box and conforming to the first coordinate.
Fig. 4 is a schematic block diagram of a multi-sensor target detection fusion device based on distance estimation according to another embodiment of the present invention. Referring now to FIG. 4, one embodiment of the present invention provides a multi-sensor target detection fusion apparatus based on distance estimation, comprising:
a sensor calibration module 41, configured to calibrate a first sensor and a second sensor, respectively, obtain a transformation matrix between the first sensor and the second sensor, and obtain an internal reference of the first sensor;
a target information summary table predefining module 42 for predefining a target information summary table including a usual physical size of all targets detected by the visual detection neural network;
the target information sub-table module 43 is configured to acquire, in real time, a first width, a second width, and a height of each target, and store the first width, the second width, and the height in the target information sub-table, where each target corresponds to one target information sub-table;
a first sensor data acquisition module 44, configured to perform target detection on a visual image acquired by the first sensor through the visual detection neural network, to obtain two-dimensional information of the target, where the two-dimensional information of the target includes a target type and a bounding box, obtain an estimated distance of the target according to the target type and the size of the bounding box, and estimate a first coordinate of the target through the estimated distance of the target;
a second sensor data acquisition module 45, configured to acquire, according to the data acquisition time of the first sensor, two consecutive point cloud data of the second sensor before a target time, and predict point cloud data of the target time;
the fusion module 46 is configured to map the point cloud data of the target time to a two-dimensional space of the target, screen out target point cloud data, and obtain a real physical size of the target according to the target point cloud data, where the target point cloud data includes the point cloud data mapped into the bounding box and conforming to the first coordinate. After obtaining the real three-dimensional coordinates of the target, the fusion module 46 can feed back the coordinate information to the target information sub-table module 43, so that the target information sub-table can acquire more accurate target information, and the target information can be detected more accurately.
In summary, according to the multi-sensor target detection fusion method and device based on distance estimation provided by the embodiment of the invention, the target information summary table is predefined, and the target information summary table comprises the common physical dimensions of all targets detected by the visual detection neural network; performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target; acquiring continuous twice point cloud data of the second sensor before target time according to the data acquisition time of the first sensor, and predicting the point cloud data of the target time; mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate; the data acquired by the first sensor and the second sensor are fused, so that the target information is detected more accurately, and an accurate detection result can be obtained particularly under the condition that partial shielding exists between targets;
further, the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, the first width, the second width and the height of the target in the corresponding target information sub-table are obtained, and the first width, the second width and the height are stored in the target information sub-table, so that the size of the target is updated in real time, and the target information is detected more accurately;
Further, obtaining the second coordinate of the target based on the second sensor includes calculating the average value of the point cloud data conforming to the first coordinate to obtain the three-dimensional center point coordinate of the target, and calculating the real physical size of the target; if the number of points in the point cloud data conforming to the first coordinate is less than a first preset value, the coordinate estimated from the first sensor is used to calculate the real physical size of the target, thereby reducing the effect of noise.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. A multi-sensing target detection fusion method based on distance estimation is characterized by comprising the following steps:
calibrating a first sensor and a second sensor respectively to obtain a transformation matrix between the first sensor and the second sensor and obtain an internal reference of the first sensor;
predefining a target information summary table, the target information summary table comprising a common physical size of all targets detected by the visual detection neural network;
performing target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated through the estimated distance of the target;
acquiring, according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time, and predicting the point cloud data of the target time;
and mapping the point cloud data of the target time to a two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which is mapped into the boundary frame and accords with the first coordinate.
2. The distance estimation-based multi-sensor target detection fusion method of claim 1, wherein the first sensor is at least one monocular camera and the second sensor is at least one lidar.
3. The multi-sensor target detection fusion method based on distance estimation according to claim 1, wherein the target information summary table comprises a plurality of target information sub-tables, each target corresponds to one target information sub-table, and the distance of the target is estimated from the size of the bounding box according to the pinhole projection relation:

d1 = fx · W1 / w,  d2 = fx · W2 / w,  d3 = fy · H / h

wherein fx and fy are internal references of the first sensor; W1, W2 and H are the first width, the second width and the height of the target in the corresponding target information sub-table, and the first width, the second width and the height in the target information sub-table are fed back to the predefined target information summary table; w and h are the pixel width and height of the target, obtained from the size of the target's bounding box; and d1, d2 and d3 are the preliminary estimated distances of the target based on the first width, the second width and the height, respectively.
4. The multi-sensor target detection fusion method based on distance estimation according to claim 3, wherein the values of the ratios are compared with 1 and the final estimated distance of the target is selected from the preliminary estimated distances according to the result of the comparison, and wherein the first ratio r1, the second ratio r2 and the third ratio r3 are calculated from the preliminary estimated distances and from the pixel width w and height h of the target.
5. The method for multi-sensor target detection fusion based on distance estimation according to claim 4, wherein,
the first coordinate of the target is estimated from the estimated distance of the target: the three-dimensional Cartesian coordinates (x1, y1, z1) of the target relative to the first sensor are calculated from the final estimated distance, the center point coordinates (u, v) of the target's bounding box and the internal reference of the first sensor, wherein x1 indicates the front-back direction, y1 indicates the left-right direction and z1 indicates the up-down direction.
6. The distance estimation-based multi-sensor target detection fusion method according to claim 5, wherein
the first coordinate (x1, y1, z1) is converted into the coordinate (x2, y2, z2) of the target relative to the second sensor by means of the transformation matrix between the first sensor and the second sensor obtained during calibration.
7. The method for multi-sensor target detection fusion based on distance estimation according to claim 1, wherein,
the point cloud data of the target time is predicted from the two consecutive frames of point cloud data acquired by the second sensor before the target time: the point cloud acquired at the last acquisition moment t1 and the point cloud acquired at the acquisition moment t2 before that are extrapolated to the current (target) moment t.
8. The distance estimation-based multi-sensor target detection fusion method according to claim 6, wherein whether a point of the point cloud data conforms to the first coordinate is judged with an allowable error: a point is regarded as conforming to the first coordinate if its deviation from the first coordinate (converted into the second sensor's coordinate system) does not exceed the allowable error.
9. The distance estimation-based multi-sensor target detection fusion method according to claim 8, wherein obtaining the second coordinate of the target based on the second sensor comprises calculating the average value of the point cloud data conforming to the first coordinate to obtain the three-dimensional center point coordinate of the target, and calculating the real physical size of the target; if the number of points in the point cloud data conforming to the first coordinate is less than a first preset value, the coordinate estimated from the first sensor is used to calculate the real physical size of the target.
10. A multi-sensor target detection fusion device based on distance estimation, comprising:
the sensor calibration module is used for calibrating a first sensor and a second sensor respectively, obtaining a transformation matrix between the first sensor and the second sensor and obtaining an internal reference of the first sensor;
a target information summary predefined module for predefining a target information summary comprising a common physical size of all targets detected by the visual detection neural network;
the first sensor data acquisition module is used for carrying out target detection on the visual image acquired by the first sensor through the visual detection neural network to obtain two-dimensional information of the target, wherein the two-dimensional information of the target comprises a target type and a boundary frame, the estimated distance of the target is obtained according to the target type and the size of the boundary frame, and the first coordinate of the target is estimated according to the estimated distance of the target;
the second sensor data acquisition module is used for acquiring, according to the data acquisition time of the first sensor, the two consecutive frames of point cloud data acquired by the second sensor before the target time, and predicting the point cloud data of the target time;
the fusion module is used for mapping the point cloud data of the target time to the two-dimensional space of the target, screening out target point cloud data, and obtaining the real physical size of the target according to the target point cloud data, wherein the target point cloud data comprises the point cloud data which are mapped into the boundary frame and accord with the first coordinates.
CN202310042622.XA 2023-01-28 2023-01-28 Multi-sensing target detection fusion method and device based on distance estimation Active CN116052121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310042622.XA CN116052121B (en) 2023-01-28 2023-01-28 Multi-sensing target detection fusion method and device based on distance estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310042622.XA CN116052121B (en) 2023-01-28 2023-01-28 Multi-sensing target detection fusion method and device based on distance estimation

Publications (2)

Publication Number Publication Date
CN116052121A true CN116052121A (en) 2023-05-02
CN116052121B CN116052121B (en) 2023-06-27

Family

ID=86123678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310042622.XA Active CN116052121B (en) 2023-01-28 2023-01-28 Multi-sensing target detection fusion method and device based on distance estimation

Country Status (1)

Country Link
CN (1) CN116052121B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210302534A1 (en) * 2017-06-13 2021-09-30 Veoneer Sweden Ab Error estimation for a vehicle environment detection system
CN111464978A (en) * 2019-01-22 2020-07-28 岳秀兰 Vehicle remote driving system established by connection of primary wireless equipment and secondary wireless equipment through Internet of things
CN110244322A (en) * 2019-06-28 2019-09-17 东南大学 Pavement construction robot environment sensory perceptual system and method based on Multiple Source Sensor
CN112396650A (en) * 2020-03-30 2021-02-23 青岛慧拓智能机器有限公司 Target ranging system and method based on fusion of image and laser radar
US20230014874A1 (en) * 2020-10-22 2023-01-19 Tencent Technology (Shenzhen) Company Limited Obstacle detection method and apparatus, computer device, and storage medium
CN112652016A (en) * 2020-12-30 2021-04-13 北京百度网讯科技有限公司 Point cloud prediction model generation method, pose estimation method and device
CN112733678A (en) * 2020-12-31 2021-04-30 深兰人工智能(深圳)有限公司 Ranging method, ranging device, computer equipment and storage medium
CN113436258A (en) * 2021-06-17 2021-09-24 中国船舶重工集团公司第七0七研究所九江分部 Offshore pontoon detection method and system based on fusion of vision and laser radar
CN114708585A (en) * 2022-04-15 2022-07-05 电子科技大学 Three-dimensional target detection method based on attention mechanism and integrating millimeter wave radar with vision
CN115542312A (en) * 2022-11-30 2022-12-30 苏州挚途科技有限公司 Multi-sensor association method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BENEDIKT MERSCH et al.: "Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks", 5th Conference on Robot Learning (CoRL 2021), pages 1-11 *
徐泽楷 (Xu Zekai): "多模态传感器融合技术研究" [Research on Multi-Modal Sensor Fusion Technology], 《中国优秀硕士论文电子期刊》 [China Master's Theses Electronic Journal], pages 27-47 *

Also Published As

Publication number Publication date
CN116052121B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
EP3367677B1 (en) Calibration apparatus, calibration method, and calibration program
CN110942449A (en) Vehicle detection method based on laser and vision fusion
JP4943034B2 (en) Stereo image processing device
WO2016117200A1 (en) Outside environment recognition device for vehicles and vehicle behavior control device using same
KR20190102665A (en) Calibration system and method using real-world object information
CN110738121A (en) front vehicle detection method and detection system
JP2008186246A (en) Moving object recognizing device
CN114359181B (en) Intelligent traffic target fusion detection method and system based on image and point cloud
CN112562093B (en) Object detection method, electronic medium, and computer storage medium
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
CN113205604A (en) Feasible region detection method based on camera and laser radar
US20190297314A1 (en) Method and Apparatus for the Autocalibration of a Vehicle Camera System
Petrovai et al. A stereovision based approach for detecting and tracking lane and forward obstacles on mobile devices
CN111723778B (en) Vehicle distance measuring system and method based on MobileNet-SSD
CN112130153A (en) Method for realizing edge detection of unmanned vehicle based on millimeter wave radar and camera
CN111538008B (en) Transformation matrix determining method, system and device
CN114463303A (en) Road target detection method based on fusion of binocular camera and laser radar
CN114359865A (en) Obstacle detection method and related device
CN113029185A (en) Road marking change detection method and system in crowdsourcing type high-precision map updating
CN116052121B (en) Multi-sensing target detection fusion method and device based on distance estimation
CN114152942B (en) Millimeter wave radar and vision second-order fusion multi-classification target detection method
CN114495038B (en) Post-processing method for automatic driving detection marking data
CN106874837B (en) Vehicle detection method based on video image processing
CN113869440A (en) Image processing method, apparatus, device, medium, and program product
CN114518106A (en) High-precision map vertical element updating detection method, system, medium and device

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant