CN113569803A - Multi-mode data fusion lane target detection method and system based on multi-scale convolution


Info

Publication number: CN113569803A
Application number: CN202110921918.XA
Authority: CN (China)
Prior art keywords: lane, fusion, channel, data, target detection
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 张国英 (Zhang Guoying), 高鑫 (Gao Xin), 熊一瑾 (Xiong Yijin)
Current/Original Assignee: China University of Mining and Technology Beijing (CUMTB)
Filing date: 2021-08-12
Publication date: 2021-10-29
Classifications

    • G06F18/253 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Fusion techniques of extracted features
    • G06N3/045 — Physics; Computing; Computing arrangements based on biological models; Neural networks; Architecture; Combinations of networks
    • G06N3/08 — Physics; Computing; Computing arrangements based on biological models; Neural networks; Learning methods
    • G06T17/20 — Physics; Computing; Image data processing or generation; Three dimensional [3D] modelling; Finite element generation, e.g. wire-frame surface description, tesselation


Abstract

The invention discloses a multi-modal data fusion lane target detection method and system based on multi-scale convolution. First, the raw lane lidar point cloud data are preprocessed to obtain three-channel pseudo point cloud data aligned with the RGB image. The lane three-channel pseudo point cloud data are then fused with the corresponding RGB image data. In the convolution stage after this multi-modal fusion, the fused feature channels are recomputed with multi-scale convolution to correct their weights. Finally, the corrected lane multi-modal fusion data are input into a pre-established and trained lane target detection model, which predicts lane targets and yields the lane target detection result. The method improves the utilization of the advantageous features of different modalities, effectively improves the accuracy of the target detection model, offers good real-time performance, and is applicable to different target detection models.

Description

Multi-mode data fusion lane target detection method and system based on multi-scale convolution
Technical Field
The invention relates to the technical field of target detection, in particular to a multi-mode data fusion lane target detection method and system based on multi-scale convolution.
Background
In recent years, object detection has made remarkable progress in face recognition, image recognition, video recognition, automatic driving, and other fields. Object detection is particularly important in automatic driving: to ensure safety on the road, all pedestrians and vehicles on the road surface must be detected accurately. Pedestrians are the group at highest risk on the road and the most vulnerable in traffic accidents, so pedestrian recognition and pedestrian trajectory prediction are research hotspots in the current safe-driving field. Vehicles are the main agents in a driving scene, and detecting other vehicles running on the road is the most important part of obstacle detection, which is significant for safe driving, emergency collision avoidance, and the like.
The current mainstream approach to target detection in automatic driving operates on camera images; data from other sensors, such as lidar point clouds that carry depth information, are difficult to apply because of their high computational complexity. However, relying on camera images alone has serious limitations: illumination changes degrade the detection effect, detection accuracy is low when people and vehicles are dense, and targets whose color resembles the background are prone to false and missed detections. Target detection that relies on purely visual images loses depth information, is easily affected by environment and weather, and its accuracy is hard to improve in application scenarios while lacking robustness. By contrast, lidar point cloud data, in which every point carries depth information, have natural advantages for these problems, so multi-modal fusion for target detection in autonomous driving is both reasonable and necessary. The current multi-modal fusion schemes, however, have several disadvantages:
1. High computation cost: point cloud data have high computational complexity, and computing directly on the point cloud consumes a large amount of time and computing resources, severely hurting real-time performance and failing to meet the requirements of automatic driving.
2. The characteristics of multi-modal data are ignored: one feasible method is to project the depth information in the point cloud onto a two-dimensional plane and fuse the depth image with the camera image, but the camera image and the lidar point cloud are data of different modalities with large differences. Crudely placing the multi-modal data into one feature space for fusion can suppress their respective advantages and can even introduce noise.
3. The importance of the multi-modal data is not computed: in the feature extraction process of a deep learning network, some feature channels contribute greatly to the detection result while others contribute little, so fusing with a fixed ratio is unreasonable.
Disclosure of Invention
The invention aims to provide a multi-modal data fusion lane target detection method and system based on multi-scale convolution.
The purpose of the invention is realized by the following technical scheme:
A method of multi-modal data fusion lane target detection based on multi-scale convolution, the method comprising:
step 1, firstly, preprocessing the raw lidar point cloud data to obtain three-channel pseudo point cloud data aligned with an RGB image;
step 2, performing multi-modal data fusion on the three-channel pseudo point cloud data obtained in step 1 and the corresponding RGB image data, wherein in the multi-modal data fusion process the feature channels of the different modalities are weighted and the fusion weight of each feature channel is acquired adaptively according to its importance;
step 3, in the convolution stage after the multi-modal data fusion, recomputing the fused feature channels with multi-scale convolution so as to correct the weights of the fused feature channels;
and step 4, inputting the lane multi-modal fusion data corrected in step 3 into a pre-established and trained lane target detection model, and predicting lane targets to obtain the lane target detection result.
According to the technical solution provided by the invention, the method improves the utilization of the advantageous features of different modalities, effectively improves the accuracy of the lane target detection model, offers good real-time performance, and is applicable to different target detection models.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for multi-modal data fusion lane target detection based on multi-scale convolution according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multi-modal data fusion process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a multi-scale calibration process according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a system according to an embodiment of the present invention;
fig. 5 is a structural diagram of a lane object detection model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them, and they do not limit the present invention. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a method for multi-modal data fusion lane target detection based on multi-scale convolution according to an embodiment of the present invention, where the method includes:
step 1, firstly, preprocessing lane original laser radar point cloud data to obtain three-channel pseudo point cloud data aligned with an RGB image;
In this step, the raw lane lidar point cloud data are the original point cloud data collected by the lidar device, stored as a .bin file; the lane RGB image is a three-channel color image acquired by a monocular camera.
The preprocessing specifically comprises the following steps:
Because the raw lidar point cloud is sparse, target detection directly on the raw point cloud is not ideal, so the lidar point cloud is first completed with a K-nearest-neighbor algorithm; specifically, K-nearest-neighbor-search point cloud interpolation is applied with K = 2: the 3 nearest points of each blank pixel are found by the distance between points, and a weighted average according to the normalized pixel distance is computed as the value of the blank pixel, yielding dense point cloud data. This completion makes the point cloud represent the target object more clearly and improves detection accuracy to a certain extent.
The dense point cloud data are then projected as a front view onto the imaging plane of the monocular camera and aligned with the RGB image acquired by that camera, and the reflectivity of the dense point cloud data is obtained; the projection result is a single-channel image of the same size as the RGB image, whose pixel values are the reflectivity at the positions where the dense point cloud aligns with the RGB pixels.
The height information and the depth information of the dense point cloud data are projected in the same way to obtain two further single-channel images, and the three single-channel images are stacked along the channel dimension to obtain three-channel pseudo point cloud data aligned with the RGB image.
The upper half of the three-channel pseudo point cloud data can further be cropped to better reduce the influence of irrelevant background.
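For illustration only, a minimal Python/NumPy sketch of this preprocessing is given below. It is a sketch under assumptions, not the patent's implementation: the KITTI-style .bin layout (x, y, z, reflectance), the 3×4 camera projection matrix `P`, the function name, and the use of scipy's `cKDTree` are all our own choices. The patent states K = 2 while searching the 3 nearest points of each blank pixel; the sketch follows the 3-nearest-points reading.

```python
import numpy as np
from scipy.spatial import cKDTree

def lidar_to_pseudo_pointcloud(bin_path, P, img_h, img_w, k=3):
    """Project a lidar scan to a 3-channel (reflectance, height, depth)
    image aligned with the RGB frame, then densify blank pixels by
    K-nearest-neighbor interpolation, as the patent describes."""
    pts = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)  # x,y,z,r (KITTI-style assumption)
    pts = pts[pts[:, 0] > 0]                       # keep points in front of the camera
    # homogeneous projection onto the image plane (P: assumed 3x4 camera matrix)
    uvw = (P @ np.c_[pts[:, :3], np.ones(len(pts))].T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    keep = (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
    u, v, pts = u[keep], v[keep], pts[keep]

    sparse = np.zeros((img_h, img_w, 3), np.float32)
    sparse[v, u, 0] = pts[:, 3]                    # reflectance channel
    sparse[v, u, 1] = pts[:, 2]                    # height channel
    sparse[v, u, 2] = pts[:, 0]                    # depth channel

    # densify blank pixels: inverse-distance weights, normalized (one reading
    # of the patent's "weighted average according to the normalized pixel distance")
    filled = np.argwhere(sparse.any(axis=2))
    blank = np.argwhere(~sparse.any(axis=2))
    dist, idx = cKDTree(filled).query(blank, k=k)  # k nearest filled pixels per blank pixel
    w = 1.0 / np.maximum(dist, 1e-6)
    w /= w.sum(axis=1, keepdims=True)
    vals = sparse[filled[:, 0], filled[:, 1]]      # (n_filled, 3) channel values
    sparse[blank[:, 0], blank[:, 1]] = np.einsum('nk,nkc->nc', w, vals[idx])
    return sparse                                  # H x W x 3 pseudo point cloud
```

Cropping the upper half of the result, as suggested above, is then a simple slice such as `pseudo[img_h // 2:]` (again an assumption about where the crop line lies).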
Step 2, performing multi-modal data fusion on the lane three-channel pseudo point cloud data obtained in step 1 and the corresponding RGB image data;
In the multi-modal data fusion process, the feature channels of the different modalities are weighted, and the fusion weight of each feature channel is acquired adaptively according to its importance;
fig. 2 is a schematic diagram of a process of multi-modal data fusion according to an embodiment of the present invention, where the specific process is as follows:
First, convolution is applied to the lane RGB image data and the three-channel pseudo point cloud data (Pseudo LiDAR in Fig. 2) with kernels of size 3×3 and 5×5 respectively, and all operations in the two branches keep the number of output channels consistent. Specifically, after the RGB image data and the three-channel pseudo point cloud data are input, primary low-dimensional features are first extracted from each; features are then extracted from the RGB image data with a 3×3 convolution kernel, while the three-channel pseudo point cloud data are processed again with a 5×5 convolution kernel so that the depth information is better utilized.
The output features of the two branches are then merged, each channel feature being the point-by-point sum of the two branches, and global average pooling is applied to further refine the features; the formula for global average pooling is:
$$s_c = F_{gp}(U_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} U_c(i, j)$$
where $H$ and $W$ are the size parameters of a single feature channel, its height and width respectively; $U$ is the set of feature channels after the outputs of the two branches are merged; the $c$-th element of $s$ is computed by shrinking the merged two-branch output $U$ over the spatial extent $H \times W$; and $(i, j)$ are the coordinates of a feature point on the feature channel.
Feature mapping is then applied to the average-pooled features by a fully-connected operation, whose formula is:
$$z = F_{fc}(s) = \delta\big(B(Ws)\big)$$
where $\delta$ is the ReLU activation function; $B$ denotes a batch normalization operation; $W$ is a $d \times C$ dimensional weight matrix (in this patent $d$ = 8, and $C$ is the channel dimension before branch merging); and $s$ is the output of the global average pooling.
A softmax mapping then yields the weight of each channel, which is assigned to the feature channels of each modality's data; the probability values returned by the softmax function also represent the attention weights of the feature channels at the different spatial scales.
For all feature channels of a single branch, the feature values of each channel are multiplied by the attention weight at the corresponding position, giving the weighted single-branch result.
The two weighted branch results are then fused by an add operation, completing the multi-modal data fusion.
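The fusion just described can be read as a selective-kernel-style, two-branch channel attention. The PyTorch sketch below follows that reading, and is a sketch only: the 3×3/5×5 kernel split, the GAP → FC → softmax weight computation, the reduction dimension d = 8, and the final weighted add come from the text above, while the module name, channel counts, and layer layout are assumptions.

```python
import torch
import torch.nn as nn

class MultiModalFusion(nn.Module):
    """Two-branch fusion: 3x3 conv on RGB features, 5x5 conv on pseudo point
    cloud features, then per-channel attention weights computed from the
    merged features (GAP -> FC -> softmax) and a weighted add."""
    def __init__(self, channels: int, d: int = 8):
        super().__init__()
        self.rgb_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.pcd_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 5, padding=2, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.gap = nn.AdaptiveAvgPool2d(1)          # s_c = F_gp(U_c)
        self.fc = nn.Sequential(                     # z = delta(B(W s)), W: d x C
            nn.Linear(channels, d, bias=False),
            nn.BatchNorm1d(d), nn.ReLU(inplace=True))
        self.attn = nn.Linear(d, channels * 2)       # one weight vector per branch

    def forward(self, rgb_feat, pcd_feat):
        a, b = self.rgb_conv(rgb_feat), self.pcd_conv(pcd_feat)
        u = a + b                                    # point-by-point merge of the branches
        s = self.gap(u).flatten(1)                   # global average pooling
        z = self.fc(s)
        w = torch.softmax(self.attn(z).view(-1, 2, a.size(1)), dim=1)
        wa = w[:, 0, :, None, None]                  # attention weights, RGB branch
        wb = w[:, 1, :, None, None]                  # attention weights, pseudo point cloud branch
        return wa * a + wb * b                       # weighted add = fused features
```

For example, `MultiModalFusion(64)(torch.rand(2, 64, 96, 320), torch.rand(2, 64, 96, 320))` returns a fused tensor of the same (2, 64, 96, 320) shape.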
Step 3, in the convolution process after multi-modal data fusion, calculating the fusion characteristic channel again by using multi-scale convolution so as to correct the weight of the fusion characteristic channel;
In this step, to correct the weights of the fused feature channels and improve model accuracy, the embodiment of the present invention applies multi-scale convolution correction after the second ResNet block of the feature extraction stage and computes the feature-channel weights under 3 different receptive fields; Fig. 3 is a schematic diagram of the multi-scale correction process described in the embodiment of the present invention. Specifically:
A Split operation is performed on the multi-modal fused data, i.e., high-dimensional features are extracted through convolution kernels of 3 different sizes; channel weights are then computed for the high-dimensional features by Fuse and Select operations in turn, i.e., the weight values of all feature channels are corrected.
The Split operation uses three convolution kernels, 3×3, 5×5 and 7×7, to extract multi-path features.
The Fuse operation gates the multi-path features, passing the information they carry on to the next layer of neurons.
The Select operation performs a weighted summation of the Fuse output with the three weight matrices to obtain the output vector of each branch.
Finally, the output vectors of the three branches are summed by an add operation to obtain the correction weights of the feature channels.
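Correspondingly, a hedged PyTorch sketch of the Split/Fuse/Select correction is given below. The 3×3/5×5/7×7 kernels, the per-channel softmax over the three paths, and the final add follow the steps above; everything else (names, layer layout, reuse of GAP → FC as the Fuse descriptor) is an assumption.

```python
import torch
import torch.nn as nn

class MultiScaleCorrection(nn.Module):
    """Split: extract features with 3x3 / 5x5 / 7x7 kernels.
    Fuse: sum the paths, then GAP and FC to a compact descriptor.
    Select: per-channel softmax weights over the three paths; the
    weighted sum is the corrected fused feature."""
    def __init__(self, channels: int, d: int = 8):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, k, padding=k // 2, bias=False),
                          nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
            for k in (3, 5, 7)])                      # Split: 3 receptive fields
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Linear(channels, d, bias=False),
                                nn.BatchNorm1d(d), nn.ReLU(inplace=True))
        self.attn = nn.Linear(d, channels * 3)        # one weight matrix per path

    def forward(self, x):
        feats = torch.stack([p(x) for p in self.paths], dim=1)  # (N, 3, C, H, W)
        u = feats.sum(dim=1)                                    # Fuse
        z = self.fc(self.gap(u).flatten(1))
        w = torch.softmax(self.attn(z).view(-1, 3, x.size(1)), dim=1)
        return (w[..., None, None] * feats).sum(dim=1)          # Select + add
```

For example, applying `MultiScaleCorrection(256)` after the second ResNet block (256 channels is our assumption for that stage) keeps the (N, 256, H, W) shape while re-weighting the fused channels.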
Through these two feature-channel weight calculations, during fusion and after it, the utilization of the advantageous features of the different modalities is greatly improved, which effectively improves the accuracy of the target detection model.
Fig. 5 shows the structure of a lane target detection model according to an embodiment of the present invention: on the Faster R-CNN framework, the classification and regression of lane targets are completed using an FPN and an RPN to obtain the lane target detection result.
Step 4, inputting the lane multi-modal fusion data corrected in step 3 into the pre-established and trained lane target detection model, and predicting lane targets to obtain the lane target detection result.
In this step, the weight-corrected lane multi-modal fusion data are sent in turn to the FPN and the RPN of the lane target detection model, and the RPN completes the classification and regression of lane targets from the 256-channel feature layers output by the last convolutional layer. The FPN (Feature Pyramid Network) detects targets of different scales: after feature information at different scales is extracted, a lateral-connection structure gives the feature maps of all scales high-level semantic information. The RPN (Region Proposal Network) generates candidate regions, combining anchor boxes of prior sizes to distinguish background from foreground and to bring the anchor boxes closer to the real targets.
Classification completes the category judgment of the lane target, regression frames the detected target with a bounding box (the finally obtained anchor box), and the classification and regression results together constitute the lane target detection result.
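For the detection stage, the sketch below wires a 6-channel fused input (RGB plus the three pseudo point cloud channels) into torchvision's Faster R-CNN with a ResNet-50 FPN backbone and RPN. The patent only specifies the Faster R-CNN/FPN/RPN combination; widening the first convolution and the normalization statistics to 6 channels, the class count, and the input size are our assumptions, and the code assumes torchvision ≥ 0.13.

```python
import torch
import torch.nn as nn
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Faster R-CNN with ResNet-50 FPN backbone and RPN, as in Fig. 5;
# 3 classes (e.g. background / pedestrian / vehicle) is an assumption.
model = fasterrcnn_resnet50_fpn(weights=None, weights_backbone=None, num_classes=3)

# Assumption: accept a 6-channel fused input instead of plain RGB.
model.backbone.body.conv1 = nn.Conv2d(6, 64, kernel_size=7, stride=2,
                                      padding=3, bias=False)
model.transform.image_mean = [0.0] * 6   # neutral normalization for the sketch
model.transform.image_std = [1.0] * 6

model.eval()
with torch.no_grad():
    fused = [torch.rand(6, 384, 1280)]   # one fused lane frame (size illustrative)
    detections = model(fused)            # list of {'boxes', 'labels', 'scores'}
```

In training mode the same model is called as `model(images, targets)` and returns the classification and regression losses, which matches the classification/regression training described above.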
Based on the above method, an embodiment of the present invention further provides a system for multi-modal data fusion lane target detection based on multi-scale convolution. Fig. 4 is a schematic structural diagram of the system according to the embodiment of the present invention; the system includes:
the data preprocessing module, used for preprocessing the raw lidar point cloud data to obtain three-channel pseudo point cloud data aligned with the RGB image;
the multi-modal data fusion module, used for performing multi-modal data fusion on the three-channel pseudo point cloud data obtained by the data preprocessing module and the corresponding RGB image data; in the multi-modal data fusion process, the module weights the feature channels of the different modalities and acquires the fusion weight of each feature channel adaptively according to its importance;
the multi-scale correction module, used for recomputing the fused feature channels with multi-scale convolution in the convolution stage after the multi-modal data fusion module performs the fusion, so as to correct the weights of the fused feature channels;
and the lane target prediction module, used for inputting the multi-modal fusion data corrected by the multi-scale correction module into a pre-established and trained lane target detection model and predicting lane targets to obtain the lane target detection result.
The specific implementation process of each module in the system is described in the embodiment of the method.
It is noted that matters well known to those skilled in the art are not described in detail herein.
In summary, the method and system of the embodiment of the invention have the advantages that:
1. The method makes better use of the lidar point cloud data: by using the generated and cropped three-channel pseudo point cloud image information, it avoids the time consumption and resource occupation caused by computing on a large number of three-dimensional points;
2. The adopted fusion method computes the weights of all feature channels reasonably and effectively, improving the utilization of the advantageous features of the two modalities, a clear advantage over the existing fixed-fusion-weight schemes;
3. After high-dimensional features are extracted from the fused data, the feature-channel weights are computed again and the weights of the fused data's feature channels are corrected; the two rounds of feature-channel weight calculation strongly highlight the feature channels that contribute most to the detection result and improve the detection accuracy of the model.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims. The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

Claims (6)

1. A multi-modal data fusion lane target detection method based on multi-scale convolution, characterized by comprising the following steps:
step 1, firstly, preprocessing the raw lidar point cloud data to obtain three-channel pseudo point cloud data aligned with an RGB image;
step 2, performing multi-modal data fusion on the three-channel pseudo point cloud data obtained in step 1 and the corresponding RGB image data, wherein in the multi-modal data fusion process the feature channels of the different modalities are weighted and the fusion weight of each feature channel is acquired adaptively according to its importance;
step 3, in the convolution stage after the multi-modal data fusion, recomputing the fused feature channels with multi-scale convolution so as to correct the weights of the fused feature channels;
and step 4, inputting the lane multi-modal fusion data corrected in step 3 into a pre-established and trained lane target detection model, and predicting lane targets to obtain the lane target detection result.
2. The method for multi-modal data fusion lane target detection based on multi-scale convolution according to claim 1, wherein the process of step 1 is specifically as follows:
firstly, completing the raw lidar point cloud data, specifically by applying K-nearest-neighbor-search point cloud interpolation with K = 2: the 3 nearest points of each blank pixel are found by the distance between points, and a weighted average according to the normalized pixel distance is computed as the value of the blank pixel, yielding dense point cloud data;
then projecting the dense point cloud data as a front view onto the imaging plane of a monocular camera, aligning it with the RGB image acquired by the monocular camera, and obtaining the reflectivity of the dense point cloud data;
and projecting the height information and the depth information of the dense point cloud data to obtain two further single-channel images, and stacking the three single-channel images to obtain three-channel pseudo point cloud data aligned with the RGB image.
3. The method for multi-modal data fusion lane target detection based on multi-scale convolution according to claim 1, wherein the process of step 2 is specifically as follows:
firstly, performing convolution on the RGB image data and the three-channel pseudo point cloud data with kernels of size 3×3 and 5×5 respectively, wherein all operations performed by the two branches keep the number of output channels consistent;
combining the output characteristics of the two branches, wherein each channel characteristic is formed by adding the two branches point by point;
then carrying out global average pooling to further refine the characteristics; wherein, the formula for global average pooling is as follows:
$$s_c = F_{gp}(U_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} U_c(i, j)$$
wherein $H$ and $W$ are the size parameters of a single feature channel, respectively its height and width; $U$ is the set of feature channels after the outputs of the two branches are merged; the $c$-th element of $s$ is computed by shrinking the merged two-branch output $U$ over the spatial extent $H \times W$; and $(i, j)$ are the coordinates of a feature point on the feature channel;
and performing feature mapping on the average-pooled features by a fully-connected operation, wherein the formula of the fully-connected operation is:
$$z = F_{fc}(s) = \delta\big(B(Ws)\big)$$
wherein $\delta$ is the ReLU activation function; $B$ denotes a batch normalization operation; $W$ is a $d \times C$ dimensional weight matrix; and $s$ is the output of the global average pooling;
then mapping by using a softmax function to obtain the weight of each channel, and assigning the weight of the characteristic channel of each modal data; the probability value returned by the softmax function also represents the attention weight of the characteristic channel in different spatial scales;
multiplying the characteristic value of all characteristic channels of a single branch by the attention weight of the corresponding position to obtain a single branch result after weighted calculation;
and fusing the two branch results by add operation to realize multi-mode data fusion.
4. The method for multi-modal data fusion lane target detection based on multi-scale convolution according to claim 1, wherein the process of step 3 is specifically as follows:
performing Split operation on the data after multi-modal fusion, namely extracting high-dimensional features through convolution kernels with 3 different sizes; calculating channel weights for the high-dimensional features by sequentially using Fuse and Select operations, namely correcting the weight values of all the feature channels;
the Split operation uses three convolution kernels, 3 × 3, 5 × 5 and 7 × 7, to extract multi-path features;
the Fuse operation gates the multi-path features, passing the information they carry on to the next layer of neurons;
performing weighted summation operation on the output of the Fuse operation and the three weight matrixes by Select operation to obtain output vectors of all branches;
and finally, adding the output vectors of the three branches by add operation to obtain the correction weight of the characteristic channel.
5. The method for multi-modal data fusion lane target detection based on multi-scale convolution according to claim 1, wherein in step 4, lane multi-modal fusion data after weight correction is sequentially sent to the FPN and RPN of the lane target detection model, and the RPN completes classification and regression of lane targets according to the 256-channel feature layers output by the last layer of convolution layer;
and the classification finishes the classification judgment of the lane target, the regression uses a boundary frame to frame the detected target, and the classification and regression results are the lane target detection results.
6. A system for multi-modal data fusion lane target detection based on multi-scale convolution, the system comprising:
the lane original laser radar point cloud data preprocessing module is used for acquiring three-channel pseudo point cloud data aligned with the lane RGB image;
the lane multi-mode data fusion module is used for performing multi-mode data fusion on the lane three-channel pseudo-point cloud data and the corresponding RGB image; carrying out weight assignment on the feature channels of different modal data, and acquiring fusion weights of all the feature channels in a self-adaptive manner according to the importance degree;
the multi-scale correction module is used for calculating the fused characteristic channel again by using multi-scale convolution so as to correct the weight of the fused characteristic channel;
and the lane target prediction module is used for inputting the multi-modal fusion data corrected by the multi-scale correction module into a pre-established and trained lane target detection model and predicting lane targets to obtain a lane target detection result.
Application CN202110921918.XA (priority and filing date 2021-08-12): Multi-mode data fusion lane target detection method and system based on multi-scale convolution — CN113569803A (en), pending.

Priority Applications (1)

CN202110921918.XA — Multi-mode data fusion lane target detection method and system based on multi-scale convolution (priority date 2021-08-12, filing date 2021-08-12).


Publications (1)

CN113569803A — publication date 2021-10-29.

Family

ID=78171312

Family Applications (1)

CN202110921918.XA (CN113569803A, pending) — Multi-mode data fusion lane target detection method and system based on multi-scale convolution.

Country Status (1)

Country: CN — CN113569803A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114842313A (en) * 2022-05-10 2022-08-02 北京易航远智科技有限公司 Target detection method and device based on pseudo-point cloud, electronic equipment and storage medium
CN114842313B (en) * 2022-05-10 2024-05-31 北京易航远智科技有限公司 Target detection method and device based on pseudo point cloud, electronic equipment and storage medium


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200160559A1 (en) * 2018-11-16 2020-05-21 Uatc, Llc Multi-Task Multi-Sensor Fusion for Three-Dimensional Object Detection
CN112572325A (en) * 2019-09-30 2021-03-30 福特全球技术公司 Adaptive sensor fusion
CN111967373A (en) * 2020-08-14 2020-11-20 东南大学 Self-adaptive enhanced fusion real-time instance segmentation method based on camera and laser radar
CN112731436A (en) * 2020-12-17 2021-04-30 浙江大学 Multi-mode data fusion travelable area detection method based on point cloud up-sampling

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yonglin Tian: "Adaptive and Azimuth-Aware Fusion Network of Multimodal Local Features for 3D Object Detection", arXiv:1910.04392, https://arxiv.org/abs/1910.04392 *
Key Laboratory of Agricultural Land Quality and Monitoring, Ministry of Natural Resources (自然资源部农用地质量与监控重点实验室), ed.: "China Agricultural Land Quality Development Research Report, 2019 Edition" (《中国农用地质量发展研究报告 2019版》), 30 April 2020
Chen Ying (陈莹): "模态自适应权值学习机制下的…" [title truncated in the source; roughly "Under a modality-adaptive weight learning mechanism…"], Optics and Precision Engineering (《光学精密工程》) *


Similar Documents

Publication — Title
CN110929692B (en) Three-dimensional target detection method and device based on multi-sensor information fusion
CN110175576B (en) Driving vehicle visual detection method combining laser point cloud data
CN108983219B (en) Fusion method and system for image information and radar information of traffic scene
EP4152204A1 (en) Lane line detection method, and related apparatus
CN113111887B (en) Semantic segmentation method and system based on information fusion of camera and laser radar
CN110738121A (en) front vehicle detection method and detection system
US20230213643A1 (en) Camera-radar sensor fusion using local attention mechanism
CN116685874A (en) Camera-laser radar fusion object detection system and method
JP2016009487A (en) Sensor system for determining distance information on the basis of stereoscopic image
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
CN112825192A (en) Object identification system and method based on machine learning
CN117058646B (en) Complex road target detection method based on multi-mode fusion aerial view
CN113643345A (en) Multi-view road intelligent identification method based on double-light fusion
CN117274749B (en) Fused 3D target detection method based on 4D millimeter wave radar and image
CN116830164A (en) LiDAR decorrelated object detection system and method
Dewangan et al. Towards the design of vision-based intelligent vehicle system: methodologies and challenges
CN115937819A (en) Three-dimensional target detection method and system based on multi-mode fusion
CN114764856A (en) Image semantic segmentation method and image semantic segmentation device
Kühnl et al. Visual ego-vehicle lane assignment using spatial ray features
CN116612468A (en) Three-dimensional target detection method based on multi-mode fusion and depth attention mechanism
CN116704304A (en) Multi-mode fusion target detection method of mixed attention mechanism
Jie et al. Llformer: An efficient and real-time lidar lane detection method based on transformer
CN113255779A (en) Multi-source perception data fusion identification method and system and computer readable storage medium
CN117808689A (en) Depth complement method based on fusion of millimeter wave radar and camera
CN112529011A (en) Target detection method and related device

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
RJ01 — Rejection of invention patent application after publication

Application publication date: 2021-10-29