CN115393684A - Anti-interference target detection method based on automatic driving scene multi-mode fusion - Google Patents

Anti-interference target detection method based on automatic driving scene multi-mode fusion

Info

Publication number
CN115393684A
CN115393684A
Authority
CN
China
Prior art keywords
different
target
target detection
features
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211321720.9A
Other languages
Chinese (zh)
Other versions
CN115393684B (en)
Inventor
刘寒松
王永
王国强
刘瑞
焦安健
李贤超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sonli Holdings Group Co Ltd
Original Assignee
Sonli Holdings Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sonli Holdings Group Co Ltd filed Critical Sonli Holdings Group Co Ltd
Priority to CN202211321720.9A priority Critical patent/CN115393684B/en
Publication of CN115393684A publication Critical patent/CN115393684A/en
Application granted granted Critical
Publication of CN115393684B publication Critical patent/CN115393684B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The invention belongs to the technical field of target detection and relates to an anti-interference target detection method based on multi-modal fusion for automatic driving scenes. The method adopts a complementary multi-modal detection scheme combining visible-light and near-infrared images and exploits the multi-modal data at the feature level: different backbone networks are designed to extract the basic features of the visible-light and near-infrared images according to their different apparent characteristics; because the two groups of images differ in information richness and learning difficulty, a learning rate constraint module is applied in each branch to control the update rate of the different modal branches; the two groups of features are then fused by a modal fusion module so that both similar and complementary features are retained, i.e., common ground is sought while differences are preserved. The information of the two modalities is thus fully utilized, the detection accuracy is high, and the missed-detection rate in complex interference scenes is reduced.

Description

Anti-interference target detection method based on automatic driving scene multi-mode fusion
Technical Field
The invention belongs to the technical field of target detection, and relates to an anti-interference target detection method based on automatic driving scene multi-mode fusion.
Background
With the acceleration of urbanization, the number of vehicles in cities has increased sharply, and the accompanying traffic problems have become more pronounced. With the development of artificial intelligence and big-data technology, automatic driving can free the driver from heavy, mechanical driving, reduce traffic accidents caused by human factors such as driver error, and improve road traffic efficiency. The current difficulty of automatic driving lies in perception: although automatic driving performs well in clear scenes, many limitations remain in complex scenes; in particular, under low light at night and under the interference of complex weather such as rain, snow and fog, visibility drops sharply and the contrast between target and background degrades.
Automatic driving needs to rely on multiple sensors to compensate for the information loss of single-modal data; lidar, visible-light cameras, millimeter-wave radar and the like are usually used to perceive the surrounding traffic conditions. In the face of complex weather interference such as rain, snow, fog and haze, the visible-light image, although rich in texture information, suffers reduced image quality, which affects the detection result, so most existing solutions use lidar. However, long-distance targets are small in size and poor in detail, and their appearance is similar to particle interference such as rain and snow, which has its own structural morphology and motion patterns, making them difficult to distinguish; moreover, non-uniform rain and fog cause the lidar to produce false returns in the resulting clumps of fog, which are then judged to be obstacles and greatly affect the system. Complex weather therefore affects the accuracy and completeness of the laser point cloud.
The near-infrared sensor has good transmissivity and images well under adverse conditions such as rain, snow, fog and low-light night scenes; rain and fog, which appear sparse and highlighted in the visible-light image, are almost invisible in the near infrared. A conventional low-cost lens can be used, as for visible light, and the near-infrared sensor can be integrated into the same sensor as the visible-light one, so that a common optical axis with the visible-light image is achieved; registration and alignment are therefore unnecessary, which reduces unnecessary workload and the cost of equipment acquisition. The computational cost of processing near-infrared images is also far lower than that of the 3D point-cloud data of a lidar, which facilitates deployment on resource-limited vehicle-mounted platforms. At the same time, the near infrared lacks rich texture and color information and is at a disadvantage for target identification, so combining it with visible light can compensate for the information loss of single-modal data; a major problem is then how to organically combine images of two modalities with different apparent characteristics for target detection and identification.
Target detection in automatic driving scenes therefore faces outstanding challenges under severe weather interference, and a more effective method for improving the anti-interference capability of detection is urgently needed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an anti-interference target detection method based on multi-modal fusion for automatic driving scenes. The method is designed to cope with the interference of low light at night and of complex weather such as rain, snow and fog in target detection for automatic driving scenes, and can also be used for target detection tasks under other complex interference conditions; it breaks through the detection bottleneck of a single modality and improves target detection performance.
In order to achieve the above purpose, the specific process of the anti-interference target detection of the present invention is as follows:
(1) Data set construction:
the method comprises the steps of collecting image data of an automatic driving scene under different scenes by using a visible light near-infrared co-optical axis camera, marking the data, constructing a data set, and enabling the data set to be 6:2:2, dividing the quantity ratio into a training set, a verification set and a test set;
(2) Differential feature extraction:
inputting the visible-light image and the near-infrared image into different differentiated feature extraction networks respectively to extract features and obtain two groups of basic features;
(3) Learning rate constraint:
inputting the two groups of basic features obtained in step (2) into a learning rate constraint module, which controls the update rates of the different modal branches;
(4) Modal feature fusion:
inputting the basic features output by the learning rate constraint module in step (3) into a modal feature fusion module, and fusing the features of the two modalities to obtain modal fusion features for subsequent target detection;
(5) Target detection:
inputting the modal fusion features obtained in step (4) into a target detection module, which appends two convolution layers with 3 × 3 kernels to the extracted features, sets three anchor boxes with different aspect ratios at each position of the output feature map, and then learns the target-box position information and classification information with two fully connected layers whose parameters are not shared, where the position information is the deviation of the real target position from the anchor boxes and the classification information is the category of the target box, such as background, pedestrian or vehicle.
(6) Training and testing the network, and outputting a detection result:
training with the training data obtained in step (1): near-infrared and visible-light images of size 640 × 512 are fed in turn into the differentiated feature extraction networks, the learning rate constraint module and the target detection module, and the predicted target positions and categories are output; the errors between the predictions and the real target positions and categories are computed, using Focal loss for the category error and smooth L1 loss for the error between the predicted and real target positions; the parameters are updated by back-propagation, and after 200 complete iterations the best model parameters are saved as the final trained parameters, giving the trained anti-interference target detection network; the test data obtained in step (1) are then tested to obtain the positions and categories of the targets.
As a further technical solution of the present invention, the different scenes in step (1) include low light at night and rainy, snowy and foggy weather scenes, with 1500 images collected in the different modalities; the annotation of the data includes the position and category of each target, where the position comprises the target's center point, length and width, and the category comprises two classes, vehicle and pedestrian.
As a further technical solution of the present invention, the differentiated feature extraction network for extracting visible-light image features in step (2) comprises five convolution modules, each comprising two convolution layers, three activation layers and one max-pooling layer; the differentiated feature extraction network for extracting near-infrared image features comprises five convolution modules, each comprising one convolution layer, two activation layers and one average-pooling layer.
As a further technical solution of the present invention, the learning rate constraint module in step (3) consists of one fully connected layer; during back-propagation of the gradient, the gradients of the different modalities are multiplied by different coefficients to control the learning rates of the different modalities, while the fully connected layer increases the nonlinear fitting capability of the network.
As a further technical solution of the present invention, the modal feature fusion module in step (4) consists of two branches: one branch multiplies the features of the two modalities element-wise, increasing the saliency of similar features ('seeking common ground'); the other branch adds the features of the two modalities, preserving the differing features ('preserving differences').
Compared with the prior art, the invention adopts a complementary multi-modal detection scheme combining visible-light and near-infrared images in order to cope with low light at night and complex weather interference such as rain, snow and fog in automatic driving scenes while addressing the information loss of a single modality. The multi-modal data are fully exploited at the feature level: different backbone networks are designed to extract the basic features of the visible-light and near-infrared images according to their different apparent characteristics; because the two groups of images differ in information richness and learning difficulty, a learning rate constraint module is used in each branch to control the update rate of the different modal branches; the two groups of features are then fused by the modal fusion module so that both similar and complementary features are retained, i.e., common ground is sought while differences are preserved. The information of the two modalities is thus fully utilized, the detection accuracy is high, and the missed-detection rate in complex interference scenes is reduced.
Drawings
Fig. 1 is a schematic diagram of a network architecture framework for implementing anti-interference target detection according to the present invention.
Fig. 2 is a block diagram of a process for implementing anti-interference target detection according to the present invention.
Fig. 3 is a first detection example of embodiment 2 of the present invention, wherein (a) is a detection result diagram of a Yolov3 target detection method, and (b) is a detection result diagram of the method of the present invention.
Fig. 4 is a second detection example of embodiment 2 of the present invention, in which (a) is a detection result diagram of the Yolov3 target detection method, and (b) is a detection result diagram of the method of the present invention.
Detailed Description
The invention will be further described by way of examples, without in any way limiting the scope of the invention, with reference to the accompanying drawings.
Example 1:
In this embodiment, the network structure shown in fig. 1 and the process shown in fig. 2 are used to implement anti-interference target detection, which specifically includes the following steps:
(1) And (3) data set construction:
collecting image data of automatic driving scenes under different conditions, including low light at night and complex weather such as rain, snow and fog, by using a visible-light/near-infrared co-optical-axis camera, with 1500 images collected in the different modalities; since the visible-light and near-infrared images share a common optical axis, no registration or alignment is needed, and the data are annotated directly to construct the data set, the annotation comprising the position and category of each target, where the position comprises the target's center point, length and width, and the category comprises two classes, vehicle and pedestrian; the data set is divided into a training set, a validation set and a test set in a 6:2:2 ratio;
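A minimal Python sketch of the 6:2:2 split described in this step is given below; the function name split_dataset and the variable pairs (a list of annotated co-optical-axis visible-light/near-infrared image pairs) are illustrative assumptions, not part of the patent.

```python
import random

def split_dataset(pairs, seed=0):
    """Split annotated visible/NIR image pairs into train/val/test at 6:2:2."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (pairs[:n_train],                  # training set
            pairs[n_train:n_train + n_val],   # validation set
            pairs[n_train + n_val:])          # test set
```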
(2) Differential feature extraction:
in view of the different apparent characteristics of the near-infrared and visible-light images, different differentiated feature extraction networks are designed to extract the basic features of the visible-light and near-infrared images respectively; the visible-light image has complex and variable textures and high information complexity, so its differentiated feature extraction network uses five convolution modules, each comprising two convolution layers, three activation layers and one max-pooling layer, the max-pooling layer being chosen because it is more sensitive to texture; the texture of the near-infrared image is relatively smooth and background interference is relatively low, so its differentiated feature extraction network also uses five convolution modules, each comprising one convolution layer, two activation layers and one average-pooling layer, the average-pooling layer being better suited to extracting features from the smooth, low-texture near-infrared image; two groups of basic features, visible-light and near-infrared, are thus obtained;
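The following PyTorch sketch shows one possible form of the two differentiated feature extraction networks described above. The layer ordering inside each convolution module, the channel widths and the single-channel near-infrared input are assumptions; the patent only specifies the number of convolution, activation and pooling layers per module.

```python
import torch.nn as nn

def _visible_block(c_in, c_out):
    # Visible-light convolution module: two 3x3 convolutions, three activations,
    # one max-pooling layer (ordering is an assumption).
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2), nn.ReLU(inplace=True),
    )

def _nir_block(c_in, c_out):
    # Near-infrared convolution module: one convolution, two activations,
    # one average-pooling layer.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.AvgPool2d(2), nn.ReLU(inplace=True),
    )

class DualBackbone(nn.Module):
    """Differentiated feature extraction: a separate trunk per modality."""
    def __init__(self, channels=(3, 32, 64, 128, 256, 256)):
        super().__init__()
        self.visible = nn.Sequential(*[
            _visible_block(channels[i], channels[i + 1]) for i in range(5)])
        nir_channels = (1,) + channels[1:]   # single-channel NIR input (assumption)
        self.nir = nn.Sequential(*[
            _nir_block(nir_channels[i], nir_channels[i + 1]) for i in range(5)])

    def forward(self, rgb, nir):
        # Returns the two groups of basic features, one per modality.
        return self.visible(rgb), self.nir(nir)
```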
(3) Learning rate constraints:
in view of the different information richness and learning difficulty of the two groups of images, a learning rate constraint module is used in each branch to control the update rate of the different modal branches; the learning rate constraint module takes the basic features of the different modalities as input and consists of one fully connected layer; during back-propagation of the gradient, the gradients of the different modalities are multiplied by different coefficients so as to control the learning rates of the different modalities, while the fully connected layer also increases the nonlinear fitting capability of the network;
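A minimal sketch of such a learning rate constraint module is given below, under the assumptions that the fully connected layer is applied per spatial position and that the per-modality coefficient is a fixed hyperparameter; the patent only states that the module consists of one fully connected layer and that the gradients of the different modalities are multiplied by different coefficients during back-propagation.

```python
import torch
import torch.nn as nn

class _GradScale(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by a coefficient
    in the backward pass."""
    @staticmethod
    def forward(ctx, x, coeff):
        ctx.coeff = coeff
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out * ctx.coeff, None

class LearningRateConstraint(nn.Module):
    """One fully connected layer followed by per-modality gradient scaling,
    so that each modal branch updates at its own rate."""
    def __init__(self, channels, coeff):
        super().__init__()
        self.fc = nn.Linear(channels, channels)
        self.coeff = coeff  # hypothetical fixed coefficient for this modality

    def forward(self, feat):                    # feat: (B, C, H, W)
        x = feat.permute(0, 2, 3, 1)            # channels last for the FC layer
        x = torch.relu(self.fc(x))              # FC plus activation for nonlinearity
        x = x.permute(0, 3, 1, 2)
        return _GradScale.apply(x, self.coeff)  # gradients into this branch are scaled
```

One instance of the module would be created per modality, for example with a smaller coefficient for the branch that is easier to fit, so that the two backbones update at different effective rates.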
(4) Modal feature fusion:
the basic features output by the learning rate constraint module in step (3) are input into a modal feature fusion module for fusing the features of the two modalities; the modal feature fusion module consists of two branches, the first of which multiplies the features of the two modalities element-wise to increase the saliency of similar features ('seeking common ground'), while the other adds the features of the two modalities to preserve the differing features ('preserving differences'); the output features of the two branches are combined to obtain the modal fusion features used for subsequent target detection;
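One possible realisation of the modal feature fusion module is sketched below; the element-wise product and sum follow the description, while merging the two branch outputs by channel concatenation and a 1 × 1 convolution is an assumption, since the patent only states that the branch outputs are combined.

```python
import torch
import torch.nn as nn

class ModalFusion(nn.Module):
    """Two-branch fusion: element-wise product emphasises shared responses,
    element-wise sum preserves complementary responses."""
    def __init__(self, channels):
        super().__init__()
        # 1x1 convolution to merge the concatenated branch outputs (assumption).
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_rgb, feat_nir):
        common = feat_rgb * feat_nir     # "seeking common ground": similar features reinforced
        different = feat_rgb + feat_nir  # "preserving differences": complementary features kept
        return self.merge(torch.cat([common, different], dim=1))
```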
(5) Target detection:
the modal fusion features are input into a target detection module, which appends two convolution layers with 3 × 3 kernels to the extracted features, sets three anchor boxes with different aspect ratios at each position of the output feature map, and then learns the target-box position information and classification information with two fully connected layers whose parameters are not shared, where the position information is the deviation of the real target position from the anchor box and the classification information is the category of the target box, such as background, pedestrian or vehicle;
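A sketch of the target detection module is given below, under the assumption that the two parameter-unshared fully connected layers are applied position-wise (i.e. as 1 × 1 convolutions), so that each of the three anchors at every feature-map position receives four box offsets and scores for the three classes background, pedestrian and vehicle.

```python
import torch.nn as nn

class DetectionHead(nn.Module):
    """Two 3x3 convolutions over the fused features, then two unshared heads
    for box offsets and class scores, three anchors per position."""
    def __init__(self, channels=256, num_anchors=3, num_classes=3):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.box_head = nn.Conv2d(channels, num_anchors * 4, 1)            # offsets w.r.t. anchors
        self.cls_head = nn.Conv2d(channels, num_anchors * num_classes, 1)  # class scores

    def forward(self, fused):
        x = self.trunk(fused)
        return self.box_head(x), self.cls_head(x)
```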
(6) Training and testing, and outputting a detection result:
training is performed with the training data obtained in step (1): near-infrared and visible-light images of size 640 × 512 are fed in turn into the differentiated feature extraction networks, the learning rate constraint module and the target detection module, and the predicted target positions and categories are output; the errors between the predictions and the real target positions and categories are computed, using Focal loss for the category error and smooth L1 loss for the error between the predicted and real target positions; the parameters are updated by back-propagation, and after 200 epochs the best model parameters are saved as the final trained parameters, giving the trained anti-interference target detection network; the test data obtained in step (1) are then tested to obtain the positions and categories of the targets.
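A minimal sketch of one training step with the stated losses is given below, assuming a hypothetical wrapper model that chains the backbones, the learning rate constraint, the fusion module and the detection head and returns predictions shaped like the targets; sigmoid_focal_loss from torchvision is used as a stand-in for the Focal loss, and cls_targets are assumed to be one-hot float tensors matching cls_pred.

```python
import torch.nn.functional as F
from torchvision.ops import sigmoid_focal_loss

def training_step(model, optimizer, rgb, nir, box_targets, cls_targets):
    """One optimisation step: forward over both modalities, Focal loss for the
    classes, smooth L1 loss for the box offsets, then back-propagation."""
    box_pred, cls_pred = model(rgb, nir)          # assumed wrapper over all modules
    cls_loss = sigmoid_focal_loss(cls_pred, cls_targets, reduction="mean")
    box_loss = F.smooth_l1_loss(box_pred, box_targets)
    loss = cls_loss + box_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```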
Example 2:
In this embodiment, the technical scheme of embodiment 1 is tested on the acquired data set. The test data set includes 150 pairs of vehicle infrared/visible-light images and 150 pairs of pedestrian infrared/visible-light images; the Yolov3 target detection method is tested with the visible-light data only, while this embodiment fuses the infrared and visible-light data. The experimental results show that, compared with the existing Yolov3 target detection method, the detection accuracy of this embodiment is improved from 86.2% to 92.4%. Two groups of real detection results are shown in fig. 3 and fig. 4, where (a) is the detection result of Yolov3 (using only the visible-light image) and (b) is the detection result of this embodiment using the infrared/visible-light multi-modal data, finally displayed on the infrared image. As the result figures show, this embodiment greatly reduces the missed-detection rate in complex interference scenes.
Network structures and algorithms not described in detail herein are all common in the art.
It is noted that the disclosed embodiments are intended to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments, but should be defined by the scope of the appended claims.

Claims (5)

1. An anti-interference target detection method based on automatic driving scene multi-mode fusion is characterized by comprising the following specific processes:
(1) Data set construction:
collecting image data of automatic driving scenes under different conditions by using a visible-light/near-infrared co-optical-axis camera, annotating the data to construct a data set, and dividing the data set into a training set, a validation set and a test set in a 6:2:2 ratio;
(2) Differential feature extraction:
inputting the visible-light image and the near-infrared image into different differentiated feature extraction networks respectively to extract features and obtain two groups of basic features;
(3) Learning rate constraint:
inputting the two groups of basic features obtained in step (2) into a learning rate constraint module, which controls the update rates of the different modal branches;
(4) Modal feature fusion:
inputting the basic features output by the learning rate constraint module in step (3) into a modal feature fusion module, and fusing the features of the two modalities to obtain modal fusion features for subsequent target detection;
(5) Target detection:
inputting the modal fusion features obtained in step (4) into a target detection module, which appends two convolution layers with 3 × 3 kernels to the extracted features, sets three anchor boxes with different aspect ratios at each position of the output feature map, and then learns the target-box position information and classification information with two fully connected layers whose parameters are not shared, where the position information is the deviation of the real target position from the anchor box and the classification information is the category of the target box;
(6) Training and testing the network, and outputting a detection result:
training with the training data obtained in step (1): near-infrared and visible-light images of size 640 × 512 are fed in turn into the differentiated feature extraction networks, the learning rate constraint module and the target detection module, and the predicted target positions and categories are output; the errors between the predictions and the real target positions and categories are computed, using Focal loss for the category error and smooth L1 loss for the error between the predicted and real target positions; the parameters are updated by back-propagation, and after 200 complete iterations the best model parameters are saved as the final trained parameters, giving the trained anti-interference target detection network; the test data obtained in step (1) are then tested to obtain the positions and categories of the targets.
2. The anti-interference target detection method based on multi-modal fusion of automatic driving scenes according to claim 1, wherein the different scenes in step (1) comprise low-light scenes at night and rain, snow and fog scenes, 1500 images are collected in the different modalities, and the annotation of the data comprises the position and category of each target, wherein the position comprises the target's center point, length and width, and the category comprises two classes, vehicle and pedestrian.
3. The anti-interference target detection method based on multi-modal fusion of automatic driving scenes according to claim 2, characterized in that the differentiated feature extraction network for extracting visible-light image features in step (2) comprises five convolution modules, each comprising two convolution layers, three activation layers and one max-pooling layer; the differentiated feature extraction network for extracting near-infrared image features comprises five convolution modules, each comprising one convolution layer, two activation layers and one average-pooling layer.
4. The anti-interference target detection method based on multi-modal fusion of automatic driving scenes according to claim 3, wherein the learning rate constraint module in step (3) is composed of one fully connected layer; during back-propagation of the gradient, the gradients of the different modalities are multiplied by different coefficients to control the learning rates of the different modalities, and the fully connected layer meanwhile increases the nonlinear fitting capability of the network.
5. The anti-interference target detection method based on multi-modal fusion of automatic driving scenes according to claim 4, wherein the modal feature fusion module in step (4) is composed of two branches: one branch multiplies the features of the two modalities element-wise to increase the saliency of similar features ('seeking common ground'); the other branch adds the features of the two modalities and retains the differing features ('preserving differences').
CN202211321720.9A 2022-10-27 2022-10-27 Anti-interference target detection method based on automatic driving scene multi-mode fusion Active CN115393684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211321720.9A CN115393684B (en) 2022-10-27 2022-10-27 Anti-interference target detection method based on automatic driving scene multi-mode fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211321720.9A CN115393684B (en) 2022-10-27 2022-10-27 Anti-interference target detection method based on automatic driving scene multi-mode fusion

Publications (2)

Publication Number Publication Date
CN115393684A (en) 2022-11-25
CN115393684B (en) 2023-01-24

Family

ID=84128353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211321720.9A Active CN115393684B (en) 2022-10-27 2022-10-27 Anti-interference target detection method based on automatic driving scene multi-mode fusion

Country Status (1)

Country Link
CN (1) CN115393684B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116626670A (en) * 2023-07-18 2023-08-22 小米汽车科技有限公司 Automatic driving model generation method and device, vehicle and storage medium
CN116861262A (en) * 2023-09-04 2023-10-10 苏州浪潮智能科技有限公司 Perception model training method and device, electronic equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209810A (en) * 2018-12-26 2020-05-29 浙江大学 Bounding box segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time in visible light and infrared images
WO2020181414A1 (en) * 2019-03-08 2020-09-17 北京数字精准医疗科技有限公司 Multi-spectral imaging system, apparatus and method, and storage medium
CN112418163A (en) * 2020-12-09 2021-02-26 北京深睿博联科技有限责任公司 Multispectral target detection blind guiding system
CN113920066A (en) * 2021-09-24 2022-01-11 国网冀北电力有限公司信息通信分公司 Multispectral infrared inspection hardware detection method based on decoupling attention mechanism
CN113963240A (en) * 2021-09-30 2022-01-21 西南电子技术研究所(中国电子科技集团公司第十研究所) Comprehensive detection method for multi-source remote sensing image fusion target
CN114219824A (en) * 2021-12-17 2022-03-22 南京理工大学 Visible light-infrared target tracking method and system based on deep network
CN114254696A (en) * 2021-11-30 2022-03-29 上海西虹桥导航技术有限公司 Visible light, infrared and radar fusion target detection method based on deep learning
CN114612937A (en) * 2022-03-15 2022-06-10 西安电子科技大学 Single-mode enhancement-based infrared and visible light fusion pedestrian detection method
US20220253639A1 (en) * 2021-02-01 2022-08-11 Inception Institute of Artificial Intelligence Ltd Complementary learning for multi-modal saliency detection
CN115131640A (en) * 2022-06-27 2022-09-30 武汉大学 Target detection method and system utilizing illumination guide and attention mechanism

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209810A (en) * 2018-12-26 2020-05-29 浙江大学 Bounding box segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time in visible light and infrared images
WO2020181414A1 (en) * 2019-03-08 2020-09-17 北京数字精准医疗科技有限公司 Multi-spectral imaging system, apparatus and method, and storage medium
CN112418163A (en) * 2020-12-09 2021-02-26 北京深睿博联科技有限责任公司 Multispectral target detection blind guiding system
US20220253639A1 (en) * 2021-02-01 2022-08-11 Inception Institute of Artificial Intelligence Ltd Complementary learning for multi-modal saliency detection
CN113920066A (en) * 2021-09-24 2022-01-11 国网冀北电力有限公司信息通信分公司 Multispectral infrared inspection hardware detection method based on decoupling attention mechanism
CN113963240A (en) * 2021-09-30 2022-01-21 西南电子技术研究所(中国电子科技集团公司第十研究所) Comprehensive detection method for multi-source remote sensing image fusion target
CN114254696A (en) * 2021-11-30 2022-03-29 上海西虹桥导航技术有限公司 Visible light, infrared and radar fusion target detection method based on deep learning
CN114219824A (en) * 2021-12-17 2022-03-22 南京理工大学 Visible light-infrared target tracking method and system based on deep network
CN114612937A (en) * 2022-03-15 2022-06-10 西安电子科技大学 Single-mode enhancement-based infrared and visible light fusion pedestrian detection method
CN115131640A (en) * 2022-06-27 2022-09-30 武汉大学 Target detection method and system utilizing illumination guide and attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHENGFANG ZHANG et al.: "Convolutional Dictionary Learning Using Global Matching Tracking (CDL-GMT): Application to Visible-Infrared Image Fusion", IEEE *
AN Haonan et al.: "Infrared target fusion detection algorithm based on pseudo-modality conversion", Acta Photonica Sinica *
ZHANG Dian et al.: "Heterogeneous face recognition based on near-infrared and visible light fusion with a lightweight network", Journal of Chinese Computer Systems *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116626670A (en) * 2023-07-18 2023-08-22 小米汽车科技有限公司 Automatic driving model generation method and device, vehicle and storage medium
CN116626670B (en) * 2023-07-18 2023-11-03 小米汽车科技有限公司 Automatic driving model generation method and device, vehicle and storage medium
CN116861262A (en) * 2023-09-04 2023-10-10 苏州浪潮智能科技有限公司 Perception model training method and device, electronic equipment and storage medium
CN116861262B (en) * 2023-09-04 2024-01-19 苏州浪潮智能科技有限公司 Perception model training method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN115393684B (en) 2023-01-24

Similar Documents

Publication Publication Date Title
CN115393684B (en) Anti-interference target detection method based on automatic driving scene multi-mode fusion
Han et al. Research on road environmental sense method of intelligent vehicle based on tracking check
WO2022206942A1 (en) Laser radar point cloud dynamic segmentation and fusion method based on driving safety risk field
CN113820714B (en) Dust fog weather road environment sensing system based on multi-sensor fusion
CN111369541A (en) Vehicle detection method for intelligent automobile under severe weather condition
CN113420607A (en) Multi-scale target detection and identification method for unmanned aerial vehicle
CN110599497A (en) Drivable region segmentation method based on deep neural network
Li et al. A feature pyramid fusion detection algorithm based on radar and camera sensor
CN116484971A (en) Automatic driving perception self-learning method and device for vehicle and electronic equipment
Jiang et al. Target detection algorithm based on MMW radar and camera fusion
CN115830265A (en) Automatic driving movement obstacle segmentation method based on laser radar
CN115876198A (en) Target detection and early warning method, device, system and medium based on data fusion
Cheng et al. Semantic segmentation of road profiles for efficient sensing in autonomous driving
CN114973199A (en) Rail transit train obstacle detection method based on convolutional neural network
CN110909656A (en) Pedestrian detection method and system with integration of radar and camera
Liu et al. A survey on autonomous driving datasets
CN117372991A (en) Automatic driving method and system based on multi-view multi-mode fusion
CN116486359A (en) All-weather-oriented intelligent vehicle environment sensing network self-adaptive selection method
Zhang et al. Smart-rain: A degradation evaluation dataset for autonomous driving in rain
CN116403186A (en) Automatic driving three-dimensional target detection method based on FPN Swin Transformer and Pointernet++
CN113611008B (en) Vehicle driving scene acquisition method, device, equipment and medium
CN115359067A (en) Continuous convolution network-based point-by-point fusion point cloud semantic segmentation method
CN112513876B (en) Road surface extraction method and device for map
CN115170467A (en) Traffic indication method and system based on multispectral pedestrian detection and vehicle speed detection
CN212990128U (en) Small target intelligent identification system based on remote video monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant