CN117690161A - Pedestrian detection method, device and medium based on image fusion

Pedestrian detection method, device and medium based on image fusion

Info

Publication number
CN117690161A
CN117690161A
Authority
CN
China
Prior art keywords
image
feature
visible light
thermal infrared
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311704548.XA
Other languages
Chinese (zh)
Other versions
CN117690161B (en)
Inventor
陈明轩
叶逸航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science filed Critical Shanghai University of Engineering Science
Priority to CN202311704548.XA priority Critical patent/CN117690161B/en
Publication of CN117690161A publication Critical patent/CN117690161A/en
Application granted granted Critical
Publication of CN117690161B publication Critical patent/CN117690161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a pedestrian detection method, device and medium based on image fusion, comprising the following steps: S1, acquiring a real-time visible light image and a thermal infrared image and preprocessing them; S2, performing multi-scale feature extraction several times on the preprocessed visible light image and the preprocessed thermal infrared image respectively, to generate a plurality of visible light feature maps and a plurality of thermal infrared feature maps; S3, performing weighted fusion on the visible light feature maps and the thermal infrared feature maps to obtain class activation maps; S4, inputting the class activation maps into a feature pyramid network for multi-scale feature fusion to generate fusion feature maps; and S5, executing detection tasks on the fusion feature maps and outputting a pedestrian detection result, the detection tasks comprising pedestrian prediction bounding box regression and pedestrian prediction bounding box object classification. Compared with the prior art, the invention improves the accuracy and real-time performance of pedestrian detection.

Description

Pedestrian detection method, device and medium based on image fusion
Technical Field
The invention belongs to the technical field of target detection, and particularly relates to a pedestrian detection method, device and medium based on image fusion.
Background
In industry, transport vehicles are usually driven manually, but the complexity of the workshop environment at night and possible driver error introduce uncertainty into driving safety, seriously threatening pedestrian safety and production efficiency. Object detection is one of the key tasks in computer vision, and pedestrian detection, an important branch of object detection, has developed remarkably; however, its results depend largely on the quality of the input image. Under complex or low illumination, an optical imaging sensor can hardly provide enough information to clearly outline the target contour, and traditional single-modality detection techniques struggle to obtain ideal imaging results, which directly affects the accuracy and reliability of pedestrian detection algorithms.
Against this background, multi-modal object detection techniques have emerged; they aim to obtain more comprehensive target information by combining data from multiple sensors. Existing multi-modal pedestrian detection methods typically use several backbone networks to extract feature maps from each input modality and then fuse these feature maps with an algorithm; the fusion stage allows the detection model to draw detailed information from each input and thus achieve better performance. For example, LEE et al. proposed a cascade fusion method that concatenates the feature maps of the two modalities to double the channel number and then uses an NiN layer to output the important features; however, doubling the channel number introduces redundant computation, increases complexity, degrades real-time performance and limits model deployment. KIM et al. proposed a weighted fusion method based on regions of interest, which selects the regions to fuse by judging the amount of features extracted from each region of interest; however, it sacrifices the features of unfused regions, lowering detection accuracy for small targets, ignores the benefits of inter-modality fusion, leading to insufficient fusion, and neglects global intra-modality information, losing fusion information. It is therefore necessary to design a pedestrian detection method that fully exploits the advantages of modality fusion and improves the accuracy and real-time performance of pedestrian detection.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a pedestrian detection method, device and medium based on image fusion, which improve the accuracy and real-time performance of pedestrian detection results.
The aim of the invention can be achieved by the following technical scheme:
a pedestrian detection method based on image fusion comprises the following steps:
s1, acquiring a real-time visible light image and a thermal infrared image and preprocessing the images;
S2, performing multi-scale feature extraction several times on the preprocessed visible light image and the preprocessed thermal infrared image respectively, to generate a plurality of visible light feature maps and a plurality of thermal infrared feature maps;
S3, performing weighted fusion on the visible light feature maps and the thermal infrared feature maps to obtain class activation maps;
S4, inputting the class activation maps into a feature pyramid network for multi-scale feature fusion to generate fusion feature maps;
and S5, executing detection tasks on the fusion feature maps and outputting a pedestrian detection result, wherein the detection tasks comprise pedestrian prediction bounding box regression and pedestrian prediction bounding box object classification.
Further, in step S1, the preprocessing process includes:
unifying the pixel size and format of each image data;
filtering noise reduction and image enhancement are performed.
Further, in step S2, the specific process of multi-scale feature extraction is as follows:
S201, acquiring the preprocessed visible light image and thermal infrared image, sampling alternate pixels in the horizontal and vertical directions, and generating a plurality of visible light image feature layers and a plurality of thermal infrared image feature layers;
S202, stacking all visible light image feature layers, stacking all thermal infrared image feature layers, respectively inputting them into a convolution network for feature extraction, and generating visible light feature maps and thermal infrared feature maps.
Further, in step S202, the convolution network includes a convolution layer, a spatial pyramid pooling layer SPP, and a residual block layer.
Further, the specific process of step S3 is as follows:
S301, performing an inner product operation on the visible light feature map F_V and the thermal infrared feature map F_T to obtain a first feature map F_1, and then performing a spatial attention operation to obtain a second feature map F_2;
S302, performing an addition operation on the visible light feature map F_V and the thermal infrared feature map F_T to obtain a third feature map F_3, and then performing a convolution operation to obtain a fourth feature map F_4;
S303, performing a channel self-attention operation on the second feature map F_2 and the fourth feature map F_4 to generate a class activation map F_CAM.
Further, in step S301, the procedure of the spatial attention operation is as follows:
performing maximum pooling and average pooling on the first feature map F_1 respectively, and then performing convolution operations respectively;
splicing the results of the convolution operations using an activation function to obtain the second feature map F_2.
Further, in step S303, the process of the channel self-attention operation is as follows:
performing an inner product operation on the second feature map F_2 and the fourth feature map F_4, and then performing maximum pooling and average pooling respectively;
weighting the maximum pooling and average pooling results respectively, and then splicing them through an activation function.
Further, the activation function is a Sigmoid activation function.
The invention also provides an electronic device comprising a memory, a processor and a program stored in the memory, the processor implementing the method according to any of claims 1-8 when executing the program.
The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method of any of claims 1-8.
Compared with the prior art, the invention has the following beneficial effects:
1. According to the invention, multi-scale feature extraction is performed on the preprocessed visible light image and the preprocessed thermal infrared image respectively to generate a plurality of visible light feature maps and thermal infrared feature maps, which are then weighted and fused to obtain class activation maps. This effectively highlights the pedestrian feature information in the visible light and thermal infrared feature maps without losing information, helps capture more details and features in the image data, and improves the accuracy of the pedestrian detection result.
2. According to the invention, multi-scale feature extraction is performed on the preprocessed visible light image and the preprocessed thermal infrared image respectively: alternate pixels are first sampled in the horizontal and vertical directions, the resulting feature layers are stacked, and the result is fed through a convolution network several times for feature extraction. The spatial pyramid pooling layer SPP used in the convolution network handles targets of different sizes better without significantly increasing the size of the network, which speeds up model training and further improves the real-time performance of pedestrian detection.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a pedestrian detection model structure based on image fusion;
FIG. 3 is a schematic diagram of a CAM activation module.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
Examples:
This embodiment first builds a pedestrian detection model based on image fusion as shown in Fig. 2, comprising a CSPDarknet module, a CAM activation module, a feature pyramid network and a detection head. The CSPDarknet module performs multi-scale feature extraction on the visible light image and the thermal infrared image respectively to obtain visible light and thermal infrared feature maps; the CAM activation module performs weighted fusion on the visible light and thermal infrared feature maps to obtain class activation maps; the feature pyramid network performs multi-scale feature fusion on the class activation maps to generate fusion feature maps; and the detection head executes detection tasks on the fusion feature maps and outputs the final pedestrian detection result.
Based on the pedestrian detection model based on image fusion, the embodiment provides a pedestrian detection method based on image fusion, as shown in fig. 1, comprising the following steps:
s1, acquiring a real-time visible light image and a thermal infrared image and preprocessing.
The pretreatment process comprises the following steps:
unifying the pixel size of each image to 320×240 pixels, with the format unified to json;
performing filtering noise reduction and image enhancement: common noise reduction methods include mean filtering, Gaussian filtering, median filtering and the like, and image enhancement methods include enhancement based on the Laplacian operator, enhancement based on logarithmic (Log) transformation, and the like.
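For reference only, one possible form of this preprocessing is sketched below, assuming OpenCV is available; the 320×240 target size follows this embodiment, while median filtering and Laplacian sharpening are merely one choice among the options listed above:

```python
import cv2
import numpy as np

def preprocess(img: np.ndarray) -> np.ndarray:
    """Resize, denoise, and enhance one frame (visible-light or thermal)."""
    img = cv2.resize(img, (320, 240))          # unify pixel size per this embodiment
    img = cv2.medianBlur(img, 3)               # one of the listed denoising options
    lap = cv2.Laplacian(img, cv2.CV_16S, ksize=3)
    sharpened = img.astype(np.int16) - lap     # Laplacian-based image enhancement
    return cv2.convertScaleAbs(sharpened)
```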
S2, the preprocessed visible light image and the preprocessed thermal infrared image are respectively input into the CSPDarknet module for multi-scale feature extraction to generate a plurality of visible light feature maps and a plurality of thermal infrared feature maps. The specific process is as follows:
S201, the preprocessed visible light image and thermal infrared image are acquired and input into Focus layers, which sample alternate pixels in the horizontal and vertical directions respectively to generate visible light image feature layers and thermal infrared image feature layers. Each input image is rearranged into four feature layers that are then stacked together, expanding the number of input channels fourfold: the stacked feature layers have 12 channels compared with the original 3-channel input;
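The slicing behaviour of the Focus layer can be illustrated with the following minimal PyTorch sketch; the class name and the test shapes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Rearrange (B, 3, H, W) into (B, 12, H/2, W/2) by alternate-pixel slicing."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        top_left     = x[..., ::2,  ::2]   # sample alternate pixels in both directions
        bottom_left  = x[..., 1::2, ::2]
        top_right    = x[..., ::2,  1::2]
        bottom_right = x[..., 1::2, 1::2]
        # stack the four feature layers along the channel axis: 3 -> 12 channels
        return torch.cat([top_left, bottom_left, top_right, bottom_right], dim=1)

x = torch.randn(1, 3, 240, 320)
print(Focus()(x).shape)  # torch.Size([1, 12, 120, 160])
```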
S202, the result of step S201 is passed through a convolution network three times for feature extraction. The convolution network comprises a convolution layer, a spatial pyramid pooling layer SPP and several residual block layers. In the convolution layer, the kernel size is 3×3, the stride is 2 and the padding is 1; the spatial pyramid pooling layer SPP has three levels with kernel sizes of 5×5, 7×7 and 9×9 respectively; each residual block layer applies a 1×1 convolution and a 3×3 convolution to the image. After three rounds of feature extraction, as shown in Fig. 2, first, second and third visible light feature maps and first, second and third thermal infrared feature maps are generated. The spatial pyramid pooling layer SPP handles targets of different sizes better without significantly increasing the size of the network, which speeds up model training and further improves the real-time performance of pedestrian detection.
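Under the hyper-parameters stated above (3×3 stride-2 convolution, three-level SPP with 5×5, 7×7 and 9×9 kernels, residual blocks of a 1×1 followed by a 3×3 convolution), one stage of such a convolution network might be sketched as follows; the channel widths and the SiLU activation are assumptions, and max pooling is used for the SPP levels in the usual YOLO style:

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    """Residual block: 1x1 conv followed by 3x3 conv, with a skip connection."""
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c // 2, 1), nn.SiLU(),
            nn.Conv2d(c // 2, c, 3, padding=1), nn.SiLU(),
        )
    def forward(self, x):
        return x + self.body(x)

class SPP(nn.Module):
    """Spatial pyramid pooling with 5x5, 7x7 and 9x9 kernels."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in (5, 7, 9))
        self.fuse = nn.Conv2d(c_in * 4, c_out, 1)
    def forward(self, x):
        return self.fuse(torch.cat([x] + [p(x) for p in self.pools], dim=1))

class ConvStage(nn.Module):
    """One pass of the network: 3x3 stride-2 conv, residual block, SPP."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.SiLU())
        self.res = Residual(c_out)
        self.spp = SPP(c_out, c_out)
    def forward(self, x):
        return self.spp(self.res(self.down(x)))
```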
S3, the visible light feature maps and the thermal infrared feature maps are input into the CAM activation module for weighted fusion to obtain the corresponding class activation maps.
The CAM activation module enhances the representation of the feature maps and integrates the feature information of the visible light and thermal modules, helping to capture more details and features while ensuring that intra-modality features are not lost for the sake of inter-modality complementarity. The structure of the CAM activation module is shown in Fig. 3. First, an inner product operation is performed on the visible light feature map F_V and the thermal infrared feature map F_T to obtain the first feature map F_1, and a spatial attention operation is then performed to obtain the second feature map F_2. An addition operation is performed on F_V and F_T to obtain the third feature map F_3, and a convolution operation is then performed to obtain the fourth feature map F_4. Finally, a channel self-attention operation is performed on the second feature map F_2 and the fourth feature map F_4 to generate the class activation map F_CAM. The whole process is expressed as:

F_CAM = CSA(SA(F_V ⊗ F_T), conv(F_V ⊕ F_T))
wherein CSA is a channel self-attention operation, SA is a spatial attention operation, and conv is a convolution operation.
The specific formula of the spatial attention operation SA is as follows:

F_2 = SA(F_1) = σ(f^(7×7)(F_avg(F_1)) + f^(7×7)(F_max(F_1)))

wherein F_avg is average pooling, F_max is maximum pooling, f^(7×7) is a convolution with a 7×7 kernel, and σ is the activation function used to splice the results of the convolution operations.
The specific formula of the channel self-attention operation CSA is as follows:

F_CAM = σ(W_1(W_0(F_avg(F_2 ⊗ F_4))) + W_1(W_0(F_max(F_2 ⊗ F_4))))

wherein F_avg is average pooling, F_max is maximum pooling, W_1 and W_0 are trainable weight matrices, and σ is the Sigmoid activation function.
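Putting the SA and CSA operations together, a sketch of the CAM activation module is given below. It assumes the inner product is an element-wise (Hadamard) product, that W_0 and W_1 form a shared two-layer bottleneck, and that the computed gates re-weight the fused features; these details are illustrative where the text leaves them open:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """SA: pool along channels, convolve each result, fuse through a Sigmoid gate."""
    def __init__(self, k=7):
        super().__init__()
        self.conv_avg = nn.Conv2d(1, 1, k, padding=k // 2)
        self.conv_max = nn.Conv2d(1, 1, k, padding=k // 2)
    def forward(self, f1):
        avg = f1.mean(dim=1, keepdim=True)       # average pooling over channels
        mx, _ = f1.max(dim=1, keepdim=True)      # maximum pooling over channels
        gate = torch.sigmoid(self.conv_avg(avg) + self.conv_max(mx))
        return f1 * gate                         # second feature map F_2

class ChannelSelfAttention(nn.Module):
    """CSA: global pooling, trainable weights W_0/W_1, Sigmoid fusion."""
    def __init__(self, c, r=16):
        super().__init__()
        self.w0 = nn.Linear(c, c // r)           # trainable weight matrix W_0
        self.w1 = nn.Linear(c // r, c)           # trainable weight matrix W_1
    def forward(self, f2, f4):
        x = f2 * f4                              # inner product of F_2 and F_4
        avg = x.mean(dim=(2, 3))
        mx = x.amax(dim=(2, 3))
        gate = torch.sigmoid(self.w1(torch.relu(self.w0(avg)))
                             + self.w1(torch.relu(self.w0(mx))))
        return x * gate[..., None, None]         # class activation map F_CAM

class CAM(nn.Module):
    """Weighted fusion of a visible-light and a thermal-infrared feature map."""
    def __init__(self, c):
        super().__init__()
        self.sa = SpatialAttention()
        self.conv = nn.Conv2d(c, c, 3, padding=1)
        self.csa = ChannelSelfAttention(c)
    def forward(self, f_v, f_t):
        f2 = self.sa(f_v * f_t)                  # SA(F_V ⊗ F_T)
        f4 = self.conv(f_v + f_t)                # conv(F_V ⊕ F_T)
        return self.csa(f2, f4)
```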
Through step S3, the first, second and third class activation maps are obtained. A class activation map is a visual heat map generated by a special convolutional neural network structure; it effectively highlights the pedestrian feature information in the visible light and thermal infrared feature maps without losing information, helps capture more details and features in the image data, and improves the accuracy of the pedestrian detection result.
S4, each class activation map is input into the feature pyramid network for multi-scale feature fusion to generate fusion feature maps.
The first, second and third class activation maps are input into the feature pyramid network (YOLOX PAFPN), which comprises four up-sampling and four down-sampling processes; three feature maps are output after the second, third and fourth down-sampling steps respectively. The sampling convolutions use a 3×3 kernel with a stride of 2 so as to fuse multi-scale features.
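A much-simplified illustration of such top-down/bottom-up fusion is sketched below; the actual YOLOX PAFPN performs four up-sampling and four down-sampling steps, whereas this sketch shows a single pass of each, with assumed (equal) channel counts:

```python
import torch.nn as nn
import torch.nn.functional as F

class MiniPAFPN(nn.Module):
    """Fuse three class activation maps across scales (heavily simplified)."""
    def __init__(self, c):  # assume all three inputs already have c channels
        super().__init__()
        self.reduce = nn.Conv2d(c, c, 1)
        self.down = nn.Conv2d(c, c, 3, stride=2, padding=1)  # 3x3, stride 2
    def forward(self, p3, p4, p5):  # p3: finest scale, p5: coarsest scale
        # top-down: upsample the deeper map and fuse it into the shallower one
        p4 = p4 + F.interpolate(self.reduce(p5), scale_factor=2, mode="nearest")
        p3 = p3 + F.interpolate(self.reduce(p4), scale_factor=2, mode="nearest")
        # bottom-up: downsample with stride-2 convolutions and fuse back
        n4 = p4 + self.down(p3)
        n5 = p5 + self.down(n4)
        return p3, n4, n5
```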
S5, detection tasks are executed on the fusion feature maps and the pedestrian detection result is output, the detection tasks comprising pedestrian prediction bounding box regression and pedestrian prediction bounding box object classification.
This embodiment selects the YOLOX Head detection head, which uses 1×1 convolutions to reduce feature maps with different channel numbers to a uniform channel number, helping to unify the dimensions of the feature maps. Two parallel branches, each containing two 3×3 convolution kernels, are then used to perform the different detection tasks: pedestrian prediction bounding box regression and pedestrian prediction bounding box object classification.
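An illustrative sketch of such a decoupled head follows; the intermediate width of 256 channels and the single pedestrian class are assumptions:

```python
import torch.nn as nn

class Head(nn.Module):
    """Decoupled head: 1x1 conv to a common width, then two parallel branches
    (each with two 3x3 convs) for box regression and object classification."""
    def __init__(self, c_in, c=256, num_classes=1):
        super().__init__()
        self.stem = nn.Conv2d(c_in, c, 1)            # unify the channel number
        def branch():
            return nn.Sequential(
                nn.Conv2d(c, c, 3, padding=1), nn.SiLU(),
                nn.Conv2d(c, c, 3, padding=1), nn.SiLU())
        self.reg_branch = branch()
        self.cls_branch = branch()
        self.reg_out = nn.Conv2d(c, 4, 1)            # bounding-box regression
        self.cls_out = nn.Conv2d(c, num_classes + 1, 1)  # class + objectness
    def forward(self, x):
        x = self.stem(x)
        return self.reg_out(self.reg_branch(x)), self.cls_out(self.cls_branch(x))
```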
In this embodiment, the OSU-CT visible light-thermal infrared dataset, whose annotation format is Cvml, is used to train the pedestrian detection model based on image fusion. First, data with missing annotations are screened out, leaving 4125 pairs of images; the pixel sizes of the visible light and thermal infrared images in the dataset are then uniformly adjusted to 320×240 pixels and the format is uniformly converted to json before being input into the model. After the model executes the detection tasks, the loss of each detection task is calculated separately, and the parameters of the pedestrian detection model based on image fusion are updated through a back propagation algorithm.
In a preferred embodiment, the loss of the pedestrian prediction bounding box regression task is calculated using the IOU loss function:

L_IOU = 1 − |box_gt ∩ box_pre| / |box_gt ∪ box_pre|

wherein box_gt and box_pre are the ground-truth box area and the predicted box area of target detection, respectively.
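A direct implementation of this loss for axis-aligned (x1, y1, x2, y2) boxes might look like the following sketch:

```python
import torch

def iou_loss(box_pre: torch.Tensor, box_gt: torch.Tensor) -> torch.Tensor:
    """IOU loss for (N, 4) boxes in (x1, y1, x2, y2) form: 1 - |inter| / |union|."""
    x1 = torch.max(box_pre[:, 0], box_gt[:, 0])
    y1 = torch.max(box_pre[:, 1], box_gt[:, 1])
    x2 = torch.min(box_pre[:, 2], box_gt[:, 2])
    y2 = torch.min(box_pre[:, 3], box_gt[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_pre = (box_pre[:, 2] - box_pre[:, 0]) * (box_pre[:, 3] - box_pre[:, 1])
    area_gt = (box_gt[:, 2] - box_gt[:, 0]) * (box_gt[:, 3] - box_gt[:, 1])
    union = area_pre + area_gt - inter
    return (1.0 - inter / union.clamp(min=1e-7)).mean()
```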
In a preferred embodiment, the loss of the pedestrian prediction bounding box object classification task is calculated using a cross entropy loss function with a loss weight of 1.0; the regression loss uses the IOU loss function with a loss weight of 5.0; the L1 loss function has a loss weight of 1.0; and the learning rate is 0.00001.
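Combining these terms with the stated weights, and reusing the iou_loss sketch above, the total training loss might be assembled as follows; the tensor shapes are assumptions:

```python
import torch.nn.functional as F

def total_loss(reg_pre, reg_gt, cls_pre, cls_gt, l1_pre, l1_gt):
    """Weighted sum with the stated weights: classification 1.0, IOU 5.0, L1 1.0."""
    return (1.0 * F.cross_entropy(cls_pre, cls_gt)   # (N, C) logits vs. (N,) labels
            + 5.0 * iou_loss(reg_pre, reg_gt)        # boxes as (N, 4) tensors
            + 1.0 * F.l1_loss(l1_pre, l1_gt))
```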
To verify the performance of the invention, this example conducted experiments on a public dataset and analyzed and compared several current mainstream pedestrian detection methods. The experiments were trained and tested according to the experimental specifications of the corresponding dataset, and the results are shown in Table 1.
In Table 1, Method 1 applies the Fast R-CNN method to the visible light single-modality dataset, Method 2 applies Fast R-CNN to the thermal infrared single-modality dataset, and Method 3 applies the YOLOX method to the multi-modality dataset. Comparing Methods 1 and 2 with the method of the invention shows that the invention supplements a single modality with additional information from the other modality, giving it detection capability in different challenging scenes; comparing Method 3 with the method of the invention shows that the CAM (Class Activation Map) activation module effectively highlights pedestrian feature information without losing information in either modality, thereby achieving a better detection effect.
TABLE 1 Experimental results on the OSU-CT dataset

                           AP50    AP75    mAP
Method 1                   75.4    36.2    37.8
Method 2                   66.3    21.2    30.6
Method 3                   84.2    36.3    41.8
Method of the invention    98.6    59.3    57.6
The foregoing description covers only the preferred embodiments of the invention and is not intended to limit the invention thereto. The invention also covers technical schemes formed by any combination of the above technical features.
The preceding description of the embodiments is provided to enable a person of ordinary skill in the art to make and use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles described herein may be applied to other embodiments without inventive effort. The invention is therefore not limited to the embodiments above; improvements and modifications made by those skilled in the art based on this disclosure without departing from the scope of the invention shall fall within the protection scope of the invention.

Claims (10)

1. The pedestrian detection method based on image fusion is characterized by comprising the following steps of:
s1, acquiring a real-time visible light image and a thermal infrared image and preprocessing the images;
S2, performing multi-scale feature extraction several times on the preprocessed visible light image and the preprocessed thermal infrared image respectively, to generate a plurality of visible light feature maps and a plurality of thermal infrared feature maps;
S3, performing weighted fusion on the visible light feature maps and the thermal infrared feature maps to obtain class activation maps;
S4, inputting the class activation maps into a feature pyramid network for multi-scale feature fusion to generate fusion feature maps;
and S5, executing detection tasks on the fusion feature maps and outputting a pedestrian detection result, wherein the detection tasks comprise pedestrian prediction bounding box regression and pedestrian prediction bounding box object classification.
2. The pedestrian detection method based on image fusion according to claim 1, wherein in step S1, the preprocessing process includes:
unifying the pixel size and format of each image data;
filtering noise reduction and image enhancement are performed.
3. The pedestrian detection method based on image fusion according to claim 1, wherein in step S2, the specific process of multi-scale feature extraction is as follows:
S201, acquiring the preprocessed visible light image and thermal infrared image, sampling alternate pixels in the horizontal and vertical directions, and generating a plurality of visible light image feature layers and a plurality of thermal infrared image feature layers;
S202, stacking all visible light image feature layers, stacking all thermal infrared image feature layers, respectively inputting them into a convolution network for feature extraction, and generating visible light feature maps and thermal infrared feature maps.
4. A pedestrian detection method based on image fusion according to claim 3, wherein in step S202, the convolution network comprises a convolution layer, a spatial pyramid pooling layer SPP, and a residual block layer.
5. The pedestrian detection method based on image fusion according to claim 1, wherein the specific process of step S3 is as follows:
S301, performing an inner product operation on the visible light feature map F_V and the thermal infrared feature map F_T to obtain a first feature map F_1, and then performing a spatial attention operation to obtain a second feature map F_2;
S302, performing an addition operation on the visible light feature map F_V and the thermal infrared feature map F_T to obtain a third feature map F_3, and then performing a convolution operation to obtain a fourth feature map F_4;
S303, performing a channel self-attention operation on the second feature map F_2 and the fourth feature map F_4 to generate a class activation map F_CAM.
6. The pedestrian detection method based on image fusion according to claim 5, wherein in step S301, the spatial attention operation is performed as follows:
performing maximum pooling and average pooling on the first feature map F_1 respectively, and then performing convolution operations respectively;
splicing the results of the convolution operations using an activation function to obtain the second feature map F_2.
7. The pedestrian detection method based on image fusion according to claim 5, wherein in step S303, the process of the channel self-attention operation is as follows:
performing an inner product operation on the second feature map F_2 and the fourth feature map F_4, and then performing maximum pooling and average pooling respectively;
weighting the maximum pooling and average pooling results respectively, and then splicing them through an activation function.
8. The pedestrian detection method based on image fusion of claim 7, wherein the activation function is a Sigmoid activation function.
9. An electronic device comprising a memory, a processor, and a program stored in the memory, wherein the processor implements the method of any of claims 1-8 when executing the program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-8.
CN202311704548.XA 2023-12-12 2023-12-12 Pedestrian detection method, device and medium based on image fusion Active CN117690161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311704548.XA CN117690161B (en) 2023-12-12 2023-12-12 Pedestrian detection method, device and medium based on image fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311704548.XA CN117690161B (en) 2023-12-12 2023-12-12 Pedestrian detection method, device and medium based on image fusion

Publications (2)

Publication Number Publication Date
CN117690161A true CN117690161A (en) 2024-03-12
CN117690161B CN117690161B (en) 2024-06-04

Family

ID=90125990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311704548.XA Active CN117690161B (en) 2023-12-12 2023-12-12 Pedestrian detection method, device and medium based on image fusion

Country Status (1)

Country Link
CN (1) CN117690161B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101908481B1 (en) * 2017-07-24 2018-12-10 동국대학교 산학협력단 Device and method for pedestraian detection
CN113139542A (en) * 2021-04-28 2021-07-20 北京百度网讯科技有限公司 Target detection method, device, equipment and computer readable storage medium
CN116452937A (en) * 2023-04-25 2023-07-18 重庆邮电大学 Multi-mode characteristic target detection method based on dynamic convolution and attention mechanism
CN116580425A (en) * 2023-05-12 2023-08-11 浙江工业大学 Multispectral pedestrian detection method based on cross-transducer fusion
CN116645696A (en) * 2023-05-31 2023-08-25 长春理工大学重庆研究院 Contour information guiding feature detection method for multi-mode pedestrian detection
CN117132759A (en) * 2023-08-02 2023-11-28 上海无线电设备研究所 Saliency target detection method based on multiband visual image perception and fusion

Also Published As

Publication number Publication date
CN117690161B (en) 2024-06-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant