CN117934820B - Infrared target identification method based on difficult sample enhancement loss - Google Patents

Infrared target identification method based on difficult sample enhancement loss Download PDF

Info

Publication number
CN117934820B
CN117934820B CN202410332193.4A CN202410332193A CN117934820B CN 117934820 B CN117934820 B CN 117934820B CN 202410332193 A CN202410332193 A CN 202410332193A CN 117934820 B CN117934820 B CN 117934820B
Authority
CN
China
Prior art keywords
attention
convolution
input
enhancement
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410332193.4A
Other languages
Chinese (zh)
Other versions
CN117934820A (en
Inventor
徐从安
吴俊峰
高龙
孙显
史骏
孙炜玮
周伟
宿南
艾加秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Naval Aeronautical University
Original Assignee
Naval Aeronautical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Naval Aeronautical University filed Critical Naval Aeronautical University
Priority to CN202410332193.4A priority Critical patent/CN117934820B/en
Publication of CN117934820A publication Critical patent/CN117934820A/en
Application granted granted Critical
Publication of CN117934820B publication Critical patent/CN117934820B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an infrared target identification method based on difficult sample enhancement loss, and belongs to the field of data identification. The infrared target recognition detection model adopted by the method comprises a backbone network, a neck and a head, wherein the backbone network is based on ResNet structure and comprises a convolution layer and four stage parts which are connected in sequence; the infrared image to be identified is used as an input image of a backbone network to be input into a convolution layer for extracting image characteristics; the backbone network also comprises three attention enhancement modules which are correspondingly and parallelly arranged with the last three stage parts respectively; the neck includes a feature pyramid network and the head includes a region suggestion network and a result prediction network. The invention introduces a attention enhancement module and a sample enhancement loss function based on infrared image physical characteristics, and obviously improves the recognition and detection performance of infrared targets.

Description

Infrared target identification method based on difficult sample enhancement loss
Technical Field
The invention belongs to the field of data identification, and relates to an infrared target identification method.
Background
Infrared images have a prominent advantage in a severe illumination environment compared to visible light images (RGB images). Infrared target detection plays a very important role in many applications such as early warning systems and marine monitoring systems.
Infrared images tend to be lower in resolution than visible light images and lack color information of objects. In addition, there are often strong background clutter, noise, low contrast, etc. problems in infrared images. This can result in some low difficulty samples in the visible image being difficult to detect in the infrared image, such as a red football on green grass. The above problems make infrared target detection a difficult task.
In recent years, a deep learning-based method is also used for infrared target detection recognition. In the deep learning-based method, the overall model can be largely divided into a feature extraction part and a result prediction part. For example, RISTDnet improves the feature extraction capability of the infrared image target by integrating the multi-scale convolution result; THERMALDET is used for infrared target detection, and competition performance is obtained, and the structure can adaptively re-weight each channel of the features after fusing the features of different layers; the Deep-IR Target establishes an information path between each feature channel through the self-care network, and further extracts spatial information of the image. However, these methods mainly focus on information loss of the infrared image during feature extraction, and do not take into account the context information between the target region and the surrounding background region in the image, resulting in poor recognition and detection performance of the low-contrast target.
On the other hand, due to the wide application of optical images, the object detection algorithm has been developed faster in the optical image field than in the infrared image field. The widely applied YOLO series algorithm designs a large number of residual errors and branch structures in a main network, and improves the characteristic extraction capability of the network. Although the YOLO series algorithm can be applied to infrared images, the method also does not improve the attention degree of useful characteristic information, and cannot improve the recognition and detection performance of low-contrast targets.
Disclosure of Invention
The invention provides an infrared target identification method based on difficult sample enhancement loss, which aims at: the problem of low-contrast infrared target identification detection performance is poor is solved.
The technical scheme of the invention is as follows:
an infrared target recognition method based on difficult sample enhancement loss adopts an infrared target recognition detection model comprising a backbone network, a neck and a head;
The backbone network is based on ResNet structure and comprises a convolution layer and four stage parts which are connected in sequence; the infrared image to be identified is used as an input image of a backbone network to be input into a convolution layer for extracting image characteristics; the backbone network also comprises three attention enhancement modules which are correspondingly and parallelly arranged with the last three stage parts respectively;
The neck comprises a feature pyramid network, and the head comprises a region suggestion network and a result prediction network; the feature pyramid network fuses the image feature information extracted by the backbone network and then inputs the image feature information into the regional suggestion network to obtain a region of interest; the result prediction network predicts the location and class of the target in combination with the region of interest and the image characteristic information.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: each stage part comprises a plurality of convolution blocks; the structure of the convolution block uses the structure in Res2Net, and carries out convolution operation on input characteristics, then divides the obtained characteristic tensor into four groups along a channel, and then sequentially carries out addition and convolution operation.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: the method comprises the steps that among four stage parts, a first stage part obtains a group of characteristic tensors based on image characteristics output by a convolution layer, and then the group of characteristic tensors are input into a second stage part;
For each group of corresponding attention enhancement modules and phase sections: the input of the attention enhancement module is the same as the input characteristic tensor of the corresponding stage part, and the output of the attention enhancement module is multiplied by the output of the stage part to be used as the output of the group of the corresponding attention enhancement module and the stage part;
The outputs of the first group of attention enhancement modules and the phase section are the input feature tensors of the second group of attention enhancement modules and the phase section, and the outputs of the second group of attention enhancement modules and the phase section are the input feature tensors of the third group of attention enhancement modules and the phase section; the outputs of the first, second and third sets of attention enhancement modules and the phase section are output as image characteristic information of the backbone network.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: the attention enhancing module includes a first portion, a second portion, and a third portion connected in sequence.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: in the first part, the attention enhancement module respectively carries out maximum pooling and average pooling on the input characteristic tensor along the channel direction, then carries out splicing convolution on the maximum pooling result and the average pooling result, and takes the splicing convolution result as the input of the second part;
The calculation process of the first part is as follows:
Wherein, Input feature tensor for attention enhancing module,/>For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>Representing a maximum pooling operation along the channel direction,/>Representing an average pooling operation along the channel direction,/>The convolution results are stitched for the first portion.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: in the second part, obtaining a weight matrix through the overall attention structure;
Specifically, the overall attention structure includes The mutual attention structure is calculated as follows:
Wherein, For convolution kernel size/>Performing splicing convolution operation with step length of 1 and filling value of 0; /(I)For/>Collections composed of features obtained by multiscale pooling,/>,/>Represents the/>Calculating a mutual attention structure; /(I)And (3) a weight matrix obtained for the second part.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: the mutual attention structureThe calculation mode of (a) is as follows:
Wherein, The representation means that the kernel size after splicing two variables is/>Convolution operation with step size 1 and fill value 1,/>Representation means that the kernel size after splicing the two variables together is/>Convolution operation with step size of 1 and padding value of 7, the width and height of input and output of convolution are the same; /(I)、/>And/>Respectively refer to/>First/>, abstracted by convolution operationA query, a key, and a response value.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: in the third part, an input feature tensor is usedAnd the element product of the final weight matrix as output:
Wherein, For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>For the output of the first part,/>For the output of the second part,/>For the final weight matrix,/>Is the output of the attention enhancing module.
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss: training the infrared target identification detection model by adopting a sample enhancement loss function; the sample enhancing loss functionIncluding regression loss/>And classification loss/>
As a further improvement of the method for infrared target identification based on difficult sample enhancement loss:
Regression loss The calculation mode of (a) is as follows:
First, the contrast of the object is calculated:
Wherein, Representing a target area manually marked in an input image before training,/>Which represents the contrast of the object,Representing the background areas above, below, to the left and to the right of the target area respectively,Set of background regions representing four directions,/>Representing pixel values in the target area as/>Probability of occurrence of pixel,/>Representing background area/>The middle pixel value is/>Probability of occurrence of pixels of (2);
then calculate additional enhancement coefficients
Wherein,、/>、/>Is a super parameter;
Then for the first Sample regression loss of individual training samples/>The calculation mode of (a) is as follows:
Wherein, The intersection ratio between the target area in the training sample predicted for the infrared target recognition detection model and the artificially marked target area,/>And/>The predicted target area width and the actual target area width in the training sample,/>, respectivelyAnd/>The predicted target area height and the actual target area height in the training sample are respectively;
then calculate regression loss Wherein/>Training the total number of samples;
Classification loss The calculation mode of (a) is as follows:
Wherein, For the total number of target categories,/>Is a sign function if/>The training samples belong to the/>The number of categories is 1, otherwise 0,/>Is the predicted/>, of the infrared target recognition detection modelTargets in the individual training samples belong to the/>Probability of individual categories.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention uses the attention enhancement module in the last three stages of the attention backbone network, enhances the feature points of the interested region and the target while integrating the image context information, and improves the attention degree of the network to the useful information, thereby improving the feature extraction capability of the low-contrast target in the infrared image and the recognition and detection performance of the low-contrast infrared target.
2. The invention provides a sample enhancement loss function based on infrared image physical characteristics, the difficulty of a sample is evaluated by using the proposed contrast measurement, and loss values of a low-contrast target with high difficulty and other targets are weighted respectively so as to balance the counter-propagation gradient of the low-contrast sample and other samples in a network, thereby further improving the overall performance.
Drawings
FIG. 1 is a schematic diagram of an infrared target recognition detection model in the present invention;
FIG. 2 is a schematic diagram of a phase portion of a hot residual backbone network;
FIG. 3 is a schematic diagram of the structure of the attention enhancing module;
FIG. 4 is a schematic diagram of the overall attention structure in the second portion of the attention enhancement module, with 3*3 branches in the diagram being linear mappings based on different scale convolution kernels for the input features;
FIG. 5 is a schematic diagram of a mutual attention structure in the overall attention structure, the mutual attention structure performing mutual attention operations on a plurality of features input to the module, the schematic diagram Representation of/>Performing a mutual attention operation on the inputs, specifically, performing an attention operation on the query vector (Q) and the value vector (V) of the branch and the query vector (K) of the other branch;
Fig. 6 is a schematic diagram of 4 divisions of a target area and a background area, with a center square portion being the target area and a black portion sideways being the selected background area.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention.
An infrared target recognition method based on difficult sample enhancement loss adopts an infrared target recognition detection model comprising a backbone network, a neck and a head.
As in fig. 1, the backbone network is based on ResNet architecture, comprising a convolutional layer and four phase sections connected in sequence.
The infrared image to be identified is input to the convolution layer as an input image of the backbone network for extracting image features.
Each of the phase portions includes a plurality of convolution blocks, respectively. Moreover, the number of convolutions of each phase section is not necessarily the same. M in fig. 2 represents the number of convolutions blocks stacked in a certain stage section, i.e. the depth of the network. In this embodiment, the structure of the convolution block uses the structure in Res2Net, which performs convolution operation on the input features, then divides the obtained feature tensors into four groups along the channel, and then sequentially performs addition and convolution operation.
The first stage part obtains a set of feature tensors based on the image features output by the convolution layer and then inputs the feature tensors to the second stage part.
In order to supplement the characteristic information of the target in the infrared image by integrating the image context characteristics, the backbone network further comprises three attention enhancement modules which are respectively and correspondingly arranged in parallel with the last three stage parts. For each set of attention enhancement modules and phase sections: the input of the attention enhancement module is identical to the input characteristic tensor of the corresponding stage part, and the output of the attention enhancement module is multiplied by the output of the stage part to be used as the output of the group of the corresponding attention enhancement module and the stage part. The outputs of the first set of attention enhancement modules and phase sections are the input feature tensors of the second set of attention enhancement modules and phase sections, and the outputs of the second set of attention enhancement modules and phase sections are the input feature tensors of the third set of attention enhancement modules and phase sections. The outputs of the first, second and third sets of attention enhancement modules and the phase section are output as image characteristic information of the backbone network.
Further, as shown in fig. 3, the attention enhancing module includes a first portion, a second portion, and a third portion.
In order for the attention enhancement module to focus on spatial information of image features, the present invention employs a joint expression of two different pooling operations on the tensor of input features as input to the proposed overall attention structure. Finally, the output is mapped between 0 and 1 as a spatial weight matrix of the input feature tensor.
In the first part, the attention enhancement module respectively carries out maximum pooling and average pooling on the input characteristic tensor along the channel direction, then carries out splicing convolution on the maximum pooling result and the average pooling result, and then takes the splicing convolution result as the input of the second part.
The calculation process of the first part is as follows:
Wherein, Input feature tensor for attention enhancing module,/>For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>Representing a maximum pooling operation along the channel direction,/>Representing an average pooling operation along the channel direction,/>The convolution results are stitched for the first portion.
In the second part, a weight matrix is obtained by the global attention structure.
In general, the relevance of a local area in an image is inversely proportional to distance, so that the characteristic information of an object is only relevant to the context of a limited range of local areas. The invention designs a new general attention structure in the backbone network. As shown in fig. 4, the overall attention structure calculates the cross-attention of the feature response at different scales for each local region in the feature space response map, while establishing an information path between the different scale local regions for each location of the image.
Specifically, the overall attention structure includesThe mutual attention structure is calculated as follows:
Wherein, For convolution kernel size/>And performing splicing convolution operation with the step length of 1 and the filling value of 0. /(I)For/>Collections composed of features obtained by multiscale pooling,/>. The above calculation of attention to each input feature is a fusion of the calculation results of a plurality of mutual attention structures having different parameters in order to pay attention to information in different subspaces.
Further, as shown in FIG. 5, a mutual attention structureThe calculation mode of (a) is as follows:
Wherein, The representation means that the kernel size after splicing two variables is/>Convolution operation with step size 1 and fill value 1,/>Representation means that the kernel size after splicing the two variables together is/>The convolution operation with step size 1 and fill value 7, the width and height of the convolved input and output are the same. /(I)、/>And/>Respectively refer to/>First/>, abstracted by convolution operationA query, a key, and a response value.
And (3) a weight matrix obtained for the second part.
In the third part, as in fig. 3, an input feature tensor is usedAnd the element product of the final weight matrix as output:
Wherein, For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>For the output of the first part,/>For the output of the second part,/>For the final weight matrix,/>Is the output of the attention enhancing module.
The neck includes a Feature Pyramid Network (FPN). The header includes a regional recommendation network (RPN) and a result prediction network.
The Feature Pyramid Network (FPN) fuses the image feature information extracted by the backbone network, and then inputs the image feature information into the region suggestion network (RPN) to obtain a region of interest (ROI). Finally, the result prediction network combines the region of interest (ROI) and the image feature information to predict the location and class of the object.
Further, for the infrared target recognition detection model, aiming at unbalance between a low-contrast target and other targets, training is performed based on the proposed sample enhancement loss function, and the detection performance of the low-contrast target is improved by optimizing the weight of the low-contrast target and other target loss function values.
The sample enhancing loss functionIncluding regression loss/>And classification loss/>
Wherein regression lossBuild on the basis of CIoU loss functions. Compared with the traditional IoU loss function, CIoU reflects the regression effect of the detection frame more accurately by considering the overlapping area, the relative distance and the aspect ratio of the prediction box and the real box. The sample enhancement loss function optimizes the target loss function weight according to the contrast on the basis of CIoU so as to balance the learning effect of the low-contrast sample and other samples.
To evaluate the difficulty of a sample, the method designs a metric to represent the difference between the target region and the background region. First, the contrast is calculated using the target region and a background region of a certain scale around it. Next, as shown in fig. 6, four different background areas may be selected in four directions around the target area. And finally, calculating the gray value distribution difference between the target area and the background area in each direction, and taking the information entropy as a mathematical model for calculating the local gray value and the distribution dispersion. At the same time, the gray value of the region is multiplied by its corresponding information entropy, so that our measurement can take into account the center of the distribution.
If the extent of flooding of the target by the background areas in each direction is different, the difference between the gray value distribution of the target area and the background areas in each direction is small. Based on this theory, the maximum difference between the distribution of the target region and the background region in each direction is used as the contrast value of the target for evaluating the difficulty thereof.
The contrast of the target is calculated by the following steps:
Wherein, Representing a target area manually marked in an input image before training,/>Which represents the contrast of the object,Representing the background areas above, below, to the left and to the right of the target area respectively,Set of background regions representing four directions,/>Representing pixel values in the target area as/>Probability of occurrence of pixel,/>Representing background area/>The middle pixel value is/>Is used for the probability of occurrence of a pixel of (c).
The method applies a gaussian function as a basic mathematical model of a mapping function of sample enhancement coefficients. In order to improve the learning effect of the difficult sample and accelerate the convergence speed of the simple sample, the sum of the mapping of the contrast coefficient and the super-parameters is adopted as the final additional enhancement coefficient.
The target-based contrast ratio mapping calculation process is as follows:
Wherein, Is the final additional enhancement coefficient involved in the calculation in the loss function,/>、/>、/>The default values of the super parameters are 0.75,100 and 0.1.
In the above formula, the square of the contrast of the target is used as the natural baseAnd multiplying it with a linear coefficient, and then adding a hyper-parameter/>, to the resultAdditional enhancement coefficients as a function of loss.
For the firstSample regression loss of individual training samples/>The calculation mode of (a) is as follows:
Wherein, The intersection ratio between the target area in the training sample predicted for the infrared target recognition detection model and the artificially marked target area,/>And/>The predicted target area width and the actual target area width in the training sample,/>, respectivelyAnd/>The predicted target area and the actual target area in the training sample are high, respectively.
Then calculate regression lossWherein/>For the total number of training samples.
Classification lossThe calculation mode of (a) is as follows:
Wherein, For the total number of target categories,/>Is a sign function if/>The training samples belong to the/>The number of categories is 1, otherwise 0,/>Is the predicted/>, of the infrared target recognition detection modelTargets in the individual training samples belong to the/>Probability of individual categories.
It should be noted that it will be apparent to those skilled in the art that the present invention is not limited to the details of the above-described exemplary embodiments, but may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The scope of the invention is indicated by the appended claims rather than by the foregoing description.

Claims (2)

1. The infrared target recognition method based on difficult sample enhancement loss adopts an infrared target recognition detection model comprising a backbone network, a neck and a head, and is characterized in that:
The backbone network is based on ResNet structure and comprises a convolution layer and four stage parts which are connected in sequence; the infrared image to be identified is used as an input image of a backbone network to be input into a convolution layer for extracting image characteristics; the backbone network also comprises three attention enhancement modules which are correspondingly and parallelly arranged with the last three stage parts respectively;
The neck comprises a feature pyramid network, and the head comprises a region suggestion network and a result prediction network; the feature pyramid network fuses the image feature information extracted by the backbone network and then inputs the image feature information into the regional suggestion network to obtain a region of interest; the result prediction network predicts the position and the category of the target by combining the region of interest and the image characteristic information;
the method comprises the steps that among four stage parts, a first stage part obtains a group of characteristic tensors based on image characteristics output by a convolution layer, and then the group of characteristic tensors are input into a second stage part;
For each group of corresponding attention enhancement modules and phase sections: the input of the attention enhancement module is the same as the input characteristic tensor of the corresponding stage part, and the output of the attention enhancement module is multiplied by the output of the stage part to be used as the output of the group of the corresponding attention enhancement module and the stage part;
The outputs of the first group of attention enhancement modules and the phase section are the input feature tensors of the second group of attention enhancement modules and the phase section, and the outputs of the second group of attention enhancement modules and the phase section are the input feature tensors of the third group of attention enhancement modules and the phase section; the outputs of the first group, the second group and the third group of attention enhancement modules and the stage part are used as image characteristic information output by a backbone network;
The attention enhancing module comprises a first part, a second part and a third part which are sequentially connected;
in the first part, the attention enhancement module respectively carries out maximum pooling and average pooling on the input characteristic tensor along the channel direction, then carries out splicing convolution on the maximum pooling result and the average pooling result, and takes the splicing convolution result as the input of the second part;
The calculation process of the first part is as follows:
Wherein, Input feature tensor for attention enhancing module,/>For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>Representing a maximum pooling operation along the channel direction,/>Representing an average pooling operation along the channel direction,/>Splicing convolution results for the first part;
in the second part, obtaining a weight matrix through the overall attention structure;
Specifically, the overall attention structure includes The mutual attention structure is calculated as follows:
Wherein, For convolution kernel size/>Performing splicing convolution operation with step length of 1 and filling value of 0; /(I)For/>Collections composed of features obtained by multiscale pooling,/>,/>Represents the/>Calculating a mutual attention structure;
a weight matrix obtained for the second part;
The mutual attention structure The calculation mode of (a) is as follows:
Wherein, The representation means that the kernel size after splicing two variables is/>Convolution operation with step size 1 and fill value 1,/>The representation means that the core size after splicing the two variables isConvolution operation with step size of 1 and padding value of 7, the width and height of input and output of convolution are the same; /(I)And/>Respectively refer to/>First/>, abstracted by convolution operationIndividual query, key, and response values;
in the third part, an input feature tensor is used And the element product of the final weight matrix as output:
Wherein, For convolution kernel size/>Splicing convolution operation with step length of 1 and filling value of 0,/>For the output of the first part,/>For the output of the second part,/>For the final weight matrix,/>An output that is an attention enhancing module;
Training the infrared target identification detection model by adopting a sample enhancement loss function; the sample enhancing loss function Including regression loss/>And classification loss/>
Regression lossThe calculation mode of (a) is as follows:
First, the contrast of the object is calculated:
Wherein, Representing a target area manually marked in an input image before training,/>Which represents the contrast of the object,Representing background regions above, below, to the left and to the right, respectively, of the target region,/>Set of background regions representing four directions,/>Representing pixel values in the target area as/>Probability of occurrence of pixel,/>Representing background area/>The middle pixel value is/>Probability of occurrence of pixels of (2);
then calculate additional enhancement coefficients
Wherein,、/>、/>Is a super parameter;
Then for the first Sample regression loss of individual training samples/>The calculation mode of (a) is as follows:
Wherein, The intersection ratio between the target area in the training sample predicted for the infrared target recognition detection model and the artificially marked target area,/>And/>The predicted target area width and the actual target area width in the training sample,/>, respectivelyAnd/>The predicted target area height and the actual target area height in the training sample are respectively;
then calculate regression loss Wherein/>Training the total number of samples;
Classification loss The calculation mode of (a) is as follows:
Wherein, For the total number of target categories,/>Is a sign function if/>The training samples belong to the/>The number of categories is 1, otherwise 0,/>Is the predicted/>, of the infrared target recognition detection modelTargets in the individual training samples belong to the/>Probability of individual categories.
2. The method for infrared target identification based on difficult sample enhancement loss of claim 1, wherein: each stage part comprises a plurality of convolution blocks; the structure of the convolution block uses the structure in Res2Net, and carries out convolution operation on input characteristics, then divides the obtained characteristic tensor into four groups along a channel, and then sequentially carries out addition and convolution operation.
CN202410332193.4A 2024-03-22 2024-03-22 Infrared target identification method based on difficult sample enhancement loss Active CN117934820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410332193.4A CN117934820B (en) 2024-03-22 2024-03-22 Infrared target identification method based on difficult sample enhancement loss

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410332193.4A CN117934820B (en) 2024-03-22 2024-03-22 Infrared target identification method based on difficult sample enhancement loss

Publications (2)

Publication Number Publication Date
CN117934820A CN117934820A (en) 2024-04-26
CN117934820B true CN117934820B (en) 2024-06-14

Family

ID=90752304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410332193.4A Active CN117934820B (en) 2024-03-22 2024-03-22 Infrared target identification method based on difficult sample enhancement loss

Country Status (1)

Country Link
CN (1) CN117934820B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674247A (en) * 2021-08-23 2021-11-19 河北工业大学 X-ray weld defect detection method based on convolutional neural network
CN114863097A (en) * 2022-04-06 2022-08-05 北京航空航天大学 Infrared dim target detection method based on attention system convolutional neural network
CN115861772A (en) * 2023-02-22 2023-03-28 杭州电子科技大学 Multi-scale single-stage target detection method based on RetinaNet

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619352A (en) * 2019-08-22 2019-12-27 杭州电子科技大学 Typical infrared target classification method based on deep convolutional neural network
CN111259930B (en) * 2020-01-09 2023-04-25 南京信息工程大学 General target detection method of self-adaptive attention guidance mechanism
CN111428718B (en) * 2020-03-30 2023-05-09 南京大学 Natural scene text recognition method based on image enhancement
CN111783772A (en) * 2020-06-12 2020-10-16 青岛理工大学 Grabbing detection method based on RP-ResNet network
WO2022026603A1 (en) * 2020-07-29 2022-02-03 Magic Leap, Inc. Object recognition neural network training using multiple data sources
CN112906623A (en) * 2021-03-11 2021-06-04 同济大学 Reverse attention model based on multi-scale depth supervision
CN113420660B (en) * 2021-06-23 2023-05-26 西安电子科技大学 Infrared image target detection model construction method, prediction method and system
CN113850129A (en) * 2021-08-21 2021-12-28 南京理工大学 Target detection method for rotary equal-variation space local attention remote sensing image
CN114821018B (en) * 2022-04-11 2024-05-31 北京航空航天大学 Infrared dim target detection method for constructing convolutional neural network by utilizing multidirectional characteristics
CN115082672A (en) * 2022-06-06 2022-09-20 西安电子科技大学 Infrared image target detection method based on bounding box regression
CN116109947A (en) * 2022-09-02 2023-05-12 北京航空航天大学 Unmanned aerial vehicle image target detection method based on large-kernel equivalent convolution attention mechanism
CN116071676A (en) * 2022-12-02 2023-05-05 华东理工大学 Infrared small target detection method based on attention-directed pyramid fusion
CN116823686B (en) * 2023-04-28 2024-03-08 长春理工大学重庆研究院 Night infrared and visible light image fusion method based on image enhancement
CN116758340A (en) * 2023-05-31 2023-09-15 王蒙 Small target detection method based on super-resolution feature pyramid and attention mechanism
CN116863305A (en) * 2023-07-13 2023-10-10 天津大学 Infrared dim target detection method based on space-time feature fusion network
CN117496567A (en) * 2023-08-16 2024-02-02 沈阳工业大学 Facial expression recognition method and system based on feature enhancement
CN117197676A (en) * 2023-10-18 2023-12-08 中国人民解放军海军航空大学 Target detection and identification method based on feature fusion
CN117710841A (en) * 2023-12-14 2024-03-15 南京信息工程大学 Small target detection method and device for aerial image of unmanned aerial vehicle

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674247A (en) * 2021-08-23 2021-11-19 河北工业大学 X-ray weld defect detection method based on convolutional neural network
CN114863097A (en) * 2022-04-06 2022-08-05 北京航空航天大学 Infrared dim target detection method based on attention system convolutional neural network
CN115861772A (en) * 2023-02-22 2023-03-28 杭州电子科技大学 Multi-scale single-stage target detection method based on RetinaNet

Also Published As

Publication number Publication date
CN117934820A (en) 2024-04-26

Similar Documents

Publication Publication Date Title
CN111047554B (en) Composite insulator overheating defect detection method based on instance segmentation
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN112884064B (en) Target detection and identification method based on neural network
CN109840556B (en) Image classification and identification method based on twin network
CN113807464B (en) Unmanned aerial vehicle aerial image target detection method based on improved YOLO V5
CN110032925B (en) Gesture image segmentation and recognition method based on improved capsule network and algorithm
Jiang et al. Focal-test-based spatial decision tree learning
CN112926652B (en) Fish fine granularity image recognition method based on deep learning
CN110399820B (en) Visual recognition analysis method for roadside scene of highway
CN111652240B (en) CNN-based image local feature detection and description method
CN116645592B (en) Crack detection method based on image processing and storage medium
Xin et al. Image recognition of crop diseases and insect pests based on deep learning
CN115830004A (en) Surface defect detection method, device, computer equipment and storage medium
CN114998566A (en) Interpretable multi-scale infrared small and weak target detection network design method
CN112766102A (en) Unsupervised hyperspectral video target tracking method based on space-spectrum feature fusion
CN113205502A (en) Insulator defect detection method and system based on deep learning
CN115410081A (en) Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium
CN116452966A (en) Target detection method, device and equipment for underwater image and storage medium
CN111553337A (en) Hyperspectral multi-target detection method based on improved anchor frame
CN113887649B (en) Target detection method based on fusion of deep layer features and shallow layer features
CN112508863B (en) Target detection method based on RGB image and MSR image double channels
CN113971764A (en) Remote sensing image small target detection method based on improved YOLOv3
CN114049503A (en) Saliency region detection method based on non-end-to-end deep learning network
CN117934820B (en) Infrared target identification method based on difficult sample enhancement loss
CN116704309A (en) Image defogging identification method and system based on improved generation of countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant