CN112614136B - Infrared small target real-time instance segmentation method and device - Google Patents

Infrared small target real-time instance segmentation method and device

Info

Publication number
CN112614136B
Authority
CN
China
Prior art keywords
small target
mask
infrared
infrared small
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011632333.8A
Other languages
Chinese (zh)
Other versions
CN112614136A (en)
Inventor
荆楠
雷波
李忠
刘班
陈余根
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
717th Research Institute of CSIC
Original Assignee
717th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 717th Research Institute of CSIC filed Critical 717th Research Institute of CSIC
Priority to CN202011632333.8A priority Critical patent/CN112614136B/en
Publication of CN112614136A publication Critical patent/CN112614136A/en
Application granted granted Critical
Publication of CN112614136B publication Critical patent/CN112614136B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20172 Image enhancement details

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a device for real-time instance segmentation of infrared small targets. Based on the characteristics of infrared small targets, a large number of infrared small target images with complex backgrounds are used as the training data source; the training samples are preprocessed, and small target binary masks are produced with a labeling tool. Multi-level features of the infrared small target image are extracted with a network structure that combines a lightweight backbone network with a feature pyramid bottleneck network, and 3 prior frames of different sizes are preset for each pixel point in the multi-level features. Candidate frames with confidence scores lower than 0.05 are filtered out by confidence score thresholding, non-maximum suppression is applied to the remaining candidate frames, and the top 100 candidate frames are selected. Finally, the prototype masks and the mask coefficients are linearly combined to obtain the mask template. The real-time instance segmentation method provided by the invention adapts better to complex backgrounds, improves robustness to interference in scenes such as ground backgrounds, and broadens the range of application of the method.

Description

Infrared small target real-time instance segmentation method and device
Technical Field
The invention relates to the field of infrared target image processing, and in particular to a method and a device for real-time instance segmentation of infrared small targets.
Background
Instance segmentation is one of the classical topics in image processing and computer vision and is widely applied in areas such as autonomous driving, image retrieval and traffic monitoring. Infrared thermal imaging is widely used in security monitoring, military reconnaissance, night driving, shipping and other fields. An infrared image reflects the relative temperature of objects, and infrared imaging is less affected by weather. In darkness, rain and fog, infrared imaging offers a longer detection range and higher detection reliability than equipment such as illumination cameras and night-vision devices, but it suffers from lower resolution and blurred details. Deep learning segments targets in visible-light images well; in infrared images, however, the many heat sources in the background leave little difference between target and background, so conventional segmentation methods adapt poorly to infrared small target instance segmentation. In addition, deep-learning-based instance segmentation algorithms are computationally heavy and complex and cannot meet real-time processing requirements. A real-time instance segmentation method for infrared small target images that can handle complex scenes and is amenable to engineering implementation is therefore needed.
In a battlefield environment, only the side that detects the enemy first and anticipates the enemy's moves can strike first. This requires the optoelectronic device to detect and mark a target at long range, when the target occupies few pixels in the image and has low contrast. In real scenes the background is very complex, so a small target is often submerged in background that is not of interest, the signal-to-noise ratio is low, and recognition is difficult. Since the breakthrough in training methods for deep neural networks, convolutional neural networks have shown great advantages over traditional algorithms in target detection, and the instance segmentation algorithms built on them have achieved good results in target detection, localization and segmentation. For battlefield applications, however, such segmentation algorithms are severely limited and are not suitable for smaller targets. Deep-learning-based instance segmentation algorithms are designed to detect, recognize and segment larger targets with rich texture; to detect targets at multiple scales, the network structure applies multiple downsampling operations, after which small targets leave no information on the feature map. Furthermore, when the human eye recognizes small targets, motion is also a very important cue.
Small targets in infrared images suffer from blurred details and low resolution. Existing deep learning methods segment instances in visible-light images well, but they are computationally heavy and complex and cannot meet real-time processing requirements.
Disclosure of Invention
In order to solve the above problems, embodiments of the present invention provide a method and an apparatus for real-time instance segmentation of an infrared small object, which overcome or at least partially solve the above problems.
In a first aspect, an embodiment of the present invention provides a method for real-time instance segmentation of infrared small targets, including:
S1, acquiring an infrared small target image data training sample set under various scenes;
S2, preprocessing the infrared small target image training samples, and producing small target binary instance masks with a labeling tool;
S3, adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features;
S4, filtering candidate frames with confidence scores lower than a preset confidence threshold, and performing non-maximum suppression operation on the rest candidate frames to obtain confidence scores, coordinates and mask coefficients corresponding to the candidate frames with the confidence satisfying the conditions;
S5, generating a prototype mask based on a backbone network, and performing matrix multiplication operation on the mask coefficient obtained in the step S4 and the prototype mask to obtain a final binary mask;
S6, calculating a loss function of the neural network model, and iterating the neural network model by using the loss function to generate a trained infrared small target real-time instance segmentation model.
Further, in step S1, a training sample set of infrared small target image data under various scenes is collected, which specifically includes:
S101, acquiring infrared small target images of different acquisition targets in different complex scenes by adopting a medium-wave infrared camera; wherein acquisition targets include, but are not limited to, pedestrians, automobiles, and trucks;
S102, changing the data acquisition location and the shooting parameters of the medium-wave infrared camera multiple times to obtain the infrared small target image data training sample set.
Further, in step S2, preprocessing is performed on the training sample of the infrared small target image, which specifically includes:
S201, processing the infrared small target image training samples with a nonlinear dynamic range compression method;
S202, performing local detail enhancement on the compressed infrared small target image training samples using contrast-limited adaptive histogram equalization (CLAHE);
S203, drawing the outline of the infrared small target by drawing software, and then giving different colors to various targets and backgrounds of the infrared small target image training sample, so as to generate a binary mask template diagram;
S204, performing data augmentation on the infrared small target image training samples using rotation, translation, scale transformation, flipping, scaling, projective transformation, random cropping, color jittering, contrast transformation and noise perturbation;
S205, based on the characteristic of unbalanced data category of the infrared small target image training sample, performing data augmentation by adopting a category balance strategy; and randomly sequencing the infrared small target image training samples.
Further, the step S3 specifically includes:
S301, aiming at the real-time instance segmentation requirement, a lightweight backbone network and a feature pyramid network are combined, and infrared small target image multistage features are extracted;
S302, three prior frames with different sizes are preset for each pixel point in the multi-level feature, and a plurality of candidate frames are obtained.
Further, the step S4 specifically includes:
S401, filtering out candidate frames with confidence scores lower than a preset confidence threshold, performing non-maximum suppression on the remaining candidate frames, and retaining the candidate frames whose confidence scores rank in the top 100;
S402, performing bounding frame decoding on the candidate frames whose confidence scores rank in the top 100, and setting a variance hyperparameter to adjust the decoded predicted values.
Further, in step S5, a prototype mask is generated based on the backbone network, and the matrix multiplication operation is performed on the mask coefficient obtained in step S4 and the prototype mask, which specifically includes:
S501, acquiring a prototype mask based on a backbone network;
S502, performing matrix multiplication operation on the mask coefficient obtained in the step S4 and the prototype mask to obtain a mask template M:
$$M = \sigma(PC^{T})$$
Wherein P is the prototype mask set and C is the n × k mask coefficient set, representing the n candidate frame instances that remain after non-maximum suppression and threshold filtering, each candidate frame instance corresponding to k mask coefficients;
S503, up-sampling the mask template M obtained by matrix multiplication in S502 to the original image size to obtain a final binary mask.
Further, the loss function L of the infrared small target real-time instance segmentation model is:
$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$
Wherein x is the category information of the predicted frame, c is the confidence of the category information of the predicted frame, l is the position information of the predicted frame, g is the position information of the real frame, N is the number of prior frames matched with the pre-labeled real target frames, and α is a weight coefficient set to 1; $L_{conf}(x, c)$ is the class loss, for which a cross-entropy loss function is adopted; $L_{loc}(x, l, g)$ is the position loss, for which the Smooth L1 loss function is adopted.
In a second aspect, an embodiment of the present invention further provides an apparatus for real-time instance segmentation of infrared small targets, including:
The acquisition module is used for acquiring an infrared small target image data training sample set under various scenes;
the preprocessing module is used for preprocessing the infrared small target image training sample and manufacturing a small target binary instance mask by adopting a marking tool;
The multi-level feature extraction module is used for adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features;
The candidate frame filtering module is used for filtering candidate frames with confidence scores lower than a preset confidence threshold, and then performing non-maximum suppression operation on the rest candidate frames to obtain confidence scores, coordinates and mask coefficients corresponding to the candidate frames with the confidence satisfying the conditions;
The mask obtaining module is used for generating a prototype mask based on a backbone network, and performing matrix multiplication operation on the mask coefficient obtained in the step S4 and the prototype mask to obtain a final binary mask;
and the iterative training module is used for calculating a loss function of the neural network model, iterating the neural network model by using the loss function, and generating a trained infrared small target real-time instance segmentation model.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a memory, a communication interface, and a bus; the processor, the memory and the communication interface complete communication with each other through the bus; the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method for real-time instance segmentation of infrared small objects provided by the embodiment of the first aspect.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium storing computer instructions that cause a computer to perform the method for real-time instance segmentation of infrared small targets provided by the embodiments of the first aspect.
According to the method and the device for real-time instance segmentation of infrared small targets provided by the embodiments of the invention, a deep-learning-based instance segmentation algorithm is modified and optimized for the characteristics of small targets in infrared images, yielding a lightweight real-time instance segmentation algorithm that, while recognizing, locating and segmenting targets with larger pixel counts, also detects, recognizes and segments small targets according to small-target features. The real-time instance segmentation method provided by the invention adapts better to complex backgrounds, improves robustness to interference in scenes such as ground backgrounds, and broadens the range of application of the method.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for real-time instance segmentation of infrared small targets according to an embodiment of the present invention;
FIG. 2 is an exemplary diagram of instance segmentation in an embodiment of the invention;
FIG. 3 is a schematic structural diagram of an apparatus for real-time instance segmentation of infrared small targets according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Small targets in infrared images suffer from blurred details and low resolution. Existing deep learning methods segment instances in visible-light images well, but they are computationally heavy and complex and cannot meet real-time processing requirements.
To address these problems in the prior art, an embodiment of the invention provides a method for real-time instance segmentation of infrared small targets. A deep-learning-based instance segmentation algorithm is modified and optimized for the characteristics of small targets in infrared images, yielding a lightweight real-time instance segmentation algorithm that, while recognizing, locating and segmenting targets with larger pixel counts, also detects, recognizes and segments small targets according to small-target features. The real-time instance segmentation method provided by the invention adapts better to complex backgrounds, improves robustness to interference in scenes such as ground backgrounds, and broadens the range of application of the method. The following description is given with reference to the drawings by way of several embodiments.
FIG. 1 is a flow chart of a method for real-time instance segmentation of infrared small targets provided in an embodiment of the present invention. As shown in FIG. 1, the method includes, but is not limited to, the following steps:
S1, acquiring an infrared small target image data training sample set under various scenes;
In this embodiment, S1 may specifically include the following steps:
S101, acquiring infrared small target images of different acquisition targets in different complex scenes by adopting a medium-wave infrared camera; wherein acquisition targets include, but are not limited to, pedestrians, automobiles, and trucks;
S102, changing the data acquisition location and the shooting parameters of the medium-wave infrared camera multiple times to obtain the infrared small target image data training sample set.
S2, preprocessing the infrared small target image training samples, and producing small target binary instance masks with a labeling tool. The labeling tool may be, for example, LabelImg; the invention is not specifically limited in this respect.
Wherein, S2 may specifically include the following steps:
S201, processing the infrared small target image training sample by a nonlinear dynamic range compression method. In this embodiment, a nonlinear dynamic range compression method is used to compress the original training sample of the 14-bit infrared small target image into an 8-bit gray scale image.
S202, performing local detail enhancement on the compressed infrared small target image training samples using contrast-limited adaptive histogram equalization (CLAHE).
S203, drawing the outline of each infrared small target with drawing software, and then assigning different colors to the various targets and the background of the infrared small target image training sample, so as to generate a binary mask template image; the drawing software includes, but is not limited to, Photoshop.
S204, performing data augmentation on the infrared small target image training samples through rotation, translation, scale transformation, flipping, scaling, projective transformation, random cropping, color jittering, contrast transformation and noise perturbation.
S205, based on the characteristic of unbalanced data category of the infrared small target image training sample, performing data augmentation by adopting a category balance strategy; and randomly sequencing the infrared small target image training samples.
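Step S201 above can be sketched as follows. The patent does not say which nonlinear curve is used, so the percentile clipping and the gamma curve here are assumptions chosen only to illustrate a 14-bit to 8-bit nonlinear mapping; the function name `compress_dynamic_range` and its parameters are hypothetical. The CLAHE step of S202 is not shown; it could be applied to the 8-bit result with, for example, OpenCV's `cv2.createCLAHE`.

```python
import numpy as np

def compress_dynamic_range(raw14, gamma=0.5, low_pct=0.5, high_pct=99.5):
    """Map a 14-bit infrared frame to 8 bits with a nonlinear (gamma) curve.

    Assumption: percentile clipping followed by a gamma curve; the patent
    only states that the compression is nonlinear.
    """
    raw = raw14.astype(np.float64)
    # Clip to robust percentiles so a few hot pixels do not use up the output range.
    lo, hi = np.percentile(raw, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch.
        return np.zeros(raw.shape, dtype=np.uint8)
    norm = np.clip((raw - lo) / (hi - lo), 0.0, 1.0)
    # gamma < 1 expands the dark end of the range; this is the nonlinear part.
    return np.round(255.0 * norm ** gamma).astype(np.uint8)

# Synthetic 14-bit frame standing in for a real infrared image.
rng = np.random.default_rng(0)
frame14 = rng.integers(0, 2**14, size=(64, 64), dtype=np.uint16)
frame8 = compress_dynamic_range(frame14)
```

Clipping at percentiles rather than at the absolute minimum and maximum is a design choice that keeps isolated hot or dead pixels from flattening the contrast of the rest of the scene.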
S3, adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting the multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features.
First, to meet the real-time instance segmentation requirement, a lightweight backbone network and a feature pyramid network are combined to extract multi-level features of the infrared small target image. The lightweight backbone network is used for feature extraction because its computation cost, and hence its computational complexity, is low. In this embodiment, the lightweight backbone network is a feed-forward network in which 13 convolution layers alternate with 6 max pooling layers. Such a network loses information as it passes from layer to layer and cannot make full use of the feature information of each convolution layer, which lowers segmentation accuracy, especially for small targets. After many convolution operations the image features become very small, which easily leads to missed and incorrect segmentation of distant objects or small targets.
To address this problem of the lightweight backbone network, this embodiment also adopts a feature pyramid network. The feature pyramid network uses a bottom-up pathway, a top-down pathway and lateral connections, so that high-resolution shallow feature maps are fused with deep feature maps rich in semantic information, improving segmentation accuracy without greatly reducing segmentation speed. The algorithm upsamples the deep feature map, which carries stronger semantic information, by a factor of 2 and then merges it through a lateral connection with the features of the shallower layer, so that the deep feature information is reinforced and segmentation accuracy is improved.
With feature extraction that combines the lightweight backbone network and the feature pyramid network, feature maps of three sizes, 56×56×256, 28×28×256 and 14×14×256, are obtained from the infrared small target image, which facilitates the generation of candidate frames.
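The top-down fusion described above can be sketched in a few lines. This is a simplified illustration, not the patented network: it assumes nearest-neighbour interpolation for the 2× upsampling, assumes the three backbone outputs already have 256 channels (so the 1×1 lateral convolutions are omitted), and the names `c3`, `c4`, `c5`, `p3`, `p4`, `p5` are hypothetical.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def top_down_merge(c3, c4, c5):
    """Fuse three same-channel feature maps from deepest (c5) to shallowest (c3)."""
    p5 = c5
    p4 = c4 + upsample2x(p5)  # lateral connection: element-wise addition
    p3 = c3 + upsample2x(p4)
    return p3, p4, p5

# Dummy feature maps with the three sizes quoted in the text.
c3 = np.ones((56, 56, 256), dtype=np.float32)
c4 = np.ones((28, 28, 256), dtype=np.float32)
c5 = np.ones((14, 14, 256), dtype=np.float32)
p3, p4, p5 = top_down_merge(c3, c4, c5)
```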
Then, three prior frames of different sizes are preset for each pixel point in the multi-level features to obtain a plurality of candidate frames. A feature map of size m×n has m×n grid cells, i.e., pixel points. Denoting the number of prior frames per grid cell as a, each prior frame requires c class scores, 4 frame offsets and k mask coefficients, so each grid cell requires (c+4+k)·a predicted values and all grid cells together require (c+4+k)·a·m·n predicted values. Because the segmentation algorithm performs detection by convolution, the detection and segmentation of one feature map is completed by (c+4+k)·a convolution kernels. Taking the 56×56 feature map as an example, all grid cells together generate (4+4+32)×3×56×56 = 376,320 predicted values, where a=3, k=32, c=4 and m=n=56.
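The arithmetic above can be checked with a short script. The function name `predictions_per_map` is hypothetical; the default values a=3, c=4 and k=32 and the three feature-map sizes are the ones given in the text.

```python
def predictions_per_map(m, n, a=3, c=4, k=32):
    """Number of predicted values for an m x n feature map with a prior
    frames per grid cell, c class scores, 4 frame offsets and k mask
    coefficients per prior frame."""
    return (c + 4 + k) * a * m * n

sizes = [(56, 56), (28, 28), (14, 14)]
counts = {size: predictions_per_map(*size) for size in sizes}

# Total number of prior (candidate) frames over all three feature maps.
total_boxes = sum(3 * m * n for m, n in sizes)

print(counts)       # {(56, 56): 376320, (28, 28): 94080, (14, 14): 23520}
print(total_boxes)  # 12348
```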
Further, the real-time instance segmentation algorithm adopted here obtains the coordinates and size of the predicted frame by regressing offsets:
$$l_{cx} = (b_{cx} - d_{cx}) / d_w$$
$$l_{cy} = (b_{cy} - d_{cy}) / d_h$$
$$l_w = \log(b_w / d_w)$$
$$l_h = \log(b_h / d_h)$$
Wherein: $l_{cx}$, $l_{cy}$, $l_w$ and $l_h$ are the target offsets, i.e., the values that the network needs to learn; $d_w$ and $d_h$ are the width and height of the preset anchor frame; $d_{cx}$ and $d_{cy}$ are the coordinate values of the feature point at the upper left corner of the corresponding feature map; $b_{cx}$, $b_{cy}$, $b_w$ and $b_h$ are the coordinates, width and height of the prediction frame, respectively.
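As a small worked example of the four offset equations, the sketch below encodes one frame against one anchor. The function name `encode_box` and the (cx, cy, w, h) tuple layout are illustrative assumptions, and the numbers are made up for the example.

```python
import numpy as np

def encode_box(box, anchor):
    """Compute (l_cx, l_cy, l_w, l_h) for a frame relative to an anchor.

    box    = (b_cx, b_cy, b_w, b_h)
    anchor = (d_cx, d_cy, d_w, d_h)
    """
    b_cx, b_cy, b_w, b_h = box
    d_cx, d_cy, d_w, d_h = anchor
    return np.array([
        (b_cx - d_cx) / d_w,   # l_cx
        (b_cy - d_cy) / d_h,   # l_cy
        np.log(b_w / d_w),     # l_w
        np.log(b_h / d_h),     # l_h
    ])

# A frame twice as wide and half as tall as its anchor, shifted right and up.
offsets = encode_box(box=(12.0, 9.0, 8.0, 2.0), anchor=(10.0, 10.0, 4.0, 4.0))
# offsets is approximately [0.5, -0.25, 0.693, -0.693]
```

Dividing the centre shift by the anchor size and taking the logarithm of the size ratio makes the regression targets roughly scale-independent, which is why the same head can serve anchors of different sizes.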
A candidate-frame-based detection model predicts, for each candidate frame, 4 values characterizing the frame and c values characterizing the class scores, for a total of (4+c) values.
The real-time instance segmentation algorithm adopted here predicts (4+c+k) values for each candidate frame, the additional k values being the mask coefficients. Furthermore, for the final mask to be obtainable by linear combination, it is important to be able to subtract prototype masks from the final mask. In other words, the mask coefficients must be able to take both positive and negative values. Therefore, the tanh function is used as the nonlinear activation in mask coefficient prediction, because its range is (-1, 1).
S4, filtering out candidate frames with confidence scores lower than a preset confidence threshold, and performing non-maximum suppression on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the conditions.
In step S4, candidate frames with confidence scores lower than a preset confidence threshold are filtered out first; the preset confidence threshold may be set to 0.05, which is not specifically limited in the present invention. In this embodiment there are (56×56+28×28+14×14)×3 = 12348 candidate frames in total, and those with confidence scores below 0.05 are filtered out by confidence score thresholding. Next, non-maximum suppression is performed on the remaining candidate frames, and the candidate frames whose confidence scores rank in the top 100 are retained.
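The filtering in step S4 can be illustrated with a plain NumPy sketch of score thresholding followed by greedy non-maximum suppression. The 0.05 threshold and the top-100 limit come from the text; the IoU threshold of 0.5, the corner format (x1, y1, x2, y2) and the function names are assumptions made for the example.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (M, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def filter_candidates(boxes, scores, conf_thresh=0.05, iou_thresh=0.5, top_k=100):
    """Return indices of the kept boxes, highest score first."""
    idx = np.flatnonzero(scores >= conf_thresh)  # drop low-confidence boxes
    idx = idx[np.argsort(-scores[idx])]          # sort by descending score
    keep = []
    while idx.size > 0 and len(keep) < top_k:
        best, rest = idx[0], idx[1:]
        keep.append(int(best))
        # Suppress remaining boxes that overlap the kept box too much.
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        idx = rest[overlaps <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30], [0, 0, 5, 5]], dtype=float)
scores = np.array([0.9, 0.8, 0.7, 0.01])
keep = filter_candidates(boxes, scores)
print(keep)  # [0, 2]
```

Because the kept indices are produced in descending score order, stopping after `top_k` entries gives the same result as running suppression to completion and then taking the 100 highest-scoring survivors.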
Then, bounding frame decoding is performed on the candidate frames whose confidence scores rank in the top 100. At this point the predicted target frame positions are expressed at the scale of the feature map; the final target frame positions at the scale of the input image are obtained by scaling. The predicted values to be decoded are related to the position information of the bounding frame by:
$$l_{cx} = (b_{cx} - d_{cx}) / d_w$$
$$l_{cy} = (b_{cy} - d_{cy}) / d_h$$
$$l_w = \log(b_w / d_w)$$
$$l_h = \log(b_h / d_h)$$
Wherein: $l_{cx}$, $l_{cy}$, $l_w$ and $l_h$ are the target offsets, i.e., the values that the network needs to learn; $d_w$ and $d_h$ are the width and height of the preset anchor frame; $d_{cx}$ and $d_{cy}$ are the coordinate values of the feature point at the upper left corner of the corresponding feature map; $b_{cx}$, $b_{cy}$, $b_w$ and $b_h$ are the coordinates, width and height of the prediction frame, respectively.
A variance hyperparameter is set to adjust the decoded predicted values; after the adjustment, the position information of the final bounding frame is:
$$b_{cx} = d_w\,(\mathrm{variance}[0] \cdot l_{cx}) + d_{cx}$$
$$b_{cy} = d_h\,(\mathrm{variance}[1] \cdot l_{cy}) + d_{cy}$$
$$b_w = d_w \exp(\mathrm{variance}[2] \cdot l_w)$$
$$b_h = d_h \exp(\mathrm{variance}[3] \cdot l_h)$$
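The decoding equations with the variance hyperparameter translate directly into code. In this sketch the function name `decode_box` is hypothetical, and the default variance values (0.1, 0.1, 0.2, 0.2) are an assumption borrowed from common single-shot detector implementations; the patent does not state the values it uses.

```python
import numpy as np

def decode_box(offsets, anchor, variance=(0.1, 0.1, 0.2, 0.2)):
    """Recover (b_cx, b_cy, b_w, b_h) from predicted offsets and an anchor.

    offsets = (l_cx, l_cy, l_w, l_h)
    anchor  = (d_cx, d_cy, d_w, d_h)
    variance values are assumed, not taken from the patent.
    """
    l_cx, l_cy, l_w, l_h = offsets
    d_cx, d_cy, d_w, d_h = anchor
    b_cx = d_w * (variance[0] * l_cx) + d_cx
    b_cy = d_h * (variance[1] * l_cy) + d_cy
    b_w = d_w * np.exp(variance[2] * l_w)
    b_h = d_h * np.exp(variance[3] * l_h)
    return np.array([b_cx, b_cy, b_w, b_h])

anchor = (10.0, 10.0, 4.0, 4.0)
# Zero offsets reproduce the anchor itself.
same = decode_box((0.0, 0.0, 0.0, 0.0), anchor)
# A centre offset of (5.0, -2.5) scaled by variance 0.1 moves the centre to (12, 9).
shifted = decode_box((5.0, -2.5, 0.0, 0.0), anchor)
```

With all variances set to 1 the function is the exact inverse of the offset equations; smaller variances mean the network must output proportionally larger raw values for the same shift.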
S5, generating a prototype mask based on a backbone network, and performing matrix multiplication operation on the mask coefficient obtained in the step S4 and the prototype mask to obtain a final binary mask.
In this embodiment, a prototype mask is obtained based on a backbone network. Prototype masks derived from deeper backbone networks can yield more robust masks, with high resolution prototype masks being advantageous for improving target segmentation accuracy and segmentation effect of small targets.
The present invention adopts a feature pyramid as the bottleneck network and up-samples the prototype mask to 1/4 of the original image size to improve the segmentation of small targets. The prototype mask branch is implemented as a fully convolutional network that outputs a k-channel feature map, each channel of which can be regarded as one prototype mask.
Performing matrix multiplication operation on the mask coefficient obtained in the step S4 and the prototype mask to obtain a mask template M:
$M = \sigma(P C^{T})$
Where P is the prototype mask set and C is the n × k mask coefficient set for the n candidate frame instances that remain after threshold filtering and non-maximum suppression, each candidate frame instance corresponding to k mask coefficients.
The mask template M obtained by matrix multiplication is up-sampled to the original image size, obtaining the final binary mask.
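The mask assembly can be sketched in plain Python as below. Several points are assumptions not stated in the text: σ is taken to be the sigmoid function, the prototypes are stored as an h × w × k nested list, upsampling uses nearest-neighbour replication by an integer factor, and the binarization threshold is 0.5. The function names are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function, assumed here to be the sigma in M = sigma(P C^T)."""
    return 1.0 / (1.0 + math.exp(-z))


def assemble_masks(P, C):
    """Combine prototypes and coefficients into per-instance soft masks.

    P: h x w x k nested list of prototype values.
    C: n x k nested list of mask coefficients (one row per candidate frame).
    Returns an h x w x n nested list M with M[y][x][i] in (0, 1).
    """
    return [[[sigmoid(sum(p * c for p, c in zip(pixel, coeffs)))
              for coeffs in C]
             for pixel in row]
            for row in P]


def upsample_and_binarize(M, instance, factor=4, thresh=0.5):
    """Nearest-neighbour upsampling of one instance mask, then thresholding."""
    out = []
    for row in M:
        up_row = []
        for pixel in row:
            bit = 1 if pixel[instance] > thresh else 0
            up_row.extend([bit] * factor)  # repeat each pixel horizontally
        out.extend([list(up_row) for _ in range(factor)])  # and vertically
    return out
```

A default factor of 4 mirrors the statement that the prototype mask is produced at 1/4 of the original image size; a bilinear upsampling would be an equally plausible choice.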
S6, calculating a loss function of the neural network model, and iterating the neural network model by using the loss function to generate a trained infrared small target real-time instance segmentation model. In this embodiment, steps S4 and S5 may be repeated three times, iterating over all training samples of the infrared small target image data, to finally obtain the infrared small target real-time instance segmentation model.
In this embodiment, the loss function L(x, c, l, g) of the infrared small target real-time instance segmentation model is:
$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$
Where x is the category information of the predicted frame, c is the confidence of the category information of the predicted frame, l is the position information of the predicted frame, g is the position information of the real frame, N is the number of prior frames matched with the pre-labeled real target frames, and α is a weight coefficient, set to 1; $L_{conf}(x, c)$ is the class loss, for which a cross-entropy loss function is adopted; $L_{loc}(x, l, g)$ is the position loss, for which the Smooth L1 loss function is used.
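The loss terms named above can be illustrated with the following sketch. It assumes the two terms are combined in the usual single-shot detector form, (L_conf + α·L_loc)/N, that label 0 denotes background, that the position loss is accumulated only over prior frames matched to a real target, and that no hard negative mining is applied; none of these details is confirmed by the text, and the function names are hypothetical.

```python
import math


def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def cross_entropy(logits, label):
    """Softmax cross-entropy for one prior frame (numerically stabilized)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def detection_loss(class_logits, labels, pred_offsets, gt_offsets, alpha=1.0):
    """Assumed form: (L_conf + alpha * L_loc) / N, with N matched priors."""
    positives = [i for i, y in enumerate(labels) if y > 0]  # 0 = background
    n = len(positives)
    if n == 0:
        return 0.0  # convention used by this sketch when nothing is matched
    l_conf = sum(cross_entropy(z, y) for z, y in zip(class_logits, labels))
    l_loc = sum(smooth_l1(p - g)
                for i in positives
                for p, g in zip(pred_offsets[i], gt_offsets[i]))
    return (l_conf + alpha * l_loc) / n
```

With α = 1, as stated in the description, classification and localization errors contribute equally before normalization by the number of matched prior frames.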
Fig. 2 is an example diagram of instance segmentation in an embodiment of the present invention. As shown in fig. 2, after the trained infrared small target real-time instance segmentation model is obtained, an infrared small target image to be segmented is input into the model to obtain the corresponding real-time instance segmentation result image.
In one embodiment, fig. 3 is a schematic structural diagram of an infrared small target real-time instance segmentation device according to an embodiment of the present invention. The device is used to execute the infrared small target real-time instance segmentation method of the foregoing method embodiments. As shown in fig. 3, the device includes:
The acquisition module 301 is configured to acquire an infrared small target image data training sample set under various scenes;
The preprocessing module 302 is configured to preprocess the infrared small target image training samples and to produce small target binary instance masks using a labeling tool;
The multi-level feature extraction module 303 is configured to use a lightweight network as the backbone network of the neural network model, input the infrared small target image training samples into the backbone network, extract multi-level features of the infrared small target image, and generate multi-level feature map candidate frames based on the multi-level features;
The candidate frame filtering module 304 is configured to filter out candidate frames with confidence scores lower than a preset confidence threshold, and perform a non-maximum suppression operation on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the condition;
The mask obtaining module 305 is configured to generate a prototype mask based on the backbone network, and perform a matrix multiplication operation on the mask coefficients obtained by the candidate frame filtering module and the prototype mask to obtain a final binary mask;
The iterative training module 306 is configured to calculate a loss function of the neural network model, iterate the neural network model using the loss function, and generate a trained infrared small target real-time instance segmentation model.
For details of how the above modules perform real-time instance segmentation of infrared small targets, refer to the foregoing method embodiments; they are not repeated here.
In one embodiment, the present invention provides an electronic device. As shown in fig. 4, the electronic device may include a processor 401, a communication interface (Communications Interface) 402, a memory 403 and a communication bus 404, wherein the processor 401, the communication interface 402 and the memory 403 communicate with each other through the communication bus 404. The processor 401 may call logic instructions in the memory 403 to perform the steps of the infrared small target real-time instance segmentation method provided in the above embodiments, for example: S1, acquiring an infrared small target image data training sample set under various scenes; S2, preprocessing the infrared small target image training samples, and producing small target binary instance masks using a labeling tool; S3, adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features; S4, filtering out candidate frames with confidence scores lower than a preset confidence threshold, and performing a non-maximum suppression operation on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the condition; S5, generating a prototype mask based on the backbone network, and performing a matrix multiplication operation on the mask coefficients obtained in step S4 and the prototype mask to obtain a final binary mask; and S6, calculating a loss function of the neural network model, and iterating the neural network model by using the loss function to generate a trained infrared small target real-time instance segmentation model.
In one embodiment, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the infrared small target real-time instance segmentation method provided in the above embodiments, for example: S1, acquiring an infrared small target image data training sample set under various scenes; S2, preprocessing the infrared small target image training samples, and producing small target binary instance masks using a labeling tool; S3, adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features; S4, filtering out candidate frames with confidence scores lower than a preset confidence threshold, and performing a non-maximum suppression operation on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the condition; S5, generating a prototype mask based on the backbone network, and performing a matrix multiplication operation on the mask coefficients obtained in step S4 and the prototype mask to obtain a final binary mask; and S6, calculating a loss function of the neural network model, and iterating the neural network model by using the loss function to generate a trained infrared small target real-time instance segmentation model.
In summary, the embodiments of the present invention provide an infrared small target real-time instance segmentation method and device. A deep-learning-based instance segmentation algorithm is modified and optimized for the characteristics of small targets in infrared images, yielding a lightweight real-time instance segmentation algorithm. In addition to recognizing, locating and segmenting targets that occupy a larger number of pixels, the algorithm detects, identifies and segments small targets according to their characteristics. The real-time instance segmentation method provided by the invention adapts better to complex backgrounds, improves resistance to interference from scenes such as the ground, and widens the range of application of the method.
The embodiments of the present invention may be arbitrarily combined to achieve different technical effects.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. An infrared small target real-time instance segmentation method, characterized by comprising the following steps:
S1, acquiring an infrared small target image data training sample set under various scenes;
S2, preprocessing the infrared small target image training samples, and producing small target binary instance masks using a labeling tool;
S3, adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features;
S4, filtering out candidate frames with confidence scores lower than a preset confidence threshold, and performing a non-maximum suppression operation on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the condition;
S5, generating a prototype mask based on the backbone network, and performing a matrix multiplication operation on the mask coefficients obtained in step S4 and the prototype mask to obtain a final binary mask;
S6, calculating a loss function of the neural network model, and iterating the neural network model by using the loss function to generate a trained infrared small target real-time instance segmentation model;
The step S3 specifically comprises the following steps:
S301, aiming at the real-time instance segmentation requirement, a lightweight backbone network and a feature pyramid network are combined, and infrared small target image multistage features are extracted;
S302, three prior frames with different sizes are preset for each pixel point in the multi-level feature, and a plurality of candidate frames are obtained;
The step S4 specifically comprises the following steps:
S401, filtering out candidate frames with confidence scores lower than a preset confidence threshold, performing non-maximum suppression on the remaining candidate frames, and retaining the candidate frames whose confidence scores rank in the top 100;
S402, performing bounding box decoding on the candidate frames whose confidence scores rank in the top 100, and setting a variance hyperparameter to adjust the decoded predicted values;
The loss function L(x, c, l, g) of the infrared small target real-time instance segmentation model is as follows:
$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$
Where x is the category information of the predicted frame, c is the confidence of the category information of the predicted frame, l is the position information of the predicted frame, g is the position information of the real frame, N is the number of prior frames matched with the pre-labeled real target frames, and α is a weight coefficient, set to 1; $L_{conf}(x, c)$ is the class loss, for which a cross-entropy loss function is adopted; $L_{loc}(x, l, g)$ is the position loss, for which the Smooth L1 loss function is used.
2. The infrared small target real-time instance segmentation method according to claim 1, characterized in that, in step S1, acquiring an infrared small target image data training sample set under various scenes specifically comprises:
S101, acquiring infrared small target images of different acquisition targets in different complex scenes using a medium-wave infrared camera, wherein the acquisition targets comprise pedestrians, automobiles and trucks;
S102, changing the data acquisition location and the shooting parameters of the medium-wave infrared camera multiple times to obtain the infrared small target image data training sample set.
3. The infrared small target real-time instance segmentation method according to claim 1, characterized in that, in step S2, preprocessing the infrared small target image training samples specifically comprises:
S201, processing the infrared small target image training samples by a nonlinear dynamic range compression method;
S202, performing local detail enhancement on the compressed infrared small target image training samples using a contrast-limited adaptive histogram equalization method;
S203, drawing the outline of each infrared small target with drawing software, and then assigning different colors to the various targets and the background of the infrared small target image training samples, so as to generate binary mask template images;
S204, performing data enhancement on the infrared small target image training samples using rotation transformation, translation transformation, scale transformation, flip transformation, scaling transformation, projection transformation, random cropping, color jittering, contrast transformation and noise perturbation methods;
S205, based on the class imbalance of the infrared small target image training samples, performing data augmentation using a class balancing strategy, and randomly shuffling the infrared small target image training samples.
4. The infrared small target real-time instance segmentation method according to claim 1, characterized in that, in step S5, generating a prototype mask based on the backbone network and performing a matrix multiplication operation on the mask coefficients obtained in step S4 and the prototype mask specifically comprises:
S501, acquiring a prototype mask based on the backbone network;
S502, performing a matrix multiplication operation on the mask coefficients obtained in step S4 and the prototype mask to obtain a mask template M:
$M = \sigma(P C^{T})$
wherein P is the prototype mask set and C is the n × k mask coefficient set for the n candidate frame instances that remain after threshold filtering and non-maximum suppression, each candidate frame instance corresponding to k mask coefficients;
S503, up-sampling the mask template M obtained by the matrix multiplication in S502 to the original image size to obtain the final binary mask.
5. An infrared small target real-time instance segmentation device, comprising:
The acquisition module is used for acquiring an infrared small target image data training sample set under various scenes;
The preprocessing module is used for preprocessing the infrared small target image training samples and producing small target binary instance masks using a labeling tool;
The multi-level feature extraction module is used for adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features;
The candidate frame filtering module is used for filtering candidate frames with confidence scores lower than a preset confidence threshold, and then performing non-maximum suppression operation on the rest candidate frames to obtain confidence scores, coordinates and mask coefficients corresponding to the candidate frames with the confidence satisfying the conditions;
The mask obtaining module is used for generating a prototype mask based on a backbone network, and performing matrix multiplication operation on the mask coefficient obtained by the candidate frame filtering module and the prototype mask to obtain a final binary mask;
The iterative training module is used for calculating a loss function of the neural network model, iterating the neural network model by using the loss function, and generating a trained infrared small target real-time instance segmentation model;
wherein adopting a lightweight network as the backbone network of the neural network model, inputting the infrared small target image training samples into the backbone network, extracting multi-level features of the infrared small target image, and generating multi-level feature map candidate frames based on the multi-level features specifically comprises:
combining a lightweight backbone network with a feature pyramid network, in view of the real-time instance segmentation requirement, to extract multi-level features of the infrared small target image;
presetting three prior frames of different sizes for each pixel point in the multi-level features to obtain a plurality of candidate frames;
wherein filtering out candidate frames with confidence scores lower than a preset confidence threshold, and then performing a non-maximum suppression operation on the remaining candidate frames to obtain the confidence scores, coordinates and mask coefficients corresponding to the candidate frames whose confidence satisfies the condition specifically comprises:
filtering out candidate frames with confidence scores lower than a preset confidence threshold, performing non-maximum suppression on the remaining candidate frames, and retaining the candidate frames whose confidence scores rank in the top 100;
performing bounding box decoding on the candidate frames whose confidence scores rank in the top 100, and setting a variance hyperparameter to adjust the decoded predicted values;
the loss function L(x, c, l, g) of the infrared small target real-time instance segmentation model is as follows:
$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$
Where x is the category information of the predicted frame, c is the confidence of the category information of the predicted frame, l is the position information of the predicted frame, g is the position information of the real frame, N is the number of prior frames matched with the pre-labeled real target frames, and α is a weight coefficient, set to 1; $L_{conf}(x, c)$ is the class loss, for which a cross-entropy loss function is adopted; $L_{loc}(x, l, g)$ is the position loss, for which the Smooth L1 loss function is used.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the infrared small target real-time instance segmentation method according to any one of claims 1 to 4 when executing the program.
7. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the infrared small target real-time instance segmentation method according to any one of claims 1 to 4.
CN202011632333.8A 2020-12-31 2020-12-31 Infrared small target real-time instance segmentation method and device Active CN112614136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011632333.8A CN112614136B (en) 2020-12-31 2020-12-31 Infrared small target real-time instance segmentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011632333.8A CN112614136B (en) 2020-12-31 2020-12-31 Infrared small target real-time instance segmentation method and device

Publications (2)

Publication Number Publication Date
CN112614136A CN112614136A (en) 2021-04-06
CN112614136B CN112614136B (en) 2024-05-14

Family

ID=75252916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011632333.8A Active CN112614136B (en) 2020-12-31 2020-12-31 Infrared small target real-time instance segmentation method and device

Country Status (1)

Country Link
CN (1) CN112614136B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673505A (en) * 2021-06-29 2021-11-19 北京旷视科技有限公司 Example segmentation model training method, device and system and storage medium
CN113643235B (en) * 2021-07-07 2023-12-29 青岛高重信息科技有限公司 Chip counting method based on deep learning
CN113269171B (en) * 2021-07-20 2021-10-12 魔视智能科技(上海)有限公司 Lane line detection method, electronic device and vehicle
CN113724290B (en) * 2021-07-22 2024-03-05 西北工业大学 Multi-level template self-adaptive matching target tracking method for infrared image
CN113705387B (en) * 2021-08-13 2023-11-17 国网江苏省电力有限公司电力科学研究院 Interference object detection and tracking method for removing overhead line foreign matters by laser
CN114283260A (en) * 2021-11-16 2022-04-05 北京航空航天大学 AR navigation method and system for corneal transplantation suture operation based on example segmentation network
CN115761518B (en) * 2023-01-10 2023-04-11 云南瀚哲科技有限公司 Crop classification method based on remote sensing image data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046880A (en) * 2019-11-28 2020-04-21 中国船舶重工集团公司第七一七研究所 Infrared target image segmentation method and system, electronic device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679351B2 (en) * 2017-08-18 2020-06-09 Samsung Electronics Co., Ltd. System and method for semantic segmentation of images

Also Published As

Publication number Publication date
CN112614136A (en) 2021-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant