CN114943834B - Full-scene semantic segmentation method based on prototype queue learning under few labeled samples - Google Patents
Full-scene semantic segmentation method based on prototype queue learning under few labeled samples
- Publication number
- CN114943834B (application CN202210390663.3A)
- Authority
- CN
- China
- Prior art keywords
- prototype
- foreground
- background
- queue
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/143—Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
Abstract
The invention discloses a full-scene semantic segmentation method based on prototype queue learning under few labeled samples. The method first performs prototype queue segmentation: masked average pooling is applied to the feature map using the label image to generate a foreground prototype and a background prototype, which are stored into a prototype queue; cosine distances to the feature map are then calculated to obtain a new prediction probability map. The argmax function is applied to the prediction probability map to obtain a segmentation-result mask label; masked average pooling is applied to the feature map using this mask label to generate second-stage foreground and background prototypes, which are stored into the prototype queue, and the cosine distances between these prototypes and the feature map are calculated to obtain the final segmentation result. The invention reduces dependence on model parameters, improves generalization, and achieves a better segmentation effect with fewer labeled samples.
Description
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a full scene semantic segmentation method.
Background
Image semantic segmentation (Semantic Segmentation) is the pixel-level classification of an image according to the semantic class to which each pixel in the scene belongs. Semantic segmentation methods based on deep learning usually require large numbers of dense pixel-level labels, but labeling samples in practical tasks is time-consuming and labor-intensive, and labeled samples for specific tasks are difficult to obtain. Accordingly, the full-scene semantic segmentation under few labeled samples addressed by the invention aims to classify all pixels in an image according to their semantic categories when only a small number of samples are labeled. The technology plays a key role in practical, highly complex, and strongly dynamic scene applications such as urban planning, precision agriculture, forest inspection, and national defense.
With the development of deep learning, the semantic segmentation field has advanced, and small-sample semantic segmentation under few labeled samples has made progress by combining the transfer ability of meta-learning with the small-sample suitability of metric learning. However, current small-sample semantic segmentation mainly focuses on separating foreground objects from the background, while the need for multi-category semantic segmentation is often ignored. How to fully utilize a small number of labeled samples to guide the segmentation of test samples through metric learning is a key problem in small-sample semantic segmentation. Wang et al. regularize the prototype-guided segmentation process through reverse alignment in the literature "Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, and Jiashi Feng. PANet: Few-shot image semantic segmentation with prototype alignment. In IEEE International Conference on Computer Vision, 2019, pp. 9197-9206." Pixel-to-pixel correlations were established in the literature "Haochen Wang, Xudong Zhang, Yutao Hu, Yandan Yang, Xianbin Cao, and Xiantong Zhen. Few-shot semantic segmentation with democratic attention networks. In European Conference on Computer Vision, 2020, pp. 730-746," replacing the prototypes generated by mask pooling, so that the sample labels guide the segmentation of test samples more deeply.
Furthermore, exploiting the information of potential new classes in the background helps alleviate feature confusion, i.e., it further enhances the effective representation of different semantic classes. Yang et al. introduce an additional branch network in the literature "Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, and Yang Gao. Mining latent classes for few-shot segmentation. In IEEE International Conference on Computer Vision, 2021, pp. 8721-8730," to exploit potential new-class information, on the basis of which more stable prototype guidance is achieved by correcting the foreground and background. In addition, the prototype extraction process of conventional small-sample segmentation methods is coarse, causing a loss of detail information during masked average pooling. This loss can be reduced by iteratively optimizing the prototype extraction process while retaining important and comprehensive semantic information; for example, Zhang et al. design an iterative optimization module to optimize the segmentation process in the literature "Chi Zhang, Guosheng Lin, Fayao Liu, Rui Yao, and Chunhua Shen. CANet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. In IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 5217-5226," but that method does not directly update the prototype, so the detail information lost by the extracted prototype is difficult to recover. Iterative optimization can further reduce the loss of detail information, but the optimization of the prototype extraction process remains insufficient.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a full-scene semantic segmentation method based on prototype queue learning under few labeled samples. The method first performs prototype queue segmentation: masked average pooling is applied to the feature map using the label image to generate a foreground prototype and a background prototype, which are stored into a prototype queue; cosine distances to the feature map are then calculated to obtain a new prediction probability map. The argmax function is applied to the prediction probability map to obtain a segmentation-result mask label; masked average pooling is applied to the feature map using this mask label to generate second-stage foreground and background prototypes, which are stored into the prototype queue, and the cosine distances between these prototypes and the feature map are calculated to obtain the final segmentation result. The invention reduces dependence on model parameters, improves generalization, and achieves a better segmentation effect with fewer labeled samples.
The technical scheme adopted by the invention for solving the technical problems comprises the following steps:
step 1: prototype queue segmentation;
step 1-1: uniformly cutting the training image and the corresponding label image pair into fixed sizes; establishing an empty prototype queue;
step 1-2: taking a training image as input data, and generating a feature map F through a feature extractor;
step 1-3: perform masked average pooling on the feature map F using the label image M to generate a foreground prototype p_c and a background prototype p_bg:
p_c = Σ_{x,y} F(x,y)·1[M(x,y)=c] / Σ_{x,y} 1[M(x,y)=c] (1)
p_bg = Σ_{x,y} F(x,y)·1[M(x,y)∉C] / Σ_{x,y} 1[M(x,y)∉C] (2)
wherein (x,y) denotes pixel coordinates, and 1[·] denotes the indicator function, i.e., its value is 1 when the expression in brackets holds and 0 otherwise; C is the foreground class set, c is a foreground category in the image, and h and w are the height and width of the input image, over which the sums range;
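As an illustration, the masked average pooling of step 1-3 can be sketched in NumPy. The function name, the channels-first array layout, and the use of label 0 as background are our assumptions, not part of the patent:

```python
import numpy as np

def masked_average_pooling(feature_map, label_image, c):
    """Average feature_map (d, h, w) over pixels where label_image (h, w)
    equals class c -- the indicator 1[M(x, y) = c] of eqs. (1)-(2).
    Returns a d-dimensional prototype, or None if class c is absent."""
    mask = (label_image == c)
    if mask.sum() == 0:
        return None
    return (feature_map * mask).sum(axis=(1, 2)) / mask.sum()

# Tiny example: an 8-channel 4x4 feature map and a label image with two
# foreground classes (1 and 2); label 0 plays the role of background here.
F = np.random.rand(8, 4, 4)
M = np.array([[0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 2, 2, 0],
              [0, 2, 2, 0]])
p_fg = {c: masked_average_pooling(F, M, c) for c in (1, 2)}  # foreground prototypes
p_bg = masked_average_pooling(F, M, 0)                       # background prototype
```

Each prototype is a single d-dimensional vector, so detail inside the mask is averaged away — which is exactly the information loss the two-stage scheme later compensates for.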
step 1-4: store the foreground prototype p_c and the background prototype p_bg into the prototype queue; the prototype queue holds multiple foreground categories but only one background category;
step 1-5: repeat steps 1-2 to 1-4 to traverse all training images and their corresponding label images; when a foreground or background prototype is stored into the prototype queue, if a prototype of the same category already exists in the queue, the newly generated prototype covers (replaces) it;
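A minimal sketch of the queue's overwrite semantics from steps 1-4 and 1-5: a dict keyed by category id naturally gives "one slot per category, newer prototypes cover older ones". The choice of a dict and the "bg" key for the single background slot are our assumptions:

```python
# One slot per semantic class; storing a prototype for a class that is
# already present simply covers (replaces) the old entry, matching the
# overwrite rule of steps 1-4/1-5.
prototype_queue = {}

def enqueue(queue, class_id, prototype):
    queue[class_id] = prototype  # same-category prototype is overwritten

enqueue(prototype_queue, 1, [0.1, 0.2])
enqueue(prototype_queue, "bg", [0.0, 0.0])
enqueue(prototype_queue, 1, [0.3, 0.4])  # covers the earlier class-1 entry
```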
step 1-6: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue and each pixel position in the feature map F to obtain a preliminary prediction probability map P; concatenate P with F and apply a convolution to obtain a new prediction probability map P_final, calculated as follows:
P_final = Conv(Concat(F, P)) (3)
the prediction probability map P_final is the preliminary predicted segmentation result;
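The cosine-distance scoring of step 1-6 can be sketched as follows. Stacking the queue's prototypes into a matrix and normalizing the scores with a softmax over the class axis are our assumptions — the patent only states that cosine distances yield the probability map P:

```python
import numpy as np

def cosine_similarity_map(F, prototypes):
    """Cosine similarity between every pixel feature of F (d, h, w) and
    every prototype row of `prototypes` (k, d); a softmax over the k axis
    turns the scores into a probability map."""
    Fn = F / (np.linalg.norm(F, axis=0, keepdims=True) + 1e-8)
    Pn = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sim = np.einsum('kd,dhw->khw', Pn, Fn)           # (k, h, w) cosine scores
    e = np.exp(sim - sim.max(axis=0, keepdims=True))  # stable softmax
    return e / e.sum(axis=0, keepdims=True)

# Example: two prototypes scored against a 3-channel 2x2 feature map.
F = np.zeros((3, 2, 2))
F[:, 0, 0] = [1.0, 0.0, 0.0]   # pixel aligned with prototype 0
F[:, 0, 1] = [0.0, 1.0, 0.0]   # pixel aligned with prototype 1
prototypes = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
P = cosine_similarity_map(F, prototypes)
```

Because the prediction is a nearest-prototype comparison rather than a learned classifier head, adding or replacing queue entries changes the output without retraining parameters.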
step 2: second-stage segmentation constraint;
step 2-1: apply the argmax function to the prediction probability map P_final to obtain a segmentation-result mask label, and binarize it by uniformly relabeling all non-foreground categories as the background category, yielding a mask label that contains only foreground and background categories;
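Step 2-1 (argmax plus binarization) can be sketched as follows; the channel-to-category mapping and the background id 0 are illustrative assumptions:

```python
import numpy as np

def binarize_mask(P_final, foreground_classes, bg_id=0):
    """argmax over the class axis gives a hard mask label; every class
    outside the current foreground set is relabelled as background."""
    labels = P_final.argmax(axis=0)                  # (h, w) hard mask label
    keep = np.isin(labels, list(foreground_classes))
    return np.where(keep, labels, bg_id)

# Example: channel 0 = background, channels 1 and 2 = candidate foreground
# classes, but only class 1 is kept as foreground here.
P_final = np.array([[[0.8, 0.1], [0.2, 0.1]],
                    [[0.1, 0.7], [0.3, 0.2]],
                    [[0.1, 0.2], [0.5, 0.7]]])
mask_label = binarize_mask(P_final, foreground_classes={1})
```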
step 2-2: carrying out mask average pooling on the feature map F by using a mask label to generate a foreground prototype and a background prototype of the second stage;
step 2-3: storing the foreground prototype and the background prototype of the second stage into a prototype queue, and covering the foreground prototype or the background prototype in the prototype queue if the foreground prototype or the background prototype of the same class exists in the prototype queue;
step 2-4: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue obtained in step 2-3 and each pixel position in the feature map F to obtain the second-stage prediction probability map; the second-stage prediction probability map is the final segmentation result;
step 3: training according to the overall loss function to obtain a final segmentation model;
step 3-1: evaluating the loss;
Using the prediction probability map P_final and the label image M, the preliminary segmentation-result evaluation loss L_seg is calculated for the foreground category as follows:
wherein the predicted-probability term denotes, for each position of the input image, the probability of being predicted as foreground; c_fg is the foreground category label; N denotes the product of h and w;
Using the second-stage prediction probability map and the label image M, the second-stage segmentation-result evaluation loss L_t-s is calculated for the foreground category as follows:
wherein the predicted-probability term denotes, for each position of the input image, the probability of being predicted as foreground in the second-stage prediction result;
The evaluation loss is calculated as follows:
L_eval = L_seg + L_t-s (6)
step 3-2: multi-category loss;
The multi-class loss L_mult is calculated as follows:
wherein the pseudo label is obtained by applying the argmax function to the preliminary prediction probability map P; the multi-class prediction probability map is obtained from the feature map F through a convolution operation followed by up-sampling; the corresponding probability term denotes the probability that each position of the input image is predicted as class cl in the multi-class prediction result;
step 3-3: background hiding class loss functions;
A constraint loss is calculated for the background region of the input image: using the label image M and the prediction probability map P_final in a cross-entropy formula, the false-positive rate of the background region, i.e., the background entropy loss Entropy_bg, is computed; Entropy_bg describes the probability that the background region is not mispredicted as foreground, calculated as follows:
To prevent background regions from being predicted as foreground, the background entropy value is increased, reducing the probability that hidden classes in the background region are mispredicted; the background entropy loss Entropy_bg enters the loss constraint L_blr as follows:
wherein lambda is a background optimization weight parameter;
step 3-4: overall loss function:
Loss = L_eval + L_blr + α × L_mult (10)
wherein alpha is a multi-class constraint weight parameter, and the value range is between 0 and 1.
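The overall objective of step 3-4 can be sketched as a plain weighted sum combining eq. (6) and eq. (10); the individual loss values passed in below are placeholders:

```python
# Overall loss (eq. 10): Loss = L_eval + L_blr + alpha * L_mult,
# where L_eval = L_seg + L_t-s (eq. 6). alpha in (0, 1) weights the
# multi-class constraint; lambda enters separately through L_blr (eq. 9).
def overall_loss(L_seg, L_ts, L_blr, L_mult, alpha=0.5):
    L_eval = L_seg + L_ts  # evaluation loss, eq. (6)
    return L_eval + L_blr + alpha * L_mult
```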
Preferably, in the step 1-1, the training image and the corresponding label image pair are uniformly cut to a fixed size of 512×512.
Preferably, the lambda value ranges between 1 and 2.
The beneficial effects of the invention are as follows:
1. Small-sample foreground/background segmentation is extended to full-scene multi-category semantic segmentation. The prototype queue proposed by the invention can be used to update and store prototypes of different categories and to guide multi-category segmentation. Unlike prior methods that only apply to simple scenes, the method can parse multi-category scenes.
2. The multi-category segmentation effect is better, and multi-category segmentation can be realized from single-category label inputs. The multi-category guidance branch designed by the invention adopts the preliminary multi-category segmentation result as a pseudo label, in place of single-category labels, to guide the model to learn multi-category features, thereby achieving a better multi-category segmentation effect.
3. Segmentation robustness is stronger in the absence of sample labeling. The invention is based on small sample learning and metric learning, extracts image features and maps to feature metric space. The pixel-level multi-category segmentation is completed in a measurement mode, dependence on model parameters is reduced, generalization is improved, better segmentation effect is achieved by using fewer labeling samples, and robustness is higher in environments where sample labeling is lacking.
4. The segmentation results achieve higher accuracy and mean intersection-over-union. The background hidden-class optimization module and the two-stage segmentation module proposed by the invention further refine the segmentation result and help the model framework parse the scene better.
5. The technology has more practical and industrial value. The invention promotes the small sample segmentation to more practical multi-category semantic segmentation, can meet the industrial requirements of city planning, accurate agriculture, automatic driving and the like, only needs fewer labeling samples, reduces the labeling cost, and is more suitable for practical application scenes.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a graph of semantic segmentation results generated by the method of the present invention and the comparison method.
Detailed Description
The invention will be further described with reference to the drawings and examples.
The invention discloses a full-scene semantic segmentation framework based on prototype queue learning under few labeled samples, which mainly addresses the problems of multi-category semantic segmentation and potential background classes in small-sample semantic segmentation. Specifically, the invention aims to solve the following problems:
1. The existing small-sample semantic segmentation technology only separates foreground from background without parsing the background of complex scenes; the invention realizes more practical multi-category small-sample semantic segmentation.
2. The prior art lacks the full utilization of potential new class information contained in the background class in the training samples.
3. The prior art adopts the mask average pooling method for extracting semantic category feature prototypes, and local detail information is easy to lose.
A full-scene semantic segmentation method based on prototype queue learning under few labeled samples comprises the following steps:
step 1: prototype queue segmentation;
step 1-1: uniformly cutting the training image and the corresponding label image pair into fixed sizes; establishing an empty prototype queue;
step 1-2: taking a training image as input data, and generating a feature map F through a feature extractor;
step 1-3: perform masked average pooling on the feature map F using the label image M to generate a foreground prototype p_c and a background prototype p_bg:
p_c = Σ_{x,y} F(x,y)·1[M(x,y)=c] / Σ_{x,y} 1[M(x,y)=c] (1)
p_bg = Σ_{x,y} F(x,y)·1[M(x,y)∉C] / Σ_{x,y} 1[M(x,y)∉C] (2)
wherein (x,y) denotes pixel coordinates, C is the foreground class set, c is a foreground category in the image, and h and w are the height and width of the input image, over which the sums range;
step 1-4: store the foreground prototype p_c and the background prototype p_bg into the prototype queue; the prototype queue holds multiple foreground categories but only one background category;
step 1-5: repeat steps 1-2 to 1-4 to traverse all training images and their corresponding label images; when a foreground or background prototype is stored into the prototype queue, if a prototype of the same category already exists in the queue, the newly generated prototype covers (replaces) it;
step 1-6: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue and each pixel position in the feature map F to obtain a preliminary prediction probability map P; concatenate P with F and apply a convolution to obtain a new prediction probability map P_final, calculated as follows:
P_final = Conv(Concat(F, P)) (3)
the prediction probability map P_final is the preliminary predicted segmentation result;
step 2: second-stage segmentation constraint;
step 2-1: apply the argmax function to the prediction probability map P_final to obtain a segmentation-result mask label, and binarize it by uniformly relabeling all non-foreground categories as the background category, yielding a mask label that contains only foreground and background categories;
step 2-2: carrying out mask average pooling on the feature map F by using a mask label to generate a foreground prototype and a background prototype of the second stage;
step 2-3: storing the foreground prototype and the background prototype of the second stage into a prototype queue, and covering the foreground prototype or the background prototype in the prototype queue if the foreground prototype or the background prototype of the same class exists in the prototype queue;
step 2-4: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue obtained in step 2-3 and each pixel position in the feature map F to obtain the second-stage prediction probability map; the second-stage prediction probability map is the segmentation result of the second stage;
step 3: training according to the overall loss function to obtain a final segmentation model;
step 3-1: evaluating the loss;
Using the prediction probability map P_final and the label image M, the preliminary segmentation-result evaluation loss L_seg is calculated for the foreground category as follows:
wherein the predicted-probability term denotes, for each position of the input image, the probability of being predicted as foreground, and c_fg is the foreground category label;
Using the second-stage prediction probability map and the label image M, the second-stage segmentation-result evaluation loss L_t-s is calculated for the foreground category as follows:
The evaluation loss is calculated as follows:
L_eval = L_seg + L_t-s (6)
step 3-2: multi-category loss;
The multi-class loss L_mult is calculated as follows:
wherein the pseudo label is obtained by applying the argmax function to the preliminary prediction probability map P; the multi-class prediction probability map is obtained from the feature map F through a convolution operation followed by up-sampling;
step 3-3: background hiding class loss functions;
A constraint loss is calculated for the background region of the input image: using the label image M and the prediction probability map P_final in a cross-entropy formula, the false-positive rate of the background region, i.e., the background entropy loss Entropy_bg, is computed; Entropy_bg describes the probability that the background region is not mispredicted as foreground, calculated as follows:
To prevent background regions from being predicted as foreground, the background entropy value is increased, reducing the probability that hidden classes in the background region are mispredicted; the background entropy loss Entropy_bg enters the loss constraint L_blr as follows:
wherein lambda is a background optimization weight parameter;
step 3-4: overall loss function:
Loss = L_eval + L_blr + α × L_mult (10)
wherein alpha is a multi-class constraint weight parameter, and the value range is between 0 and 1.
Specific examples:
1. simulation conditions
The simulation was performed using PyTorch on a Linux operating system with an Intel(R) Xeon(R) Silver 4110 CPU @ 2.10 GHz and 40 GB of memory. The data used in the simulation are public datasets.
2. Simulation content
The data used in the simulation come from the UDD and Vaihingen datasets. The UDD dataset contains 141 RGB pictures taken by a drone, covering six categories, cropped into 2439 image blocks of 720 × 720 pixels. The Vaihingen dataset is an aerial photography dataset published by ISPRS, consisting of 33 RGB pictures covering six categories, cropped into 426 image blocks of 512 × 512 pixels. For each category, five pictures and their corresponding class labels are selected as the small-sample training set, and the remaining pictures are used for testing. To ensure fairness, the training samples are randomly selected five times, and the reported test metric is the average over the five runs.
To demonstrate the effectiveness of the algorithm, PANet, HRNet, and HRNet+ were selected for comparison on both datasets. PANet, from the literature "Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, and Jiashi Feng. PANet: Few-shot image semantic segmentation with prototype alignment. In IEEE International Conference on Computer Vision, 2019, pp. 9197-9206," is a classical small-sample semantic segmentation algorithm. HRNet, proposed in the literature "Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep high-resolution representation learning for human pose estimation. In IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 5693-5703," is a classical semantic segmentation algorithm, used here to verify the effect of the fine-tuning approach on multi-category small-sample segmentation tasks. HRNet+ is a model obtained by taking HRNet as the feature extractor and adopting the metric-based small-sample segmentation experimental method; it is the base network of the invention. PQLNet is the method proposed in the invention; OA and mIoU are evaluation metrics for small-sample semantic segmentation quality. The comparison results are shown in Table 1:
table 1 comparative results
It can be seen from Table 1 that the present invention is superior to other algorithms in terms of OA and mIoU metrics over the UDD dataset and Vaihingen dataset.
FIG. 2 shows the semantic segmentation results generated by the method of the invention and the comparison algorithms. Compared with the comparison algorithms, the method of the invention produces more accurate multi-category segmentation edges, demonstrating that it effectively utilizes joint multi-category information and increases the separability of features of different categories. In addition, the invention also removes speckle and refines edges, demonstrating the effect of the background hidden-class distribution optimization and the two-stage segmentation module.
Claims (3)
1. A full-scene semantic segmentation method based on prototype queue learning under few labeled samples, characterized by comprising the following steps:
step 1: prototype queue segmentation;
step 1-1: uniformly cutting the training image and the corresponding label image pair into fixed sizes; establishing an empty prototype queue;
step 1-2: taking a training image as input data, and generating a feature map F through a feature extractor;
step 1-3: perform masked average pooling on the feature map F using the label image M to generate a foreground prototype p_c and a background prototype p_bg:
p_c = Σ_{x,y} F(x,y)·1[M(x,y)=c] / Σ_{x,y} 1[M(x,y)=c] (1)
p_bg = Σ_{x,y} F(x,y)·1[M(x,y)∉C] / Σ_{x,y} 1[M(x,y)∉C] (2)
wherein (x,y) denotes pixel coordinates, and 1[·] denotes the indicator function, i.e., its value is 1 when the expression in brackets holds and 0 otherwise; C is the foreground class set, c is a foreground category in the image, and h and w are the height and width of the input image, over which the sums range;
step 1-4: store the foreground prototype p_c and the background prototype p_bg into the prototype queue; the prototype queue holds multiple foreground categories but only one background category;
step 1-5: repeat steps 1-2 to 1-4 to traverse all training images and their corresponding label images; when a foreground or background prototype is stored into the prototype queue, if a prototype of the same category already exists in the queue, the newly generated prototype covers (replaces) it;
step 1-6: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue and each pixel position in the feature map F to obtain a preliminary prediction probability map P; concatenate P with F and apply a convolution to obtain a new prediction probability map P_final, calculated as follows:
P_final = Conv(Concat(F, P)) (3)
the prediction probability map P_final is the preliminary predicted segmentation result;
step 2: second-stage segmentation constraint;
step 2-1: apply the argmax function to the prediction probability map P_final to obtain a segmentation-result mask label, and binarize it by uniformly relabeling all non-foreground categories as the background category, yielding a mask label that contains only foreground and background categories;
step 2-2: carrying out mask average pooling on the feature map F by using a mask label to generate a foreground prototype and a background prototype of the second stage;
step 2-3: storing the foreground prototype and the background prototype of the second stage into a prototype queue, and covering the foreground prototype or the background prototype in the prototype queue if the foreground prototype or the background prototype of the same class exists in the prototype queue;
step 2-4: calculate the cosine distances between the foreground and background prototypes of the different categories in the prototype queue obtained in step 2-3 and each pixel position in the feature map F to obtain the second-stage prediction probability map; the second-stage prediction probability map is the final segmentation result;
step 3: training according to the overall loss function to obtain a final segmentation model;
step 3-1: evaluation loss;
the preliminary segmentation evaluation loss for the foreground category is calculated from the prediction probability map P_final and the label image M as follows:
L_seg = -(1/N) Σ_(x,y) 1[M_(x,y) = c_fg] · log P_final^fg(x,y)   (4)
wherein P_final^fg(x,y) is the probability that position (x,y) of the input image is predicted as foreground, c_fg is the foreground category label, and N represents the product of h and w;
the second-stage segmentation evaluation loss for the foreground category is calculated from the second-stage prediction probability map P̃ and the label image M as follows:
L_t-s = -(1/N) Σ_(x,y) 1[M_(x,y) = c_fg] · log P̃^fg(x,y)   (5)
wherein P̃^fg(x,y) represents the probability that position (x,y) of the input image is predicted as foreground in the second-stage prediction result;
the evaluation loss is calculated as follows:
L_eval = L_seg + L_t-s   (6)
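Assuming Eqs. (4) and (5) take the form of a foreground cross-entropy averaged over all N pixels, a toy sketch of one such term is (function and variable names are illustrative):

```python
import numpy as np

# Illustrative sketch of a foreground cross-entropy term: the negative
# log foreground probability, summed over pixels labeled foreground and
# averaged over all N pixels (an assumed form of Eqs. (4)-(5)).
def foreground_ce(P_fg, M, c_fg=1):
    # P_fg: h x w predicted foreground probabilities; M: h x w label image
    N = M.size
    fg = (M == c_fg)
    return -np.sum(np.log(P_fg[fg] + 1e-8)) / N

P_fg = np.array([[0.9, 0.1], [0.8, 0.2]])
M = np.array([[1, 0], [1, 0]])
loss = foreground_ce(P_fg, M)
```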
step 3-2: multi-class loss;
the multi-class loss L_mult is calculated as follows:
L_mult = -(1/N) Σ_(x,y) Σ_cl 1[M̂_(x,y) = cl] · log P_mult^cl(x,y)   (7)
wherein the pseudo label M̂ is obtained by applying the argmax function to the preliminary prediction probability map P; the multi-class prediction probability map P_mult is obtained from the feature map F by convolution and upsampling; and P_mult^cl(x,y) represents the probability that position (x,y) of the input image is predicted as class cl in the multi-class prediction result;
step 3-3: background hiding class loss functions;
constraint loss is calculated for the background region of the input image, and the label image M and the predictive probability map P are utilized through a cross entropy formula final Calculating the false positive rate of the background area, namely the background Entropy loss Entropy bg Background Entropy loss Entropy bg Describing the probability that the background region is not mispredicted as foreground, the following is calculated:
to prevent background regions from being predicted as foreground, increasing the background Entropy value, reducing the probability that hidden classes of the background regions are mispredicted, losing the background Entropy Entropy bg The loss-in constraint is as follows:
wherein lambda is a background optimization weight parameter;
step 3-4: overall loss function:
Loss = L_eval + L_blr + α × L_mult   (10)
wherein α is a multi-class constraint weight parameter with a value range between 0 and 1.
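The combination in Eq. (10) can be sketched with placeholder scalar components (toy values only; the inputs stand in for the L_eval, L_blr, and L_mult terms defined above):

```python
# Illustrative sketch of the overall loss of Eq. (10); the inputs are
# placeholder scalars, not the actual loss computations.
def total_loss(L_eval, L_blr, L_mult, alpha=0.5):
    # alpha is the multi-class constraint weight, constrained to [0, 1].
    assert 0.0 <= alpha <= 1.0
    return L_eval + L_blr + alpha * L_mult

loss = total_loss(L_eval=0.4, L_blr=0.1, L_mult=0.2, alpha=0.5)
```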
2. The full-scene semantic segmentation method based on prototype queue learning under few labeling samples according to claim 1, wherein in step 1-1 the training image and its corresponding label image pair are uniformly cropped to a fixed size of 512×512.
3. The full-scene semantic segmentation method based on prototype queue learning under few labeling samples according to claim 1, wherein the value range of λ is between 1 and 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210390663.3A CN114943834B (en) | 2022-04-14 | 2022-04-14 | Full-scene semantic segmentation method based on prototype queue learning under few labeling samples |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114943834A CN114943834A (en) | 2022-08-26 |
CN114943834B true CN114943834B (en) | 2024-02-23 |
Family
ID=82907661
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210390663.3A Active CN114943834B (en) | 2022-04-14 | 2022-04-14 | Full-scene semantic segmentation method based on prototype queue learning under few labeling samples
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114943834B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117422879B (en) * | 2023-12-14 | 2024-03-08 | 山东大学 | Prototype evolution small sample semantic segmentation method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112150471A (en) * | 2020-09-23 | 2020-12-29 | 创新奇智(上海)科技有限公司 | Semantic segmentation method and device based on few samples, electronic equipment and storage medium |
RU2742701C1 (en) * | 2020-06-18 | 2021-02-09 | Самсунг Электроникс Ко., Лтд. | Method for interactive segmentation of object on image and electronic computing device for realizing said object |
CN114049384A (en) * | 2021-11-09 | 2022-02-15 | 北京字节跳动网络技术有限公司 | Method and device for generating video from image and electronic equipment |
Non-Patent Citations (1)
Title |
---|
Semantic segmentation combining contextual features with CNN multi-layer feature fusion; Luo Huilan, Zhang Yun; Journal of Image and Graphics; 2019-12-31 (No. 12); pp. 2200-2209 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||