CN110837836A - Semi-supervised semantic segmentation method based on maximized confidence - Google Patents
- Publication number: CN110837836A
- Application number: CN201911071629.4A
- Authority: CN (China)
- Prior art keywords: loss, class probability, image, network, segmentation
- Prior art date: 2019-11-05
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a semi-supervised semantic segmentation method based on maximized confidence, which comprises the following steps: selecting a part of the images in an existing training data set as labeled images and using the remaining images as unlabeled images; constructing a network model and predicting class probability maps for the labeled and unlabeled images with the segmentation network in the network model; maximizing the confidence of the labeled-image prediction class probability map by combining supervised learning with generative adversarial training; predicting segmentation error regions in the unlabeled-image prediction class probability map by unsupervised learning; training the network model with the combined supervised and unsupervised losses; and, in the testing stage, inputting the unlabeled image to be segmented into the trained network model to obtain the segmented semantic image. The scheme of the embodiment of the invention enables accurate semantic segmentation of unlabeled images.
Description
Technical Field
The invention relates to the field of image semantic segmentation, in particular to a semi-supervised semantic segmentation method based on maximized confidence.
Background
Image segmentation divides an image into several mutually disjoint regions according to features such as gray scale, color, spatial texture and geometric shape, so that those features show consistency or similarity within the same region and differ obviously between different regions. In short, different objects in an image are separated from the background, and the segmentation result makes clear which object has been segmented. Overall, semantic segmentation is a highly difficult task aimed at scene understanding. Scene understanding is a core problem of computer vision and is widely applied in today's information society; applications include autonomous driving, human-computer interaction, computational photography, image search engines and augmented reality. A variety of computer vision and machine learning methods have been tried on these problems.
Recently, methods using convolutional neural networks have achieved state-of-the-art performance in image semantic segmentation, for example PSPNet (pyramid scene parsing network) and FCN (fully convolutional network). These methods extract neural-network features from models trained on large-scale data sets with pixel-level annotations. However, annotating accurate pixel-level labels on large-scale data is very time consuming, labor intensive and inefficient. To reduce the need for accurate pixel-level annotation data sets, unsupervised learning would seem the more suitable approach; to date, however, unsupervised learning approaches have not succeeded, because they lack the detailed information the semantic segmentation task requires. Weakly supervised and semi-supervised learning methods have therefore been proposed for semantic segmentation. These methods typically use unlabeled or weakly labeled data, sometimes together with additional fully annotated data to improve performance. Weakly labeled images may be partially annotated within some limited scope, for example with image-level annotations, box annotations or scribble annotations. However, these approaches have non-negligible drawbacks:
1) Due to the lack of detailed boundary location information, weakly supervised approaches perform far worse than fully supervised ones.
2) Some semi-supervised learning methods use unlabeled data inefficiently because they ignore the large amount of available misclassification information.
Disclosure of Invention
The invention aims to provide a semi-supervised semantic segmentation method based on maximized confidence, which can accurately perform semantic segmentation on unlabeled images.
The purpose of the invention is realized by the following technical scheme:
a semi-supervised semantic segmentation method based on maximized confidence includes:
constructing a training data set by using the marked images and the unmarked images in a specified proportion;
constructing a network model, and predicting a prediction class probability map of a marked image and an unmarked image through a segmentation network in the network model; adopting a supervised learning mode to maximize the confidence of the labeled image prediction class probability map; predicting a segmentation error region in the unmarked image prediction class probability map by adopting an unsupervised learning mode;
training the network model by combining the loss of supervised learning and the loss of unsupervised learning to obtain a trained segmented network;
and in the testing stage, inputting the unmarked image to be segmented into the trained segmentation network model, obtaining a predicted class probability map, and searching the index of the maximum value in the channel dimension in the predicted class probability map to obtain a segmented semantic image.
According to the technical scheme provided by the invention, the method improves the accuracy of semantic segmentation by enhancing the confidence of the class probability map and focusing on misclassified regions, and it studies the data distribution of unlabeled data through the segmentation network so as to generate more reliable predictions for unlabeled images.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic diagram illustrating information entropy comparison between a verification set and a training set according to an embodiment of the present invention;
FIG. 2 is a flowchart of a semi-supervised semantic segmentation method based on maximized confidence level according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a tag error map and a predicted segmentation error map according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the performance of the schemes participating in the comparative experiments on the PASCAL VOC2012 validation set provided by the embodiment of the present invention;
Fig. 6 is a schematic diagram of the performance of the schemes participating in the comparative experiments on the PASCAL-CONTEXT validation set according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a semi-supervised semantic segmentation method based on maximized confidence. It offers a semi-supervised learning framework that combines supervised learning and unsupervised learning and approaches the problem from the perspective of enhancing the confidence of the class probability map. At the same time, greater attention is paid to misclassified regions, in particular boundary regions. In addition, the data distribution of the unlabeled data is studied through the segmentation network to produce more reliable predictions for unlabeled images.
In the embodiment of the invention, a generative adversarial framework is constructed for the labeled images. The segmentation network is treated as a generator that takes an image as input and outputs a prediction class probability map. The recognizer is built in a fully convolutional manner and is used to distinguish whether its input is the prediction class probability map of a labeled image or the class probability map consisting of 0s and 1s generated from the label map. The generator and recognizer compete with each other, with the goal of maximizing the confidence of the prediction class probability map (i.e., the confidence of the segmentation network). For unlabeled data, a segmentation network trained on labeled images achieves, with the help of adversarial learning, high confidence for correctly classified pixels; classified pixels with high uncertainty are therefore considered segmentation-error pixels. Next, the information entropy of the segmentation probability map is calculated to infer a segmentation error map. When the information entropy of a pixel is maximal, its predicted class probabilities approach a uniform distribution, indicating that the learned features cannot classify the pixel and that the weights of the model should be optimized to obtain more representative features.
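This last point follows from a standard bound on the entropy (a general fact about discrete distributions, not specific to the patent text): for a pixel with predicted class probabilities $p_1,\dots,p_{C_1}$,

$$-\sum_{c=1}^{C_1} p_c \log p_c \;\le\; \log C_1,$$

with equality if and only if $p_c = 1/C_1$ for every class $c$, i.e. exactly when the prediction is the uniform distribution.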
Part (a) of Fig. 1 shows the information entropy on the validation set and part (b) the information entropy on the training set; the entropy on the validation set is clearly greater, which indicates that the segmentation network is less trustworthy when predicting images it was not trained on, particularly in boundary regions. In this work, the mean information entropy in the misclassified regions of unlabeled data is calculated and used as an additional supervision signal to optimize the segmentation network. The method therefore focuses more on misclassified regions, especially boundary regions, and the segmentation network studies the data distribution of the unlabeled data to produce more reliable predictions for unlabeled images.
Fig. 2 is a flowchart of the semi-supervised semantic segmentation method based on maximized confidence provided in the embodiment of the present invention. It mainly comprises the following steps.
Step 1, constructing a training data set with labeled images and unlabeled images in a specified proportion.
Typically, the training data set is constructed using a small number of labeled images and a large number of unlabeled images, which may come from an existing training data set. The specific proportion of labeled to unlabeled images can be set by the user according to the actual situation.
Illustratively, two challenging data sets may be chosen: PASCAL VOC2012 and PASCAL-CONTEXT. The PASCAL VOC2012 data set includes 20 foreground object classes and a background class, with 1464, 1456 and 1449 pixel-level annotated images for training, testing and validation, respectively; in addition, 10582 training images are obtained by augmenting the data set with extra annotated images from the Semantic Boundaries Dataset (SBD). The PASCAL-CONTEXT data set provides detailed pixel-level annotations for both objects (e.g., cars) and stuff (e.g., sky); the invention evaluates the 59 most frequent classes plus one background class in this data set, giving 4998 training images. Finally, 10%, 30% and 50% of the images are randomly drawn from the training data set as labeled images, and the remaining data are used as unlabeled images. It should be noted that the data sets and the ratios of labeled to unlabeled images mentioned here are examples and are not intended to be limiting.
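As a minimal sketch of the random draw just described (the function name, the fixed seed and the use of integer image ids are illustrative assumptions, not taken from the patent):

```python
import random

def split_dataset(image_ids, labeled_ratio, seed=0):
    """Randomly keep a fraction of the training images as labeled;
    the rest are treated as unlabeled (their annotations are ignored)."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n_labeled = int(len(ids) * labeled_ratio)
    return ids[:n_labeled], ids[n_labeled:]

# e.g. 10% of the 10582 augmented PASCAL VOC2012 training images as labeled
labeled_ids, unlabeled_ids = split_dataset(range(10582), labeled_ratio=0.1)
```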
Step 2, constructing a network model, and predicting a prediction class probability map of a marked image and an unmarked image through a segmentation network in the network model; adopting a supervised learning mode to maximize the confidence of the labeled image prediction class probability map; and predicting the segmentation error region in the unmarked image prediction class probability map by adopting an unsupervised learning mode.
Fig. 3 is a schematic diagram of the constructed network model. The network model is divided into a main network and a sub-network. The main network is the segmentation network, i.e. the generator in the generative adversarial network; its inputs are the labeled and unlabeled images and its output is a prediction class probability map. The sub-network is a fully convolutional recognizer, i.e. the recognizer in the generative adversarial network; it takes as input either a prediction class probability map output by the segmentation network or a class probability map consisting of 0s and 1s generated from the label map, and it outputs a two-channel class probability map that distinguishes which of the two its input is.
The segmentation network and the fully convolutional neural network are described below.
1) Segmentation network: in the embodiment of the invention, a Deeplab-v2 model pre-trained on the MSCOCO and ImageNet data sets serves as the baseline network. However, to simplify the experiments and reduce memory consumption, only the ASPP (atrous spatial pyramid pooling) output layer is retained, without using conditional random fields (CRF) or multi-scale input fusion. To match the size of the input image, an upsampling layer and a Softmax function are applied to predict the final class probability map.
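In code, the prediction head just described might look as follows. This is a sketch that assumes the DeepLab-v2 backbone is available and exposes its ASPP output as a logits tensor; TensorFlow is used because the embodiment below states the network is implemented with it.

```python
import tensorflow as tf

def prediction_head(aspp_logits, image_height, image_width):
    """Bilinearly upsample the ASPP output logits to the input-image size,
    then apply Softmax over the channel (class) dimension to obtain the
    final prediction class probability map."""
    logits = tf.image.resize(aspp_logits, (image_height, image_width),
                             method='bilinear')
    return tf.nn.softmax(logits, axis=-1)
```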
2) Fully convolutional neural network: in the embodiment of the invention, the fully convolutional neural network has two kinds of input: a class probability map generated from the label map by downsampling and one-hot encoding, and a prediction class probability map generated by passing a labeled image through the segmentation network and Softmax. The fully convolutional neural network serving as the recognizer consists of 5 dilated convolution layers with 3 × 3 kernels, stride 1 and {64, 128, 256, 512, 2} channels, with the dilation rates set to {1, 1, 2, 4, 1} in the respective layers. Each dilated convolution layer except the last is followed by a ReLU activation function.
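A sketch of this recognizer under the stated configuration; the 'same' padding and the Keras functional style are assumptions not fixed by the text:

```python
import tensorflow as tf

def build_recognizer(num_classes):
    """Fully convolutional recognizer: input is a class probability map with
    num_classes channels; output is a two-channel map distinguishing
    predicted maps from label-generated maps."""
    inputs = tf.keras.Input(shape=(None, None, num_classes))
    x = inputs
    # 4 dilated 3x3 conv layers with ReLU, channels {64,128,256,512},
    # dilation rates {1,1,2,4}, stride 1.
    for filters, rate in zip((64, 128, 256, 512), (1, 1, 2, 4)):
        x = tf.keras.layers.Conv2D(filters, 3, strides=1, padding='same',
                                   dilation_rate=rate, activation='relu')(x)
    # Last layer: 2 channels, dilation rate 1, no ReLU.
    outputs = tf.keras.layers.Conv2D(2, 3, strides=1, padding='same')(x)
    return tf.keras.Model(inputs, outputs)
```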
It should be noted that the structural forms and parameter values mentioned in the above descriptions of the segmentation network and the fully convolutional neural network are only examples and are not limiting.
The supervised learning mode, the unsupervised learning mode and the related loss functions are described below.
1) Supervised learning mode and associated loss functions.
Supervised learning has two main goals: the first is the basic task of assigning a semantic label to each pixel; the second is to maximize the confidence of the prediction class probability map by generative adversarial training. To this end, a generative adversarial framework is constructed in which the generator is the segmentation network and the recognizer is the fully convolutional neural network. In the generative adversarial network, the segmentation network acts as the generator and predicts the prediction class probability map of a labeled image; the fully convolutional neural network acts as the recognizer, whose input is either the labeled-image prediction class probability map or the class probability map consisting of 0s and 1s generated from the label map by downsampling and one-hot encoding, and which identifies which type of input it received. The generator and recognizer compete against each other with the goal of maximizing the confidence of the prediction class probability map.
In the generator network, spatial multi-class cross entropy penalties are used to force the segmentation network to independently predict the correct semantic label class at each pixel location, expressed as:
$$\mathrm{loss}_{mce}=-\sum_{h_1,w_1}\sum_{c_1\in C_1}y_n^{(h_1,w_1,c_1)}\log S(x_n)^{(h_1,w_1,c_1)}$$

where x_n is the labeled image input to the segmentation network; y_n is the one-hot encoding of the label map of the corresponding labeled image; (h_1, w_1, c_1) are the position coordinates of a pixel in the map; the prediction class probability map has size H_1 × W_1 × C_1, where H_1 and W_1 denote the height and width of the image and C_1 the number of categories (channels); and S(x_n) is the prediction class probability map predicted by the segmentation network for the labeled image x_n.
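Transcribed directly into code (a sketch: the summation over pixels and classes follows the formula above; averaging over the batch dimension is an added assumption):

```python
import tensorflow as tf

def loss_mce(y_onehot, probs, eps=1e-10):
    """Spatial multi-class cross-entropy between the one-hot label map
    y_onehot and the prediction class probability map probs = S(x_n),
    both of shape (batch, H1, W1, C1)."""
    per_image = -tf.reduce_sum(y_onehot * tf.math.log(probs + eps),
                               axis=[1, 2, 3])  # sum over h1, w1, c1
    return tf.reduce_mean(per_image)            # average over the batch
```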
In the recognizer, a spatial binary cross-entropy loss is used to distinguish whether the input is a predicted labeled-image prediction class probability map or a class probability map generated from the label map, the spatial binary cross-entropy loss being expressed as:

$$\mathrm{loss}_{bce}=-\sum_{h_2,w_2}\sum_{c_2\in C_2}Y_n^{(h_2,w_2,c_2)}\log D(p_n)^{(h_2,w_2,c_2)}$$

$$Y_n=\mathrm{one\_hot}\big(\mathrm{ones}(H_2,W_2)\times SG\big)$$

where p_n denotes either a predicted labeled-image prediction class probability map or a class probability map generated from the label map; D(·) denotes the recognizer; Y_n is an annotation used to distinguish the input source; C_2 = 2 because the recognizer is a binary classification network; one_hot(·) is the one-hot encoding function; ones(H_2, W_2) generates a matrix of size H_2 × W_2, where H_2 and W_2 denote its numbers of rows and columns and all its elements equal 1; SG = 0 indicates that the recognizer input is a predicted labeled-image prediction class probability map; SG = 1 indicates that the recognizer input is the class probability map consisting of 0s and 1s generated from the label map. This spatial binary cross-entropy loss is mainly used to train the recognizer.
Adding an adversarial loss to the segmentation network encourages it to push the predicted class probabilities toward 1. The adversarial loss can be written as follows:

$$\mathrm{loss}_{adv}=-\sum_{h_2,w_2}\log D\big(S(x_n)\big)^{(h_2,w_2,1)}$$

In the embodiment of the invention, loss_adv is calculated when the input to the recognizer comes from the segmentation network; in addition, to confuse the recognizer, SG is set to 1.
2) Unsupervised learning mode and associated loss function.
The information entropy of the unlabeled-image class probability map characterizes the uncertainty of the image's segmentation result, which is closely related to the image's segmentation error map. The invention therefore uses the information entropy of the predicted class probability map to infer the segmentation error map; Fig. 4 shows a label error map and a predicted segmentation error map. Misclassified pixels are mainly located around boundaries, which means the segmentation error map contains rich classification information, especially in boundary regions. In Fig. 4, part (a) shows the original image, part (b) the error map label, and part (c) the predicted segmentation error map; the images in Fig. 4 are from the PASCAL VOC2012 data set. After the segmentation error map is obtained, the mean information entropy in the misclassified region is calculated as the unsupervised loss.
Given an unlabeled image x_n' of size H_1 × W_1 × 3, the segmentation network predicts the prediction class probability map S(x_n'), and the information entropy map H(x_n') is calculated as:

$$H(x_n')^{(h_1,w_1)}=-\mathbb{E}_{c_1}\left[\log S(x_n')^{(h_1,w_1,c_1)}\right]=-\sum_{c_1\in C_1}S(x_n')^{(h_1,w_1,c_1)}\log S(x_n')^{(h_1,w_1,c_1)}$$

where E[·] denotes the expectation over all C_1 categories.

The information entropy indicates the uncertainty of the segmentation network's prediction. Given an uncertainty threshold T, a binary map representing the segmentation error map EM(x_n') is obtained:

$$EM(x_n')^{(h_1,w_1)}=\begin{cases}1,& H(x_n')^{(h_1,w_1)}>T\\0,&\text{otherwise}\end{cases},\qquad h_1\in H_1,\ w_1\in W_1$$

where (h_1, w_1) are the position coordinates of a pixel; that is, EM(x_n') is obtained by checking whether each value of the information entropy map exceeds the threshold.
Using the information entropy map H(x_n') and the segmentation error map EM(x_n'), the unsupervised loss fed back to the segmentation network is:

$$\mathrm{loss}_{inf}=\frac{\sum_{h_1,w_1}EM(x_n')^{(h_1,w_1)}\,H(x_n')^{(h_1,w_1)}}{\sum_{h_1,w_1}EM(x_n')^{(h_1,w_1)}}$$
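A sketch of this unsupervised branch, with T = 0.2 as given later in the text; whether the entropy is normalized before thresholding is not stated, so raw entropy is used here as an assumption:

```python
import tensorflow as tf

def entropy_map(probs, eps=1e-10):
    """Per-pixel information entropy H(x_n') of the prediction class
    probability map, shape (batch, H1, W1, C1) -> (batch, H1, W1)."""
    return -tf.reduce_sum(probs * tf.math.log(probs + eps), axis=-1)

def loss_inf(probs, threshold=0.2, eps=1e-10):
    """Mean entropy over the inferred segmentation error region EM(x_n'):
    the pixels whose entropy exceeds the uncertainty threshold T."""
    h = entropy_map(probs)
    em = tf.cast(h > threshold, tf.float32)   # binary segmentation error map
    return tf.reduce_sum(em * h) / (tf.reduce_sum(em) + eps)
```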
In the embodiment of the invention, a mixed loss function is used that combines the spatial multi-class cross-entropy loss, the adversarial loss and the unsupervised loss. The mixed loss is calculated as follows:

$$\mathrm{loss}_{seg}=\mathrm{loss}_{mce}+\lambda_{adv}\,\mathrm{loss}_{adv}+\lambda_{inf}\,\mathrm{loss}_{inf}$$

where loss_mce, loss_adv and loss_inf respectively denote the spatial multi-class cross-entropy loss, the adversarial loss and the unsupervised loss on the unlabeled-image prediction class probability map; λ_adv and λ_inf are two weights that balance the corresponding losses. loss_mce and loss_adv guide the supervised learning, while loss_inf serves as an unsupervised learning signal to study the data distribution of the unlabeled images.
Step 3, training the network model by combining the loss of supervised learning and the loss of unsupervised learning to obtain a trained segmentation network.
The labeled and unlabeled images are combined into batches of a given batch size as input; the hyper-parameters of the network model and the weight-initialization mode are set; with the loss of supervised learning and the loss of unsupervised learning defined above, the segmentation network is trained using stochastic gradient descent (SGD) with a poly learning-rate strategy, the recognizer is trained using the Adam optimizer with an exponential-decay learning-rate strategy, and the trained model weights are saved.
By way of example, some specific settings for network model training are given below:
the proposed network is implemented by a sensor-Flow framework running on a GPU (Tesla V100). The training images obtained in the step 1 are randomly scaled and cut to 321 × 321 pixel Size, and the training models are combined together as input according to the Batch Size (Batch Size) of 10 for 20K iterations. With respect to the hyper-parameter of the proposed method, λadvIs set to 0.02 and lambdainfSet to 0.1. Further, the threshold T for obtaining the segmentation error map is set to 0.2.
When training the segmentation network, stochastic gradient descent (SGD) is applied with a momentum of 0.9 and a weight decay of 5e-4, and the trained model weights are saved.
When training the recognizer, the Adam optimizer is adopted with an initial learning rate of 1e-4, and the trained model weights are saved.
Step 4, in the testing stage, inputting the unlabeled image to be segmented into the trained segmentation network model to obtain a predicted class probability map, and taking the index of the maximum value along the channel dimension of the predicted class probability map to obtain the segmented semantic image.
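The test stage in code form, as a minimal sketch assuming a trained Keras model that outputs the prediction class probability map:

```python
import tensorflow as tf

def segment_image(seg_model, image):
    """Run the trained segmentation network on an unlabeled image and take
    the index of the maximum value along the channel dimension to obtain
    the segmented semantic image."""
    probs = seg_model(image[tf.newaxis, ...], training=False)  # (1, H, W, C1)
    return tf.argmax(probs[0], axis=-1)                        # (H, W) labels
```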
The scheme of the embodiment of the invention has the following beneficial effects:
1) The invention develops a generative adversarial framework that treats the segmentation network as a generator and uses a fully convolutional network as the recognizer. With the help of this generative adversarial framework, the segmentation network can generate class probability maps with higher confidence.
2) The invention provides an unsupervised learning method for studying the data distribution of unlabeled images. In order to focus the unsupervised learning signal on misclassified regions, especially boundary regions, the segmentation error regions of the unlabeled image are predicted instead of potentially reliable regions, and the uncertainty of the predictions on unlabeled images is then minimized.
3) The invention proposes a semi-supervised learning framework that combines supervised learning and unsupervised learning. Experimental results on the PASCAL VOC2012 and PASCAL-CONTEXT data sets show that the proposed semi-supervised learning method is competitive.
In order to demonstrate the performance of the above-described embodiment of the present invention, a comparative experiment was performed as follows.
In the experiments, validation sets are formed by selecting images in a manner similar to step 1. For example, the 1449 images of the standard validation set of the PASCAL VOC2012 data set are used to evaluate the trained network model, and the 5105 images of the standard validation set of the PASCAL-CONTEXT data set are used likewise.
The schemes participating in the comparative experiments are: (1) the baseline network; (2) the baseline network + loss_adv; (3) the baseline network + loss_adv + loss_inf.
Analysis of the experimental results:
1) Results on PASCAL VOC2012. The quantitative results of the method on the PASCAL VOC2012 validation set are shown in Table 1, and qualitative results for some sample images are shown in Fig. 5. In Fig. 5, part (a) shows the original images and part (b) the semantic segmentation labels; parts (c) to (e) correspond to schemes (1) to (3) in order.
Table 1 results on the PASCAL VOC2012 validation set
As shown in Table 1, the adversarial loss loss_adv yields an improvement of 1.1% to 1.4% in mIOU (mean intersection-over-union). This indicates that loss_adv can improve segmentation performance by increasing the confidence of the predictions on labeled images. Adding the unsupervised loss loss_inf, which minimizes the uncertainty of the predictions on unlabeled images, the proposed method achieves an improvement of 1.9% to 2.7% over the baseline network. The qualitative results in Fig. 5 show that the models using the adversarial and unsupervised losses improve on the misclassified regions of the baseline network, particularly in some boundary regions.
2) Results on PASCAL-CONTEXT. The quantitative evaluation on the PASCAL-CONTEXT data set is shown in Table 2, and qualitative results for some sample images are visualized in Fig. 6. In Fig. 6, part (a) shows the original images and part (b) the semantic segmentation labels; parts (c) to (e) correspond to schemes (1) to (3) in order.
Table 2. Results (mIOU, %) on the PASCAL-CONTEXT validation set

Data volume | 10% | 30% | 50% | 100%
---|---|---|---|---
Baseline network | 34.6 | 38.0 | 40.1 | 42.3
Baseline network + loss_adv | 35.1 | 38.7 | 40.8 | 42.9
Baseline network + loss_adv + loss_inf | 35.9 | 39.6 | 41.3 | —
It can be seen that the proposed method remains effective: in this complex scenario it improves the mean intersection-over-union by 1.2% to 1.6%, with the adversarial loss accounting for about 0.5% to 0.7% of the improvement. Performance on the PASCAL-CONTEXT data set is worse than on the PASCAL VOC2012 data set because PASCAL-CONTEXT, which contains both object and stuff annotations, is more complex, so the proposed method cannot infer the segmentation error maps as accurately.
3) Comparison with state-of-the-art methods. The proposed method is first compared with several state-of-the-art weakly supervised methods, all of which use a ResNet-101-based DeepLab-v2 as the baseline network. The weakly supervised methods are trained on the PASCAL VOC2012 data set with image-level annotations, while the proposed method is trained on the same data set with 440 pixel-level annotated images and 10142 unlabeled images. As shown in Table 3, the proposed method reaches an mIOU (mean intersection-over-union) of 68.9%, at least 4.0% better than all the weakly supervised methods. These large improvements can be attributed to the proposed method obtaining more detailed information about boundary regions. Weakly supervised learning methods use image-level annotations directly, which makes locating boundary regions difficult. In contrast, the proposed method first learns to locate boundary regions through adversarial learning with limited pixel-level annotations; it then predicts the segmentation error regions of the unlabeled images, so that the unsupervised learning signal focuses on misclassified regions, especially boundary regions. The proposed method thus achieves more competitive performance than the weakly supervised methods.
Table 3. Results of the inventive method compared with advanced weakly supervised methods on the PASCAL VOC2012 validation set
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A semi-supervised semantic segmentation method based on maximized confidence is characterized by comprising the following steps:
constructing a training data set by using the marked images and the unmarked images in a specified proportion;
constructing a network model, and predicting a prediction class probability map of a marked image and an unmarked image through a segmentation network in the network model; adopting a supervised learning mode to maximize the confidence of the labeled image prediction class probability map; predicting a segmentation error region in the unmarked image prediction class probability map by adopting an unsupervised learning mode;
training the network model by combining the loss of supervised learning and the loss of unsupervised learning to obtain a trained segmented network;
and in the testing stage, inputting the unmarked image to be segmented into the trained segmentation network model, obtaining a predicted class probability map, and searching the index of the maximum value in the channel dimension in the predicted class probability map to obtain a segmented semantic image.
2. The semi-supervised semantic segmentation method based on maximized confidence of claim 1, wherein maximizing the confidence of the labeled-image prediction class probability map by supervised learning comprises:
for the labeled images, adopting a generative adversarial network and maximizing the confidence of the prediction class probability map through generative adversarial training within supervised learning;
the generative adversarial network being composed of the segmentation network and a fully convolutional neural network in the network model;
in the generative adversarial network, the segmentation network serving as the generator to predict the class probability map of the labeled image, and the fully convolutional neural network serving as the recognizer, whose input is either the labeled-image prediction class probability map or the class probability map generated from the label map by downsampling and one-hot encoding, the recognizer identifying the type of the input;
the generator and recognizer compete against each other with the goal of maximizing the confidence of the prediction class probability map.
3. The semi-supervised semantic segmentation method based on maximized confidence as claimed in claim 2, wherein the loss of supervised learning comprises: spatial multi-class cross-entropy loss and adversarial loss;
the spatial multi-class cross-entropy loss is used to make the segmentation network independently predict the correct semantic label class at each pixel position, and is expressed as:

$$\mathrm{loss}_{mce}=-\sum_{h_1,w_1}\sum_{c_1\in C_1}y_n^{(h_1,w_1,c_1)}\log S(x_n)^{(h_1,w_1,c_1)}$$

where x_n is the labeled image input to the segmentation network; y_n is the one-hot encoding of the label map of the corresponding labeled image; (h_1, w_1, c_1) are the position coordinates of a pixel in the map; the prediction class probability map has size H_1 × W_1 × C_1, where H_1 and W_1 denote the height and width of the image and C_1 the number of categories, i.e. the number of channels; and S(x_n) is the prediction class probability map predicted by the segmentation network for the labeled image x_n;
in the recognizer, a spatial binary cross-entropy loss is used to distinguish whether the input is a predicted labeled-image prediction class probability map or a class probability map generated from the label map, the spatial binary cross-entropy loss being expressed as:

$$\mathrm{loss}_{bce}=-\sum_{h_2,w_2}\sum_{c_2\in C_2}Y_n^{(h_2,w_2,c_2)}\log D(p_n)^{(h_2,w_2,c_2)}$$

$$Y_n=\mathrm{one\_hot}\big(\mathrm{ones}(H_2,W_2)\times SG\big)$$

where p_n denotes either a predicted labeled-image prediction class probability map or a class probability map generated from the label map; D(·) denotes the recognizer; Y_n is an annotation used to distinguish the input source; C_2 = 2 because the recognizer is a binary classification network; one_hot(·) is the one-hot encoding function; ones(H_2, W_2) generates a matrix of size H_2 × W_2, where H_2 and W_2 denote its numbers of rows and columns and all its elements equal 1; SG = 0 indicates that the recognizer input is a predicted labeled-image prediction class probability map; SG = 1 indicates that the recognizer input is the class probability map generated from the label map; the spatial binary cross-entropy loss is used for training the recognizer;
the adversarial loss is expressed as:

$$\mathrm{loss}_{adv}=-\sum_{h_2,w_2}\log D\big(S(x_n)\big)^{(h_2,w_2,1)}.$$
4. The semi-supervised semantic segmentation method based on maximized confidence according to claim 1, wherein predicting the segmentation error regions in the class probability map of the unlabeled images by unsupervised learning comprises:
the information entropy of the unlabeled-image class probability map characterizing the uncertainty of the segmentation result of the corresponding image, to which its segmentation error map is related;
inferring a segmentation error map from the information entropy of the predicted class probability map and, after the segmentation error map is obtained, calculating the mean information entropy in the misclassified region as the unsupervised loss;
given an unlabeled image x_n' of size H_1 × W_1 × 3, the segmentation network predicts the prediction class probability map S(x_n') of the unlabeled image x_n', and the information entropy map H(x_n') is calculated as:

$$H(x_n')^{(h_1,w_1)}=-\mathbb{E}_{c_1}\left[\log S(x_n')^{(h_1,w_1,c_1)}\right]$$

where E[·] denotes the expectation over all C_1 categories;

the information entropy indicates the uncertainty of the segmentation network's prediction; given an uncertainty threshold T, a binary map representing the segmentation error map EM(x_n') is obtained:

$$EM(x_n')^{(h_1,w_1)}=\begin{cases}1,& H(x_n')^{(h_1,w_1)}>T\\0,&\text{otherwise}\end{cases},\qquad h_1\in H_1,\ w_1\in W_1$$

where (h_1, w_1) are the position coordinates of a pixel;

using the information entropy map H(x_n') and the segmentation error map EM(x_n'), the unsupervised loss fed back to the segmentation network is:

$$\mathrm{loss}_{inf}=\frac{\sum_{h_1,w_1}EM(x_n')^{(h_1,w_1)}\,H(x_n')^{(h_1,w_1)}}{\sum_{h_1,w_1}EM(x_n')^{(h_1,w_1)}}.$$
5. The semi-supervised semantic segmentation method based on maximized confidence as claimed in claim 1, wherein the supervised learning loss and the unsupervised learning loss constitute the total loss of the network model, expressed as:

$$\mathrm{loss}_{seg}=\mathrm{loss}_{mce}+\lambda_{adv}\,\mathrm{loss}_{adv}+\lambda_{inf}\,\mathrm{loss}_{inf}$$

wherein the supervised learning loss comprises the spatial multi-class cross-entropy loss loss_mce, which makes the segmentation network independently predict the correct semantic label class at each pixel position, and the adversarial loss loss_adv, which maximizes the confidence of the labeled-image prediction class probability map; loss_inf is the unsupervised loss that maximizes the confidence of the unlabeled-image prediction class probability map; λ_adv and λ_inf are two weights that balance the corresponding losses.
6. The semi-supervised semantic segmentation method based on maximized confidence as recited in claim 2, wherein
the labeled images and the unlabeled images are combined into batches of a given batch size as input; the hyper-parameters of the network model and the weight-initialization mode are set; and, with the loss of supervised learning and the loss of unsupervised learning, the segmentation network is trained by using a stochastic gradient descent method and a poly learning-rate strategy, and the recognizer is trained by using an Adam optimizer and an exponential-decay learning-rate strategy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911071629.4A CN110837836B (en) | 2019-11-05 | 2019-11-05 | Semi-supervised semantic segmentation method based on maximized confidence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110837836A true CN110837836A (en) | 2020-02-25 |
CN110837836B CN110837836B (en) | 2022-09-02 |
Family
ID=69576198
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911071629.4A Active CN110837836B (en) | 2019-11-05 | 2019-11-05 | Semi-supervised semantic segmentation method based on maximized confidence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110837836B (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111612010A (en) * | 2020-05-21 | 2020-09-01 | 京东方科技集团股份有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN111651998A (en) * | 2020-05-07 | 2020-09-11 | 中国科学技术大学 | Weakly supervised deep learning semantic analysis method under virtual reality and augmented reality scenes |
CN111666953A (en) * | 2020-06-04 | 2020-09-15 | 电子科技大学 | Tidal zone surveying and mapping method and device based on semantic segmentation |
CN111798471A (en) * | 2020-07-27 | 2020-10-20 | 中科智脑(北京)技术有限公司 | Training method of image semantic segmentation network |
CN111870279A (en) * | 2020-07-31 | 2020-11-03 | 西安电子科技大学 | Method, system and application for segmenting left ventricular myocardium of ultrasonic image |
CN112132149A (en) * | 2020-09-10 | 2020-12-25 | 武汉汉达瑞科技有限公司 | Semantic segmentation method and device for remote sensing image |
CN112419327A (en) * | 2020-12-10 | 2021-02-26 | 复旦大学附属肿瘤医院 | Image segmentation method, system and device based on generation countermeasure network |
CN112801107A (en) * | 2021-02-01 | 2021-05-14 | 联想(北京)有限公司 | Image segmentation method and electronic equipment |
CN113269197A (en) * | 2021-04-25 | 2021-08-17 | 南京三百云信息科技有限公司 | Certificate image vertex coordinate regression system and identification method based on semantic segmentation |
CN113516130A (en) * | 2021-07-19 | 2021-10-19 | 闽江学院 | Entropy minimization-based semi-supervised image semantic segmentation method |
CN113537365A (en) * | 2021-07-20 | 2021-10-22 | 北京航空航天大学 | Multitask learning self-adaptive balancing method based on information entropy dynamic weighting |
CN113610807A (en) * | 2021-08-09 | 2021-11-05 | 西安电子科技大学 | New coronary pneumonia segmentation method based on weak supervision multitask learning |
CN114004817A (en) * | 2021-11-03 | 2022-02-01 | 深圳大学 | Segmented network semi-supervised training method, system, equipment and storage medium |
CN114037720A (en) * | 2021-10-18 | 2022-02-11 | 北京理工大学 | Pathological image segmentation and classification method and device based on semi-supervised learning |
CN114118167A (en) * | 2021-12-04 | 2022-03-01 | 河南大学 | Action sequence segmentation method based on self-supervision less-sample learning and aiming at behavior recognition |
CN114565755A (en) * | 2022-01-17 | 2022-05-31 | 北京新氧科技有限公司 | Image segmentation method, device, equipment and storage medium |
CN114565812A (en) * | 2022-03-01 | 2022-05-31 | 北京地平线机器人技术研发有限公司 | Training method and device of semantic segmentation model and semantic segmentation method of image |
CN115100491A (en) * | 2022-08-25 | 2022-09-23 | 山东省凯麟环保设备股份有限公司 | Abnormal robust segmentation method and system for complex automatic driving scene |
CN116403074A (en) * | 2023-04-03 | 2023-07-07 | 上海锡鼎智能科技有限公司 | Semi-automatic image labeling method and device based on active labeling |
CN116721250A (en) * | 2023-04-17 | 2023-09-08 | 重庆邮电大学 | Medical image graffiti segmentation algorithm based on low-quality pseudo tag refinement |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104091333A (en) * | 2014-07-01 | 2014-10-08 | 黄河科技学院 | Multi-class unsupervised color texture image segmentation method based on credible regional integration |
CN104537676A (en) * | 2015-01-12 | 2015-04-22 | 南京大学 | Gradual image segmentation method based on online learning |
US20180129912A1 (en) * | 2016-11-07 | 2018-05-10 | Nec Laboratories America, Inc. | System and Method for Learning Random-Walk Label Propagation for Weakly-Supervised Semantic Segmentation |
US20180260957A1 (en) * | 2017-03-08 | 2018-09-13 | Siemens Healthcare Gmbh | Automatic Liver Segmentation Using Adversarial Image-to-Image Network |
US20180276825A1 (en) * | 2017-03-23 | 2018-09-27 | Petuum, Inc. | Structure Correcting Adversarial Network for Chest X-Rays Organ Segmentation |
CN108062753A (en) * | 2017-12-29 | 2018-05-22 | 重庆理工大学 | The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study |
CN108549895A (en) * | 2018-04-17 | 2018-09-18 | 深圳市唯特视科技有限公司 | A kind of semi-supervised semantic segmentation method based on confrontation network |
CN109409240A (en) * | 2018-09-28 | 2019-03-01 | 北京航空航天大学 | A kind of SegNet remote sensing images semantic segmentation method of combination random walk |
CN109614921A (en) * | 2018-12-07 | 2019-04-12 | 安徽大学 | A kind of cell segmentation method for the semi-supervised learning generating network based on confrontation |
CN109741332A (en) * | 2018-12-28 | 2019-05-10 | 天津大学 | A kind of image segmentation and mask method of man-machine coordination |
US10430946B1 (en) * | 2019-03-14 | 2019-10-01 | Inception Institute of Artificial Intelligence, Ltd. | Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques |
CN109993770A (en) * | 2019-04-09 | 2019-07-09 | 西南交通大学 | A kind of method for tracking target of adaptive space-time study and state recognition |
CN110363772A (en) * | 2019-08-22 | 2019-10-22 | 西南大学 | Cardiac MRI dividing method and system based on confrontation network |
Non-Patent Citations (5)
Title |
---|
HONGZHEN WANG 等: "Gated Convolutional Neural Network for Semantic Segmentation in High-Resolution Images", 《REMOTE SENSING 》 * |
TUAN-HUNG VU 等: "ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation", 《ARXIV:1811.12833V2》 * |
WEI-CHIH HUNG 等: "Adversarial Learning for Semi-Supervised Semantic Segmentation", 《ARXIV:1802.07934V2》 * |
- WU Fei et al.: "Interpretability of Deep Learning", Aero Weaponry *
- ZHANG Guimei et al.: "Semi-supervised Image Semantic Segmentation Based on Adaptive Adversarial Learning", Journal of Nanchang Hangkong University (Natural Sciences) *
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111651998A (en) * | 2020-05-07 | 2020-09-11 | 中国科学技术大学 | Weakly supervised deep learning semantic analysis method under virtual reality and augmented reality scenes |
US12039766B2 (en) * | 2020-05-21 | 2024-07-16 | Boe Technology Group Co., Ltd. | Image processing method, apparatus, and computer product for image segmentation using unseen class obtaining model |
US20220292805A1 (en) * | 2020-05-21 | 2022-09-15 | Boe Technology Group Co., Ltd. | Image processing method and apparatus, and device, storage medium, and image segmentation method |
CN111612010A (en) * | 2020-05-21 | 2020-09-01 | 京东方科技集团股份有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN111666953A (en) * | 2020-06-04 | 2020-09-15 | 电子科技大学 | Tidal zone surveying and mapping method and device based on semantic segmentation |
CN111798471A (en) * | 2020-07-27 | 2020-10-20 | 中科智脑(北京)技术有限公司 | Training method of image semantic segmentation network |
CN111798471B (en) * | 2020-07-27 | 2024-04-02 | 中科智脑(北京)技术有限公司 | Training method of image semantic segmentation network |
CN111870279A (en) * | 2020-07-31 | 2020-11-03 | 西安电子科技大学 | Method, system and application for segmenting left ventricular myocardium of ultrasonic image |
CN112132149A (en) * | 2020-09-10 | 2020-12-25 | 武汉汉达瑞科技有限公司 | Semantic segmentation method and device for remote sensing image |
CN112132149B (en) * | 2020-09-10 | 2023-09-05 | 武汉汉达瑞科技有限公司 | Semantic segmentation method and device for remote sensing image |
CN112419327A (en) * | 2020-12-10 | 2021-02-26 | 复旦大学附属肿瘤医院 | Image segmentation method, system and device based on generation countermeasure network |
CN112419327B (en) * | 2020-12-10 | 2023-08-04 | 复旦大学附属肿瘤医院 | Image segmentation method, system and device based on generation countermeasure network |
CN112801107A (en) * | 2021-02-01 | 2021-05-14 | 联想(北京)有限公司 | Image segmentation method and electronic equipment |
CN113269197A (en) * | 2021-04-25 | 2021-08-17 | 南京三百云信息科技有限公司 | Certificate image vertex coordinate regression system and identification method based on semantic segmentation |
CN113269197B (en) * | 2021-04-25 | 2024-03-08 | 南京三百云信息科技有限公司 | Certificate image vertex coordinate regression system and identification method based on semantic segmentation |
CN113516130B (en) * | 2021-07-19 | 2024-01-05 | 闽江学院 | Semi-supervised image semantic segmentation method based on entropy minimization |
CN113516130A (en) * | 2021-07-19 | 2021-10-19 | 闽江学院 | Entropy minimization-based semi-supervised image semantic segmentation method |
CN113537365A (en) * | 2021-07-20 | 2021-10-22 | 北京航空航天大学 | Multitask learning self-adaptive balancing method based on information entropy dynamic weighting |
CN113537365B (en) * | 2021-07-20 | 2024-02-06 | 北京航空航天大学 | Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method |
CN113610807A (en) * | 2021-08-09 | 2021-11-05 | 西安电子科技大学 | New coronary pneumonia segmentation method based on weak supervision multitask learning |
CN113610807B (en) * | 2021-08-09 | 2024-02-09 | 西安电子科技大学 | New coronaries pneumonia segmentation method based on weak supervision multitask learning |
CN114037720A (en) * | 2021-10-18 | 2022-02-11 | 北京理工大学 | Pathological image segmentation and classification method and device based on semi-supervised learning |
CN114004817A (en) * | 2021-11-03 | 2022-02-01 | 深圳大学 | Segmented network semi-supervised training method, system, equipment and storage medium |
CN114004817B (en) * | 2021-11-03 | 2024-04-02 | 深圳大学 | Semi-supervised training method, system, equipment and storage medium for segmentation network |
CN114118167A (en) * | 2021-12-04 | 2022-03-01 | 河南大学 | Action sequence segmentation method based on self-supervision less-sample learning and aiming at behavior recognition |
CN114118167B (en) * | 2021-12-04 | 2024-02-27 | 河南大学 | Action sequence segmentation method aiming at behavior recognition and based on self-supervision less sample learning |
CN114565755A (en) * | 2022-01-17 | 2022-05-31 | 北京新氧科技有限公司 | Image segmentation method, device, equipment and storage medium |
CN114565812A (en) * | 2022-03-01 | 2022-05-31 | 北京地平线机器人技术研发有限公司 | Training method and device of semantic segmentation model and semantic segmentation method of image |
CN114565812B (en) * | 2022-03-01 | 2024-10-18 | 北京地平线机器人技术研发有限公司 | Training method and device of semantic segmentation model and semantic segmentation method of image |
CN115100491B (en) * | 2022-08-25 | 2022-11-18 | 山东省凯麟环保设备股份有限公司 | Abnormal robust segmentation method and system for complex automatic driving scene |
CN115100491A (en) * | 2022-08-25 | 2022-09-23 | 山东省凯麟环保设备股份有限公司 | Abnormal robust segmentation method and system for complex automatic driving scene |
US11954917B2 (en) | 2022-08-25 | 2024-04-09 | Shandong Kailin Environmental Protection Equipment Co., Ltd. | Method of segmenting abnormal robust for complex autonomous driving scenes and system thereof |
CN116403074A (en) * | 2023-04-03 | 2023-07-07 | 上海锡鼎智能科技有限公司 | Semi-automatic image labeling method and device based on active labeling |
CN116403074B (en) * | 2023-04-03 | 2024-05-14 | 上海锡鼎智能科技有限公司 | Semi-automatic image labeling method and device based on active labeling |
CN116721250A (en) * | 2023-04-17 | 2023-09-08 | 重庆邮电大学 | Medical image graffiti segmentation algorithm based on low-quality pseudo tag refinement |
Also Published As
Publication number | Publication date |
---|---|
CN110837836B (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110837836B (en) | Semi-supervised semantic segmentation method based on maximized confidence | |
CN109934293B (en) | Image recognition method, device, medium and confusion perception convolutional neural network | |
CN110619369B (en) | Fine-grained image classification method based on feature pyramid and global average pooling | |
US10410353B2 (en) | Multi-label semantic boundary detection system | |
US11670071B2 (en) | Fine-grained image recognition | |
Hu et al. | Qs-attn: Query-selected attention for contrastive learning in i2i translation | |
Xiao et al. | A weakly supervised semantic segmentation network by aggregating seed cues: the multi-object proposal generation perspective | |
Kao et al. | Hierarchical aesthetic quality assessment using deep convolutional neural networks | |
CN111126488B (en) | Dual-attention-based image recognition method | |
US11640714B2 (en) | Video panoptic segmentation | |
CN112100346B (en) | Visual question-answering method based on fusion of fine-grained image features and external knowledge | |
US7672915B2 (en) | Method and system for labelling unlabeled data records in nodes of a self-organizing map for use in training a classifier for data classification in customer relationship management systems | |
CN112800292B (en) | Cross-modal retrieval method based on modal specific and shared feature learning | |
CN109063719B (en) | Image classification method combining structure similarity and class information | |
Guan et al. | A unified probabilistic model for global and local unsupervised feature selection | |
CN110826609B (en) | Double-current feature fusion image identification method based on reinforcement learning | |
EP1903479A1 (en) | Method and system for data classification using a self-organizing map | |
CN116844179A (en) | Emotion analysis method based on multi-mode cross attention mechanism image-text fusion | |
CN111860823A (en) | Neural network training method, neural network training device, neural network image processing method, neural network image processing device, neural network image processing equipment and storage medium | |
CN112528058A (en) | Fine-grained image classification method based on image attribute active learning | |
CN113298184B (en) | Sample extraction and expansion method and storage medium for small sample image recognition | |
Sun et al. | Perceptual multi-channel visual feature fusion for scene categorization | |
CN113762041A (en) | Video classification method and device, computer equipment and storage medium | |
Zhang et al. | A small target detection algorithm based on improved YOLOv5 in aerial image | |
CN110853072B (en) | Weak supervision image semantic segmentation method based on self-guided reasoning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2023-12-27
Address after: No. 96 Jinzhai Road, Baohe District, Hefei, Anhui Province, 230026
Patentee after: University of Science and Technology of China; Zhu Changan; Jin Yi
Address before: No. 96 Jinzhai Road, Baohe District, Hefei, Anhui Province, 230026
Patentee before: University of Science and Technology of China
TR01 | Transfer of patent right |