CN111738343B - Image labeling method based on semi-supervised learning - Google Patents

Image labeling method based on semi-supervised learning Download PDF

Info

Publication number
CN111738343B
CN111738343B CN202010589985.1A
Authority
CN
China
Prior art keywords
class
sub
voter
marked
different
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010589985.1A
Other languages
Chinese (zh)
Other versions
CN111738343A (en)
Inventor
宫恩来
杭丽君
熊攀
何远彬
沈磊
丁明旭
张尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202010589985.1A priority Critical patent/CN111738343B/en
Publication of CN111738343A publication Critical patent/CN111738343A/en
Application granted granted Critical
Publication of CN111738343B publication Critical patent/CN111738343B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/259Fusion by voting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image labeling method based on semi-supervised learning. Different classifiers are designed for samples of different categories, the classifiers are trained on the portion of samples that is already labeled, the results of the different classifiers are put to a vote, and the category with the highest accuracy is selected, so that unknown samples are labeled. To reduce the influence of misclassification, the samples the classifiers assign to each category are subjected to a random linear mixing operation with labeled samples of the corresponding category, so that even a misclassified result still contains features of the correct category. Semi-supervised learning of this kind provides a new approach for the fields of deep learning and machine learning.

Description

Image labeling method based on semi-supervised learning
Technical Field
The invention belongs to the field of semi-supervised learning, and relates to a method for image labeling based on semi-supervised learning.
Background
In recent years, the introduction of deep learning and the maturing of machine learning have brought breakthrough progress to the field of computer vision, and the traditional problems of the field, such as classification, detection and semantic segmentation, are now solved much better. However, most computer vision tasks are supervised learning tasks, which means that all input data must be labeled by hand: people have to collect the relevant pictures and carefully annotate them with dedicated software, which consumes a great deal of labor and material resources. In the field of object detection, for example, the COCO data set contains eighty classes and some seventy-five thousand pictures in total; in each picture the objects to be annotated must be found one by one and marked with rectangular boxes, and labeling a data set of this size usually takes hundreds of people several weeks or even months. Errors cannot be avoided during manual annotation, labeling errors affect later training, and they are difficult to correct, so a method is needed that solves the data set labeling problem, improves labeling accuracy and saves labor cost.
Semi-supervised learning has long been a research hotspot in machine learning. It combines supervised and unsupervised learning and trains a model with a small number of labeled files and a large number of unlabeled ones, which can greatly lighten the workload of annotators; however, current semi-supervised methods cannot make a model reach the same training result as fully supervised training, so supervised learning is still the mainstream in computer vision. The semi-supervised process nevertheless shares common ground with supervised learning: semi-supervised learning also requires part of the data to be labeled, and this small amount of labeled data can be used to classify and label the unlabeled data. Since, with existing techniques, a model trained on a small number of samples already achieves a good classification effect, a method that uses semi-supervised learning to assist labeling is proposed.
Disclosure of Invention
To solve the above problems, the technical solution of the invention is a method for image labeling based on semi-supervised learning, which comprises the following steps:
S10, adding a background class: the targets are divided into class A and class B, and a background class, formed by randomly sampling classes other than class A and class B, is introduced first;
S20, constructing cross-network classification models: label picture data of M classes and train a model between every pair of classes with a deep learning network, so that there are M×(M-1) models in total; different networks are selected when classes A and B are trained in different orders; the labeled data are trained through the M×(M-1) models, and the M-1 independent classification models of one class are recorded as a class learner, giving M class learners in total;
S30, forming sub-voters: predict the class of the unlabeled data through the M class learners, and group all results involving a certain class into one sub-voter, so that, by the number of classes, there are M sub-voters in total; each sub-voter contains 2M-2 groups of different prediction results, each group containing the prediction for a certain determined class; when the overlapping case of "class A vs. class B" and "class B vs. class A" occurs, the two are trained with different networks, so their predictions also differ;
S40, voting with a mutual-exclusion voter: each of the M sub-voters produces 2M-2 prediction probabilities for the same picture; a set of rules and a threshold are defined, and whenever the voting result produced by a sub-voter exceeds the threshold, the sample is considered to belong to the class corresponding to that sub-voter; only pictures with exactly one predicted class in the voting result are kept, and they are labeled with the predicted label;
S50, correcting erroneous labels based on random linear mixing: the newly labeled samples are mixed in random proportion with the original samples carrying the same label, so that the interference of wrongly labeled samples on network training is suppressed.
Preferably, the voting comprises the steps of:
S41, each sub-voter calculates an accuracy-weighted average: since each classification network has a different accuracy, a corresponding weight coefficient is computed from the accuracy, so that predictions of more accurate models have a larger influence on the final annotation;
S42, each sub-voter scores the picture, and the number N of sub-voters exceeding the threshold is counted;
S43, only data with N=1 are kept and labeled with the predicted label.
During voting, when the probabilities of two sub-voters both exceed the threshold, the sample may belong to two different labels; the voter then judges that the vote has failed and the sample is discarded.
Preferably, the random linear mixing fuses data of the same category without generating a new label, and the calculation formula is:
x̃ = β·x_i + (1 − β)·x_j
wherein x_i is a labeled picture of the same category as x_j, x_j is a labeled file output by the model, and β is a random parameter ranging from 0 to 1.
Preferably, the threshold is 0.95.
The beneficial effects of the invention are as follows: for the common situation in image labeling where only a small amount of labeled data and a large amount of unlabeled data are available, the invention provides a semi-supervised ensemble learning method that accurately predicts the categories of the large amount of unlabeled data, and further provides a random linear mixing method, designed around how a deep learning network learns from image data, to achieve more accurate image labeling.
Drawings
FIG. 1 is a flow chart of steps of a method for image annotation based on semi-supervised learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a method for image labeling based on semi-supervised learning according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
On the contrary, the invention is intended to cover any alternatives, modifications, equivalents, and variations that may be included within the spirit and scope of the invention as defined by the appended claims. Further, in the following detailed description of the present invention, certain specific details are set forth in order to provide a better understanding of the invention; the invention can, however, be fully understood by those skilled in the art without the details described herein.
Referring to fig. 1, a flowchart of steps of a method for image labeling based on semi-supervised learning according to an embodiment of the present invention includes the following steps:
S10, adding a background class: the targets are divided into class A and class B, and a background class, formed by randomly sampling classes other than class A and class B, is introduced first;
First, a background class is introduced for every pairwise base classifier to resist the interference that samples of unknown classes cause to prediction. In this way, each two-class model of the base learner becomes a three-class model; the proportion of the three classes A : B : background is 1:1:1, and the background-class pictures are randomly drawn, in equal proportion, from the other classes.
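By way of illustration only, the following is a minimal sketch of how such a three-class (A / B / background) training set could be assembled for one class pair; the function name build_pairwise_dataset, the dict-of-paths data layout and the sampling details are assumptions made for the example, not part of the invention.

```python
import random

def build_pairwise_dataset(labelled, class_a, class_b, seed=0):
    """Assemble an A / B / background training set for one pairwise classifier.

    labelled: dict mapping class name -> list of image paths (assumed layout).
    The background class is drawn at random from all classes other than A and B,
    keeping the proportion A : B : background at roughly 1:1:1.
    """
    rng = random.Random(seed)
    a_samples = labelled[class_a]
    b_samples = labelled[class_b]
    n_background = min(len(a_samples), len(b_samples))

    other_pool = [path
                  for cls, paths in labelled.items()
                  if cls not in (class_a, class_b)
                  for path in paths]
    background = rng.sample(other_pool, min(n_background, len(other_pool)))

    dataset = ([(p, 0) for p in a_samples] +      # label 0: class A
               [(p, 1) for p in b_samples] +      # label 1: class B
               [(p, 2) for p in background])      # label 2: background
    rng.shuffle(dataset)
    return dataset
```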
S20, constructing cross-network classification models: label picture data of M classes and train a model between every pair of classes with a deep learning network, so that there are M×(M-1) models in total; different networks are selected when classes A and B are trained in different orders; the labeled data are trained through the M×(M-1) models, and the M-1 independent classification models of one class are recorded as a class learner, giving M class learners in total;
In a specific embodiment M is taken as 20, and the data labeling system is built on ensemble learning, integrating the weak classifiers trained by deep learning between every two different classes A-B; the pictures are then labeled on the basis of the ensembled models. Picture data of the 20 classes are labeled, and a model is trained between every pair of classes with a deep learning network, giving 380 models (20 × 19) in total. To keep the weak classifiers participating in the ensemble as independent as possible, and to avoid the pairs A-B and B-A being trained with the same network, the classes are indexed: when the index of class A is larger than that of class B, the deep learning model PnasNet (batch_size = 64) is selected, and when the index of class A is smaller than that of class B, the deep learning model SeNet (batch_size = 32) is selected. Different batch_size values make the features learned by the networks differ, which is why the two cases use different batch_size settings. The small amount of labeled data is trained through the 380 models, and the 19 independent classification models of a given class are recorded as a class learner, giving 20 class learners in total.
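The cross-network training of this embodiment might be sketched as follows; train_model is a hypothetical training helper, build_pairwise_dataset is the sketch shown earlier, and reading a "class learner" as the 19 models in which a class appears as the first of the pair is one plausible interpretation of the description above.

```python
def train_cross_network_models(labelled, classes, train_model):
    """Train one three-class model for every ordered pair (a, b), a != b.

    For M classes this gives M*(M-1) models; for M = 20 that is 380.
    The network and batch size depend on the order of the class indices,
    so model (a, b) and model (b, a) are trained with different networks.
    """
    models = {}
    for i, a in enumerate(classes):
        for j, b in enumerate(classes):
            if i == j:
                continue
            if i > j:
                arch, batch_size = "pnasnet", 64   # index of A larger than index of B
            else:
                arch, batch_size = "senet", 32     # index of A smaller than index of B
            data = build_pairwise_dataset(labelled, a, b)
            models[(a, b)] = train_model(data, arch=arch, batch_size=batch_size)

    # The M-1 models whose first class is `a` are grouped as the class learner for `a`.
    class_learners = {a: [models[(a, b)] for b in classes if b != a] for a in classes}
    return models, class_learners
```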
S30, forming sub-voters: predict the class of the unlabeled data through the M class learners, and group all results involving a certain class into one sub-voter, so that, by the number of classes, there are M sub-voters in total; each sub-voter contains 2M-2 groups of different prediction results, each group containing the prediction for a certain determined class; when the overlapping case of "class A vs. class B" and "class B vs. class A" occurs, the two are trained with different networks, so their predictions also differ;
The large amount of unlabeled data is passed through the 20 class learners to predict categories. All results involving a certain class are grouped into one sub-voter, so, by the number of categories, there are 20 sub-voters in total, each containing 38 different groups of predictions, and each of the 38 groups contains the prediction for a particular category. As described above, when the overlapping case of "class A vs. class B" and "class B vs. class A" occurs, the two are trained by different networks, so their predictions also differ.
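One possible way to group the pairwise predictions into sub-voters is sketched below; predict_prob is a hypothetical helper that returns, for one picture, the probability a given pairwise model assigns to the named class.

```python
def form_sub_voters(models, classes, image, predict_prob):
    """Collect, for each class c, every pairwise prediction that involves c.

    models: dict keyed by ordered class pair (a, b), as built above.
    Each sub-voter ends up with 2*(M-1) probabilities (38 when M = 20),
    since class c occurs once as the first and once as the second class of
    every other pair, and the two corresponding models are different.
    """
    sub_voters = {c: [] for c in classes}
    for (a, b), model in models.items():
        sub_voters[a].append(predict_prob(model, image, a))
        sub_voters[b].append(predict_prob(model, image, b))
    return sub_voters
```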
S40, voting with a mutual-exclusion voter: each of the M sub-voters produces 2M-2 prediction probabilities for the same picture; a set of rules and a threshold are defined, and whenever the voting result produced by a sub-voter exceeds the threshold, the sample is considered to belong to the class corresponding to that sub-voter; only pictures with exactly one predicted class in the voting result are kept, and they are labeled with the predicted label;
Each sub-voter contains 38 prediction probabilities for the same picture. Every sub-voter calculates an accuracy-weighted average: each classification network has a different accuracy, and a corresponding weight coefficient is calculated from it, so that predictions of more accurate models have a larger influence on the final annotation. Then each sub-voter scores the picture, and the number N of sub-voters exceeding the threshold of 0.95 is counted. Finally, only data with N=1 are kept and labeled with the predicted label, which completes the mutual-exclusion voting prediction.
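The mutual-exclusion vote could be implemented roughly as in the sketch below; using normalized validation accuracies as the weight coefficients is an assumption consistent with, but not spelled out in, the description.

```python
def mutual_exclusion_vote(sub_voters, accuracies, threshold=0.95):
    """Score every class sub-voter and keep the picture only if exactly one passes.

    sub_voters: dict class -> list of per-model probabilities for one picture.
    accuracies: dict class -> list of the same models' validation accuracies,
                used as weights so that more accurate networks count more.
    Returns the predicted label, or None when zero or several sub-voters exceed
    the threshold (the vote fails and the sample is discarded).
    """
    passed = []
    for cls, probs in sub_voters.items():
        weights = accuracies[cls]
        score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
        if score > threshold:
            passed.append(cls)
    return passed[0] if len(passed) == 1 else None
```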
S50, correcting erroneous labels based on random linear mixing: the newly labeled samples are mixed in random proportion with the original samples carrying the same label, suppressing the interference that wrongly labeled samples cause to network training;
To this end the invention provides an RLB (Random Linear Blending) data mixing enhancement algorithm: each newly labeled sample is mixed in a random proportion with an original sample carrying the same label, so that the interference of wrongly labeled samples on network training is suppressed.
The voting comprises the following steps:
S41, each sub-voter calculates an accuracy-weighted average: since each classification network has a different accuracy, a corresponding weight coefficient is computed from the accuracy, so that predictions of more accurate models have a larger influence on the final annotation;
S42, each sub-voter scores the picture, and the number N of sub-voters exceeding the threshold is counted;
S43, only data with N=1 are kept and labeled with the predicted label.
During voting, when the probabilities of two sub-voters both exceed the threshold, the sample may belong to two different labels; the voter then judges that the vote has failed and the sample is discarded.
The random linear mixing fuses data of the same category without generating a new label, and the calculation formula is:
x̃ = β·x_i + (1 − β)·x_j
wherein x_i is a labeled picture of the same category as x_j, x_j is a labeled file output by the model, and β is a random parameter ranging from 0 to 1. Referring to FIG. 2, a specific effect of the operation of the method of the invention is shown.
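Below is a minimal sketch of the RLB operation on image arrays; it assumes β is drawn uniformly from [0, 1] and that the blended picture keeps the shared label of x_i and x_j, since the text only states that β ranges from 0 to 1 and that no new label is generated.

```python
import numpy as np

def random_linear_blend(x_i, x_j, rng=None):
    """Random Linear Blending (RLB) of two same-label images.

    x_i: an image from the originally labeled set,
    x_j: a newly labeled image output by the model (same label as x_i),
    both float arrays of identical shape. No new label is created: the blend
    keeps the shared label, so even if x_j was mislabeled, the result still
    carries features of the correct class contributed by x_i.
    """
    rng = rng or np.random.default_rng()
    beta = rng.uniform(0.0, 1.0)   # assumed uniform; the text only gives the range 0-1
    return beta * x_i + (1.0 - beta) * x_j
```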
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (4)

1. An image labeling method based on semi-supervised learning, characterized by comprising the following steps:
S10, adding a background class: the targets are divided into class A and class B, and a background class, formed by randomly sampling classes other than class A and class B, is introduced first;
S20, constructing cross-network classification models: label picture data of M classes and train a model between every pair of classes with a deep learning network, so that there are M×(M-1) models in total; different networks are selected when classes A and B are trained in different orders; the labeled data are trained through the M×(M-1) models, and the M-1 independent classification models of one class are recorded as a class learner, giving M class learners in total;
S30, forming sub-voters: predict the class of the unlabeled data through the M class learners, and group all results involving a certain class into one sub-voter, so that, by the number of classes, there are M sub-voters in total; each sub-voter contains 2M-2 groups of different prediction results, each group containing the prediction for a certain determined class; when the overlapping case of "class A vs. class B" and "class B vs. class A" occurs, the two are trained with different networks, so their predictions also differ;
S40, voting with a mutual-exclusion voter: each of the M sub-voters produces 2M-2 prediction probabilities for the same picture; a set of rules and a threshold are defined, and whenever the voting result produced by a sub-voter exceeds the threshold, the sample is considered to belong to the class corresponding to that sub-voter; only pictures with exactly one predicted class in the voting result are kept, and they are labeled with the predicted label;
S50, correcting erroneous labels based on random linear mixing: the newly labeled samples are mixed in random proportion with the original samples carrying the same label, so that the interference of wrongly labeled samples on network training is suppressed.
2. The method of claim 1, wherein the voting comprises the steps of:
S41, each sub-voter calculates an accuracy-weighted average: since each classification network has a different accuracy, a corresponding weight coefficient is computed from the accuracy, so that predictions of more accurate models have a larger influence on the final annotation;
S42, each sub-voter scores the picture, and the number N of sub-voters exceeding the threshold is counted;
S43, only data with N=1 are kept and labeled with the predicted label;
during voting, when the probabilities of two sub-voters both exceed the threshold, the sample may belong to two different labels; the voter then judges that the vote has failed and the sample is discarded.
3. The method of claim 1, wherein the random linear mixing fuses data of the same category without generating a new label, and the calculation formula is:
x̃ = β·x_i + (1 − β)·x_j
wherein x_i is a labeled picture of the same category as x_j, x_j is a labeled file output by the model, and β is a random parameter ranging from 0 to 1.
4. The method of claim 1, wherein the threshold is 0.95.
CN202010589985.1A 2020-06-24 2020-06-24 Image labeling method based on semi-supervised learning Active CN111738343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010589985.1A CN111738343B (en) 2020-06-24 2020-06-24 Image labeling method based on semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010589985.1A CN111738343B (en) 2020-06-24 2020-06-24 Image labeling method based on semi-supervised learning

Publications (2)

Publication Number Publication Date
CN111738343A CN111738343A (en) 2020-10-02
CN111738343B true CN111738343B (en) 2024-07-26

Family

ID=72650996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010589985.1A Active CN111738343B (en) 2020-06-24 2020-06-24 Image labeling method based on semi-supervised learning

Country Status (1)

Country Link
CN (1) CN111738343B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254641B (en) * 2021-05-27 2021-11-16 中国电子科技集团公司第十五研究所 Information data fusion method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644235A (en) * 2017-10-24 2018-01-30 广西师范大学 Image automatic annotation method based on semi-supervised learning
CN108764281A (en) * 2018-04-18 2018-11-06 华南理工大学 Image classification method based on semi-supervised self-paced learning with a cross-task deep network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11468358B2 (en) * 2017-11-30 2022-10-11 Palo Alto Networks (Israel Analytics) Ltd. Framework for semi-supervised learning when no labeled data is given

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644235A (en) * 2017-10-24 2018-01-30 广西师范大学 Image automatic annotation method based on semi-supervised learning
CN108764281A (en) * 2018-04-18 2018-11-06 华南理工大学 Image classification method based on semi-supervised self-paced learning with a cross-task deep network

Also Published As

Publication number Publication date
CN111738343A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN109741332A (en) A kind of image segmentation and mask method of man-machine coordination
Vasudevan et al. When does dough become a bagel? analyzing the remaining mistakes on imagenet
CN111444966A (en) Media information classification method and device
WO2021073390A1 (en) Data screening method and apparatus, device and computer-readable storage medium
CN108932724B (en) Automatic system auditing method based on multi-person collaborative image annotation
CN110310012B (en) Data analysis method, device, equipment and computer readable storage medium
CN111738343B (en) Image labeling method based on semi-supervised learning
CN110929169A (en) Position recommendation method based on improved Canopy clustering collaborative filtering algorithm
CN111797935B (en) Semi-supervised depth network picture classification method based on group intelligence
CN111339258B (en) University computer basic exercise recommendation method based on knowledge graph
CN110457155B (en) Sample class label correction method and device and electronic equipment
CN111290953B (en) Method and device for analyzing test logs
CN112464966B (en) Robustness estimating method, data processing method, and information processing apparatus
CN107247996A (en) A kind of Active Learning Method applied to different distributed data environment
CN116306969A (en) Federal learning method and system based on self-supervision learning
CN113988044B (en) Method for judging error question reason type
CN113128556B (en) Deep learning test case sequencing method based on mutation analysis
CN109063732A (en) Image ranking method and system based on feature interaction and multi-task learning
Jiang et al. A classification algorithm based on weighted ML-kNN for multi-label data
CN110427973B (en) Classification method for ambiguity-oriented annotation samples
CN110175531B (en) Attitude-based examinee position positioning method
CN113705720A (en) Method for reducing weighted training deviation by applying weight correction in machine learning
CN113934922A (en) Intelligent recommendation method, device, equipment and computer storage medium
CN111626409B (en) Data generation method for image quality detection
KR102604756B1 (en) Server, system, method and program providing essay scoring management service

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant