CN113869333A - Image identification method and device based on semi-supervised relationship measurement network

Info

Publication number: CN113869333A
Application number: CN202111428246.5A
Authority: CN
Inventors: 房体品; 王瑞丰; 袭肖明; 杨光远
Original assignee: Shandong Liju Robot Technology Co ltd
Current assignee: Shandong Liju Robot Technology Co ltd
Priority date: 2021-11-29
Filing date: 2021-11-29
Publication date: 2021-12-31
Anticipated expiration: 2041-11-29
Also published as: CN113869333B

Abstract

The invention relates to an image recognition method and device based on a semi-supervised relation metric network. The method comprises: performing data expansion on all labeled and unlabeled data in an image data set to obtain an expanded image data set; clustering the labeled data in the expanded image data set to obtain class atom image data for each class; randomly adding noise to the expanded image data set, and inputting the noise-added image data set and the class atom image data into a semi-supervised relation metric network model to obtain, for the labeled and unlabeled data, comparison scores against the templates of the different classes; calculating a cross entropy loss and a mean square error loss from these comparison scores; training on the cross entropy loss and the mean square error loss to obtain a trained semi-supervised relation metric network model; and recognizing an image to be recognized with the trained model to determine its class.

Description

Image identification method and device based on semi-supervised relationship measurement network

Technical Field

The invention relates to the technical field of image recognition, in particular to an image recognition method and device based on a semi-supervised relation measurement network.

Background

With the rapid development of machine learning, many researchers in China and abroad have applied machine learning methods to problems across industries, and these methods have proven highly valuable for image recognition and classification. Since the end of the last century, image classification has been studied extensively and many methods have been developed: first common classification techniques based on a single model, such as wavelets, neural networks, Bayesian networks, association rules, decision trees, and rough sets, and later combinations of multiple classifiers and distributed systems. Among these, deep learning methods perform particularly well.

Although research in this field has been extensive and has produced excellent results, some image application domains (such as medical imaging) are highly specialized: the sample data are complex, the amounts of data vary greatly, the features are diverse, and labeling is expensive. Researchers therefore face considerable challenges, the existing classification methods are not yet mature, and image recognition and classification still require continued, in-depth research.

Some image data must be labeled manually by domain experts, which is time-consuming and labor-intensive, so labeling is very costly. As a result, commonly available data sets are either unlabeled or contain only a small number of labels. This is a serious obstacle for existing deep learning, which is driven by large amounts of data. Semi-supervised learning algorithms were proposed to address this problem: they learn from only a small number of labels, and their results can even exceed those of some purely supervised methods.

In some application domains the data are private or commercially confidential, so only a small part is publicly released. The resulting data sets are small and class-imbalanced; in particular, data sets for some rarely studied medical specialties may contain only a few hundred images, which is clearly insufficient to support the training of a deep learning network.

Disclosure of Invention

To overcome the problems in the related art, the invention provides an image identification method and device based on a semi-supervised relationship measurement network, which effectively addresses the small data volume, class imbalance, and scarcity of labels in image data sets and achieves a better classification effect in the field of image classification.

According to a first aspect of the embodiments of the present invention, there is provided an image identification method based on a semi-supervised relationship metric network, the method including:

performing data expansion on all tagged data and non-tagged data in the image data set to obtain an expanded image data set;

clustering the labeled data in the expanded image data set to obtain the category atomic image data of each category;

randomly adding noise to the expanded image data set, and inputting the image data set with noise and the class atom image data into a semi-supervised relationship measurement network model to obtain different class template comparison scores of the labeled data and the unlabeled data;

calculating cross entropy loss and mean square error loss according to the comparison scores of different types of templates;

training according to cross entropy loss and mean square error loss to obtain a trained semi-supervised relationship measurement network model;

and identifying the image to be identified through the trained semi-supervised relationship measurement network model so as to determine the category of the image to be identified.

In one embodiment, preferably, performing data expansion on all tagged data and non-tagged data in the image data set to obtain an expanded image data set includes:

performing size transformation on all labeled data and unlabeled data in the image data set to convert the labeled data and the unlabeled data into preset sizes;

and performing data expansion on all the labeled data and the unlabeled data by using a random data enhancement technology to obtain an expanded image data set.

In one embodiment, preferably, randomly adding noise to the expanded image data set, and inputting the noise-added image data set and the class atom image data into the semi-supervised relationship measurement network model to obtain different class template comparison scores of the labeled data and the unlabeled data, includes:

carrying out random noise addition on the expanded image data set, and inputting the noise-added image data set and the category atom image data into a feature extraction network of a semi-supervised relationship measurement network model to obtain a feature vector of labeled data, a feature vector of unlabelled data and a category atom vector;

integrating the characteristic vector of the labeled data, the characteristic vector of the unlabeled data and the category atom vector, inputting the integrated characteristic vector and the category atom vector into a comparison network of a semi-supervised relational metric network model to obtain comparison scores of different categories of templates of the labeled data and the unlabeled data, and weighting and recording the comparison scores of all the data.

In one embodiment, preferably, calculating the cross entropy loss and the mean square error loss according to the comparison scores of different types of templates includes:

determining the category corresponding to the maximum score value of the labeled data as a first target prediction category by using the comparison score of the labeled data;

calculating cross entropy loss between the first target prediction category and the real label category;

determining the category corresponding to the maximum score value as a second target prediction category by using the comparison scores and the historical weighted comparison scores of all the data;

and calculating the mean square error loss between the second target prediction category and the prediction category corresponding to the historical weighting.

In one embodiment, preferably, training according to the cross entropy loss and the mean square error loss to obtain a trained semi-supervised relationship measurement network model includes:

carrying out weighted summation on the cross entropy loss and the mean square error loss to obtain total loss;

training by using the total loss until the training round reaches a set value;

and determining the semi-supervised relation metric network model with the minimum total loss as the trained semi-supervised relation metric network model.

According to a second aspect of the embodiments of the present invention, there is provided an image recognition apparatus based on a semi-supervised relational metric network, the apparatus including:

the expansion module is used for performing data expansion on all the tagged data and the non-tagged data in the image data set to obtain an expanded image data set;

the clustering module is used for carrying out clustering operation on the labeled data in the expanded image data set to obtain the category atomic image data of each category;

the comparison module is used for randomly adding noise to the expanded image data set, inputting the image data set with noise and the class atom image data into the semi-supervised relationship measurement network model so as to obtain different class template comparison scores of the labeled data and the unlabeled data;

the calculation module is used for calculating cross entropy loss and mean square error loss according to the comparison scores of the templates of different categories;

the training module is used for training according to cross entropy loss and mean square error loss to obtain a trained semi-supervised relationship measurement network model;

and the recognition module is used for recognizing the image to be recognized through the trained semi-supervised relationship measurement network model so as to determine the category of the image to be recognized.

In one embodiment, preferably, the expansion module comprises:

the processing unit is used for carrying out size conversion on all labeled data and unlabeled data in the image data set to convert the labeled data and the unlabeled data into preset sizes;

and the expansion unit is used for performing data expansion on all the labeled data and the unlabeled data by utilizing a random data enhancement technology to obtain an expanded image data set.

In one embodiment, preferably, the comparison module includes:

the feature extraction unit is used for randomly adding noise to the expanded image data set, inputting the image data set and the category atom image data which are added with noise into a feature extraction network of the semi-supervised relationship measurement network model, and obtaining feature vectors of labeled data, feature vectors of unlabeled data and category atom vectors;

and the score calculating unit is used for integrating the characteristic vector of the labeled data, the characteristic vector of the unlabeled data and the class atom vector and inputting the integrated characteristic vector into a comparison network of the semi-supervised relationship measurement network model to obtain the comparison scores of different classes of templates of the labeled data and the unlabeled data, and weighting and recording the comparison scores of all the data.

In one embodiment, preferably, the calculation module comprises:

the first determining unit is used for determining the category corresponding to the maximum score value of the labeled data as a first target prediction category by using the comparison score of the labeled data;

a first calculation unit for calculating a cross entropy loss between the first target prediction class and the real label class;

the second determining unit is used for determining the category corresponding to the maximum score value as a second target prediction category by using the comparison scores of all the data and the history weighted comparison scores;

and the second calculation unit is used for calculating the mean square error loss between the second target prediction class and the prediction class corresponding to the history weight.

In one embodiment, preferably, the training module comprises:

the third calculation unit is used for carrying out weighted summation on the cross entropy loss and the mean square error loss to obtain the total loss;

the training unit is used for training by using the total loss until the training round reaches a set value;

and the model determining unit is used for determining the semi-supervised relationship measurement network model with the minimum total loss as the trained semi-supervised relationship measurement network model.

According to a third aspect of the embodiments of the present invention, there is provided an image recognition apparatus based on a semi-supervised relational metric network, the apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any one of the first aspects.

The technical scheme provided by the embodiment of the invention can have the following beneficial effects:

In the embodiments of the invention, the idea of small-sample (few-shot) metric learning is applied, in view of the particularities of the image classification field, to the problems of too few original data samples and class imbalance; the random data enhancement technique used in the preprocessing stage also helps with these problems. The idea of semi-supervised learning is likewise applied, so that the deep learning training process can be completed with only a small amount of labeled data while still achieving a good learning effect. Algorithms based on small-sample learning and semi-supervised learning can meet the deep learning requirements of most image classification tasks. The invention integrates the two ideas into a single deep network model, combining their advantages and achieving a better classification effect in the field of image classification.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

Fig. 1 is a flowchart illustrating an image recognition method based on a semi-supervised relational metric network according to an exemplary embodiment.

Fig. 2 is a flowchart illustrating step S101 in an image recognition method based on a semi-supervised relational metric network according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating step S103 in an image recognition method based on a semi-supervised relational metric network according to an exemplary embodiment.

Fig. 4 is a flowchart illustrating step S104 in an image recognition method based on a semi-supervised relational metric network according to an exemplary embodiment.

Fig. 5 is a flowchart illustrating step S105 of an image recognition method based on a semi-supervised relational metric network according to an exemplary embodiment.

Fig. 6 is a block diagram illustrating a semi-supervised relationship metric network based image recognition apparatus according to an exemplary embodiment.

Fig. 7 is a block diagram illustrating an expansion module in an image recognition device based on a semi-supervised relationship metric network according to an exemplary embodiment.

Fig. 8 is a block diagram illustrating a comparison module in an image recognition apparatus based on a semi-supervised relationship metric network according to an exemplary embodiment.

Fig. 9 is a block diagram illustrating a calculation module in an image recognition device based on a semi-supervised relationship metric network according to an exemplary embodiment.

Fig. 10 is a block diagram illustrating a training module in an image recognition device based on a semi-supervised relationship metric network according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

Fig. 1 is a flowchart illustrating an image recognition method based on a semi-supervised relationship metric network according to an exemplary embodiment, and the method includes:

step S101, performing data expansion on all tagged data and non-tagged data in an image data set to obtain an expanded image data set;

step S102, clustering the labeled data in the expanded image data set to obtain the category atomic image data of each category;

step S103, carrying out random noise adding processing on the expanded image data set, and inputting the noise added image data set and the class atom image data into a semi-supervised relationship measurement network model to obtain different class template comparison scores of the labeled data and the unlabeled data;

step S104, calculating cross entropy loss and mean square error loss according to the comparison scores of the templates of different categories;

step S105, training according to cross entropy loss and mean square error loss to obtain a trained semi-supervised relationship measurement network model;

step S106, identifying the image to be identified through the trained semi-supervised relationship measurement network model so as to determine the category of the image to be identified.
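As an illustration of step S102, the following minimal sketch derives one class atom per category from labeled samples. The patent does not name the clustering algorithm, so the sketch assumes the simplest case: each sample is a flat list of floats and the atom is the per-class mean (the centroid that a one-cluster k-means would return). The function name `class_atoms` and the toy data are hypothetical.

```python
from collections import defaultdict


def class_atoms(samples, labels):
    """Return one representative ("atom") vector per class.

    Assumption: the atom is the per-class mean vector. The source text
    only says that the labeled data are clustered per category.
    """
    groups = defaultdict(list)
    for vec, lab in zip(samples, labels):
        groups[lab].append(vec)

    atoms = {}
    for lab, vecs in groups.items():
        dim = len(vecs[0])
        # Average each coordinate over all samples of this class.
        atoms[lab] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return atoms


# Toy labeled data: two classes, two samples each.
samples = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [12.0, 14.0]]
labels = ["a", "a", "b", "b"]
atoms = class_atoms(samples, labels)
```

In a fuller pipeline the atoms would be images (or image features) rather than two-dimensional toy vectors, and a multi-cluster algorithm could replace the plain mean.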

In one embodiment, preferably, step S101 includes steps S201 to S202:

step S201, performing size conversion on all labeled data and unlabeled data in the image data set to convert the labeled data and the unlabeled data into preset sizes;

The original data samples may differ in image size, which hinders learning by the deep network model. The existing data set is therefore first scaled to a uniform size using transforms in PyTorch.

Step S202, data expansion is carried out on all the labeled data and the unlabeled data by utilizing a random data enhancement technology, and an expanded image data set is obtained.

Because the amount of data is severely insufficient, the original data needs to be expanded. The data set is expanded using a random data enhancement module; after expansion, the labels of the labeled data set remain unchanged and the unlabeled data set remains unlabeled. Note that the expanded labeled data must be balanced across categories, i.e., each category must contain substantially the same amount of data.
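The balance requirement of step S202 can be sketched as follows. This is only an illustration under stated assumptions: `balance_by_augmentation` and `jitter` are hypothetical names, the "images" are scalars for brevity, and every class is simply grown to the size of the largest class by augmenting randomly chosen members. The label of an augmented sample is copied from its source sample, as the text requires.

```python
import random
from collections import Counter, defaultdict


def balance_by_augmentation(items, labels, augment, rng):
    """Grow every class to the size of the largest class.

    `augment` is a hypothetical callable (sample, rng) -> new sample.
    """
    by_class = defaultdict(list)
    for item, lab in zip(items, labels):
        by_class[lab].append(item)

    target = max(len(group) for group in by_class.values())
    out_items, out_labels = [], []
    for lab, group in by_class.items():
        # Create just enough augmented copies to reach the target size.
        extra = [augment(rng.choice(group), rng) for _ in range(target - len(group))]
        for item in group + extra:
            out_items.append(item)
            out_labels.append(lab)
    return out_items, out_labels


def jitter(x, rng):
    # Stand-in augmentation: a small random perturbation of a scalar "image".
    return x + rng.uniform(-0.1, 0.1)


rng = random.Random(0)
items, labels = balance_by_augmentation(
    [1.0, 1.1, 1.2, 5.0], ["a", "a", "a", "b"], jitter, rng
)
counts = Counter(labels)
```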

As shown in fig. 3, in one embodiment, preferably, the step S103 includes:

step S301, carrying out random noise adding processing on the expanded image data set, and inputting the noise added image data set and the category atom image data into a feature extraction network of a semi-supervised relationship measurement network model to obtain feature vectors of labeled data, feature vectors of unlabeled data and category atom vectors;

Random noise is added to both the labeled data and the unlabeled data. The noise is a random combination of four operations: displacement, and changes in image brightness, contrast, and saturation. The displacement and the changes in brightness, contrast, and saturation are all drawn as random numbers within a set range.
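The random combination of the four noise operations might look like the sketch below. It is not the patent's implementation: images are assumed to be nested lists of RGB tuples with values in [0, 1], the contrast operation scales around mid-grey 0.5 as a simplification, and the ranges (`max_shift`, `lo`, `hi`) are hypothetical defaults.

```python
import random


def clamp(v):
    return min(1.0, max(0.0, v))


def shift(img, dx, dy):
    # Translate by (dx, dy) pixels; uncovered pixels become black.
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w
             else (0.0, 0.0, 0.0)
             for x in range(w)] for y in range(h)]


def brightness(img, f):
    return [[tuple(clamp(c * f) for c in px) for px in row] for row in img]


def contrast(img, f):
    # Scale each channel's distance from mid-grey 0.5 (a simplification).
    return [[tuple(clamp(0.5 + (c - 0.5) * f) for c in px) for px in row] for row in img]


def saturation(img, f):
    # Scale each channel's distance from the pixel's own grey level.
    out = []
    for row in img:
        new_row = []
        for px in row:
            grey = sum(px) / 3.0
            new_row.append(tuple(clamp(grey + (c - grey) * f) for c in px))
        out.append(new_row)
    return out


def add_random_noise(img, rng, max_shift=1, lo=0.8, hi=1.2):
    ops = [
        lambda im: shift(im, rng.randint(-max_shift, max_shift),
                         rng.randint(-max_shift, max_shift)),
        lambda im: brightness(im, rng.uniform(lo, hi)),
        lambda im: contrast(im, rng.uniform(lo, hi)),
        lambda im: saturation(im, rng.uniform(lo, hi)),
    ]
    # Pick a random non-empty subset of the four operations, in random order.
    for op in rng.sample(ops, rng.randint(1, len(ops))):
        img = op(img)
    return img


image = [[(0.2, 0.4, 0.6), (0.9, 0.1, 0.5)],
         [(0.0, 1.0, 0.3), (0.7, 0.7, 0.7)]]
noisy = add_random_noise(image, random.Random(42))
```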

The noise-added data set and the class atom image data are loaded together and input into the feature extraction network of the semi-supervised relational metric network model. Passing the data through the feature extraction module $f(x, \theta)$ (where $x$ is the input vector and $\theta$ is a model parameter) extracts the key feature vectors of the images and, at the same time, yields the class atom vectors $V = \{v_1, v_2, \ldots, v_c\} \subset \mathbb{R}^m$ (where $c$ is the number of categories). The labeled data set vectors are $X_l = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^m$ (where $n$ is the amount of labeled data), and the unlabeled data set vectors are $X_u = \{x_{n+1}, x_{n+2}, \ldots, x_N\} \subset \mathbb{R}^m$ (where $N - n$ is the amount of unlabeled data). The feature extraction module consists of a shallow convolutional neural network. Because all feature vectors are extracted by the same network model, they belong to the same feature space.

Step S302, integrating the characteristic vector of the labeled data, the characteristic vector of the unlabeled data and the category atom vector, inputting the integrated characteristic vector into a comparison network of a semi-supervised relational metric network model to obtain the comparison scores of different types of templates of the labeled data and the unlabeled data, and weighting and recording the comparison scores of all the data.

The labeled-data feature vectors $X_l$, the unlabeled-data feature vectors $X_u$, and the class atom vectors $V$ are combined by $C(f(x, \theta), f(j, \theta))$ (where $f$ is the feature extraction module with parameter $\theta$ and $C$ denotes the operation that combines two vectors) and then input into the comparison network $g(y, \varphi)$ (where $y$ is the output of $C$ and $\varphi$ is a model parameter). This gives the comparison scores of the labeled data and the unlabeled data against the templates of the different categories, $z_l$ and $z_u$ respectively, jointly denoted $z$, and the history-weighted comparison score is updated (where $\alpha$ is a hyperparameter).
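Step S302 can be illustrated with the sketch below, which rests on two assumptions that the text does not confirm: the combination $C$ is taken to be plain concatenation of a sample feature vector with a class atom vector, and the history update is taken to be an exponential moving average with weight $\alpha$. The names `combine` and `update_history` are hypothetical.

```python
def combine(sample_vec, atom_vec):
    # Assumed form of C: concatenate the sample features with one class atom.
    return list(sample_vec) + list(atom_vec)


def update_history(history, scores, alpha):
    """Assumed update rule: alpha * history + (1 - alpha) * current scores."""
    return [alpha * h + (1.0 - alpha) * s for h, s in zip(history, scores)]


# One sample paired with each of two class atoms gives one input per class
# for the comparison network.
pairs = [combine([1.0, 2.0], atom) for atom in ([0.0, 1.0], [3.0, 4.0])]

# The history-weighted scores are refreshed after every pass over the data.
history = [0.0, 0.0, 0.0]
for scores in ([0.2, 0.5, 0.3], [0.1, 0.8, 0.1]):
    history = update_history(history, scores, alpha=0.5)
```

With a larger `alpha` the recorded history reacts more slowly to the current scores, which makes the historical prediction a steadier target.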

As shown in fig. 4, in one embodiment, preferably, the step S104 includes:

step S401, determining the category corresponding to the maximum score value as a first target prediction category by using the comparison score of the labeled data;

step S402, calculating cross entropy loss between the first target prediction category and the real label category;

Using the comparison scores $z_l$ of the labeled data, the category with the maximum score is taken as the predicted category $c_l$. The cross entropy loss is then calculated between the predicted category $c_l$ and the true label category $y_l$.
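A minimal sketch of steps S401 and S402 follows. It assumes that the comparison scores are normalised with a softmax before the cross entropy is taken, which is a common choice but is not stated in the text; `predict`, `softmax`, and `cross_entropy` are hypothetical helper names and the scores are toy values.

```python
import math


def predict(scores):
    # Category index with the maximum comparison score.
    return max(range(len(scores)), key=lambda k: scores[k])


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, true_class):
    # Negative log-probability assigned to the true label index.
    return -math.log(softmax(scores)[true_class])


z_l = [2.0, 0.5, 0.1]            # comparison scores of one labeled sample
c_l = predict(z_l)               # predicted category
loss_c = cross_entropy(z_l, 0)   # true label assumed to be category 0
```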

Step S403, determining the category corresponding to the maximum score value as a second target prediction category by using the comparison scores and the history weighted comparison scores of all the data;

step S404, calculating the mean square error loss between the second target prediction category and the prediction category corresponding to the historical weighting.

Using the comparison score $z$ of all the data and the history-weighted comparison score, the categories with the maximum scores are taken as the current predicted category $c$ and the historical predicted category, respectively. The mean square error loss is then calculated between the currently output prediction $c$ and the historically output prediction. In this loss, $\omega(\cdot)$ denotes a ramp function and $t$ is the global number of iterations.
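The consistency term of steps S403 and S404 might be sketched as below. Because the formula itself is not reproduced in the text, the sketch makes explicit assumptions: the mean square error is taken between the current and the history-weighted score vectors, it is multiplied by $\omega(t)$, and $\omega$ is a Gaussian ramp-up of the kind used in temporal-ensembling methods. The names `ramp`, `mse`, and `ramp_length` are hypothetical.

```python
import math


def ramp(t, ramp_length):
    """Assumed form of omega(t): rises smoothly from exp(-5) to 1."""
    if t >= ramp_length:
        return 1.0
    phase = 1.0 - t / ramp_length
    return math.exp(-5.0 * phase * phase)


def mse(current, historical):
    return sum((c - h) ** 2 for c, h in zip(current, historical)) / len(current)


z = [0.2, 0.7, 0.1]        # current comparison scores of one sample
z_hist = [0.3, 0.5, 0.2]   # history-weighted comparison scores
loss_m = ramp(t=10, ramp_length=80) * mse(z, z_hist)
```

A ramp of this shape keeps the consistency term small early in training, when the recorded history is still unreliable.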

As shown in fig. 5, in one embodiment, preferably, the step S105 includes:

step S501, carrying out weighted summation on cross entropy loss and mean square error loss to obtain total loss;

step S502, training by using the total loss until the training round reaches a set value;

step S503, determining the semi-supervised relationship metric network model with the minimum total loss as the trained semi-supervised relationship metric network model.

The cross entropy loss $\mathrm{Loss}_c$ and the mean square error loss $\mathrm{Loss}_m$ are combined by weighted summation into the total loss $\mathrm{Loss} = \mathrm{Loss}_c + \lambda\,\mathrm{Loss}_m$ (where $\lambda$ is a hyperparameter). Training continues so that the loss shows a downward trend, until the training round reaches the set value or the loss levels off. The network models $f(x, \theta)$ and $g(y, \varphi)$ that give the minimum loss value are saved.
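The weighted sum and the "keep the model with the minimum total loss" rule of steps S501 to S503 can be sketched as follows. The per-round loss values are made-up toy numbers, and `select_best` is a hypothetical helper; an actual training loop would save the parameters of $f$ and $g$ at the point marked in the comment.

```python
def total_loss(loss_c, loss_m, lam):
    # Loss = Loss_c + lambda * Loss_m
    return loss_c + lam * loss_m


def select_best(round_losses, lam):
    """round_losses: list of (loss_c, loss_m), one entry per training round."""
    best_round, best_value = None, float("inf")
    for rnd, (lc, lm) in enumerate(round_losses):
        value = total_loss(lc, lm, lam)
        if value < best_value:
            # A real loop would checkpoint the models f and g here.
            best_round, best_value = rnd, value
    return best_round, best_value


losses = [(1.2, 0.3), (0.8, 0.2), (0.9, 0.4), (0.85, 0.6)]
best_round, best_value = select_best(losses, lam=0.5)
```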

As shown in fig. 6, according to a second aspect of the embodiments of the present invention, there is provided an image recognition apparatus based on a semi-supervised relationship metric network, the apparatus including:

an expansion module 61, configured to perform data expansion on all tagged data and non-tagged data in the image data set to obtain an expanded image data set;

the clustering module 62 is configured to perform clustering operation on the tagged data in the expanded image data set to obtain category atomic image data of each category;

the comparison module 63 is configured to perform random noise addition on the extended image data set, and input the noise-added image data set and the category atom image data into the semi-supervised relationship measurement network model to obtain different category template comparison scores of labeled data and unlabeled data;

the calculation module 64 is used for calculating cross entropy loss and mean square error loss according to the comparison scores of the templates of different categories;

the training module 65 is configured to perform training according to the cross entropy loss and the mean square error loss to obtain a trained semi-supervised relationship metric network model;

and the identification module 66 is used for identifying the image to be identified through the trained semi-supervised relationship metric network model so as to determine the category of the image to be identified.

As shown in fig. 7, in one embodiment, the expansion module 61 preferably includes:

a processing unit 71, configured to perform size conversion on all tagged data and non-tagged data in the image data set to convert the tagged data and non-tagged data into a preset size;

and an expansion unit 72, configured to perform data expansion on all tagged data and non-tagged data by using a random data enhancement technique, so as to obtain an expanded image data set.

As shown in fig. 8, in one embodiment, the comparison module 63 preferably includes:

the feature extraction unit 81 is configured to perform random noise addition on the extended image data set, and input the noise-added image data set and the category atom image data into a feature extraction network of the semi-supervised relationship metric network model to obtain a feature vector of labeled data, a feature vector of unlabeled data, and a category atom vector;

and the score calculating unit 82 is used for integrating the feature vectors of the labeled data, the feature vectors of the unlabeled data and the class atom vectors, inputting the integrated feature vectors and the class atom vectors into a comparison network of the semi-supervised relational metric network model, obtaining the comparison scores of different classes of templates of the labeled data and the unlabeled data, and weighting and recording the comparison scores of all the data.

As shown in fig. 9, in one embodiment, the calculation module 64 preferably includes:

a first determining unit 91 configured to determine, by using the comparison score of the labeled data, a category corresponding to a maximum score value as a first target prediction category;

a first calculation unit 92 for calculating a cross entropy loss between the first target prediction class and the real label class;

a second determining unit 93, configured to determine, by using the comparison scores of all the data and the history weighted comparison score, a category corresponding to a maximum score of the comparison scores as a second target prediction category;

a second calculation unit 94 for calculating a mean square error loss between the second target prediction class and the prediction class corresponding to the history weight.

As shown in fig. 10, in one embodiment, preferably, the training module 65 includes:

a third calculating unit 1001, configured to perform weighted summation on the cross entropy loss and the mean square error loss to obtain a total loss;

a training unit 1002, configured to perform training using the total loss until the training round reaches a set value;

a model determining unit 1003, configured to determine the semi-supervised relationship metric network model with the smallest total loss as the trained semi-supervised relationship metric network model.

It is further understood that the term "plurality" means two or more, and other quantifiers are analogous. "And/or" describes an association between associated objects and covers three cases; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. The singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It will be further understood that the terms "first," "second," and the like are used to describe various information and that such information should not be limited by these terms. These terms are only used to distinguish one type of information from another and do not denote a particular order or importance. Indeed, the terms "first," "second," and the like are fully interchangeable. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention.

It is further to be understood that while operations are depicted in the drawings in a particular order, this is not to be understood as requiring that such operations be performed in the particular order shown or in serial order, or that all illustrated operations be performed, to achieve desirable results. In certain environments, multitasking and parallel processing may be advantageous.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. An image identification method based on a semi-supervised relationship measurement network is characterized by comprising the following steps:

2. The method of claim 1, wherein data augmenting all tagged data and non-tagged data in the image dataset to obtain an augmented image dataset comprises:

3. The method of claim 1, wherein randomly noising the augmented image dataset and inputting the noised image dataset and the class atom image data into a semi-supervised relational metric network model to obtain different class template comparison scores for tagged data and untagged data, comprises:

4. The method of claim 1, wherein calculating cross-entropy loss and mean-square-error loss based on different classes of template comparison scores comprises:

5. The method of claim 1, wherein the training according to cross entropy loss and mean square error loss to obtain the trained semi-supervised relationship metric network model comprises:

training by using the total loss until the training round reaches a set value;

6. An image recognition apparatus based on a semi-supervised relational metric network, the apparatus comprising:

7. The apparatus of claim 6, wherein the expansion module comprises:

8. The apparatus of claim 6, wherein the comparison module comprises:

9. The apparatus of claim 6, wherein the computing module comprises:

10. The apparatus of claim 6, wherein the training module comprises: