CN112232374B - Irrelevant label filtering method based on depth feature clustering and semantic measurement - Google Patents


Info

Publication number
CN112232374B
Authority
CN
China
Prior art keywords: label, semantic, clustering, cluster, depth
Prior art date
Legal status
Active
Application number
CN202010992837.4A
Other languages
Chinese (zh)
Other versions
CN112232374A (en)
Inventor
蒋雯
苗旺
耿杰
曾庆捷
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202010992837.4A priority Critical patent/CN112232374B/en
Publication of CN112232374A publication Critical patent/CN112232374A/en
Application granted granted Critical
Publication of CN112232374B publication Critical patent/CN112232374B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING; G06F18/00 Pattern recognition
    • G06F18/2411 Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/232 Non-hierarchical clustering techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N3/00 Computing arrangements based on biological models
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/08 Neural networks; learning methods

Abstract

The invention discloses an irrelevant label filtering method based on depth feature clustering and semantic measurement, comprising the following steps: step one, a sensor acquires an image set; step two, a label set corresponding to the image set is established; step three, depth features of the images in the image set are extracted; step four, the depth features are clustered to obtain clusters; step five, a related semantic label set is constructed for the clusters; step six, a label set to be measured is constructed for the clusters; step seven, semantic vectors are generated; step eight, the correlation degree of the semantic vectors is calculated; and step nine, irrelevant labels are filtered according to the correlation degree. The invention clusters a huge amount of sample image data to obtain clusters that pre-classify the sample image data, analyzes the clustered sample image data with high effectiveness and correctness, and measures the relevance of the label semantics, thereby filtering irrelevant labels automatically and improving the generalization and robustness of a deep network.

Description

Irrelevant label filtering method based on depth feature clustering and semantic measurement
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to an irrelevant label filtering method based on deep feature clustering and semantic measurement.
Background
With the development of artificial intelligence technology, deep learning technology has been widely applied and has become an indispensable part of people's work and life, particularly in the fields of computer vision and artificial intelligence. Deep learning technology is a branch of machine learning; it is an algorithm for performing representation learning on data with an artificial neural network as its framework.
The convolutional neural network proposed by Yann LeCun et al. has been widely and successfully applied to various image tasks such as detection, segmentation and object recognition. These applications rely on large amounts of labeled data. The premise for deep learning technology to achieve good results is massive training data, and acquiring such data requires a large number of personnel to label it, a process that incurs high labor and material costs. Even if a pre-trained model is obtained by training a network on unlabeled data with unsupervised techniques, a model with strong generalization ability can only be obtained when the semantic distribution of the training data is correlated with the data to be predicted.
The process of manually labeling data is tedious. For different deep learning tasks, such as object detection and semantic segmentation, the diversity of data sources means that some sample information and labels in the sample data are irrelevant. Because the keyword labels of a sample play a key role in auditing, retrieving and organizing the sample, irrelevant labeling easily causes the annotation to misrepresent the characteristics of the sample data, prolongs the time needed to fit the parameters of a deep learning model and lowers efficiency, especially for deep neural networks with complex structures and many layers. The problem of mislabeled data has long been a key research area in computer vision and artificial intelligence, so to improve the efficiency of deep learning models, techniques for filtering irrelevant labels from a data set need to be studied.
The prior art cannot currently meet the need to filter irrelevant labels from a data set, so a data-set irrelevant-label filtering method is urgently needed that filters irrelevant labels, facilitates subsequent deep learning tasks and improves the generalization and robustness of a deep network.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects in the prior art, an irrelevant label filtering method based on depth feature clustering and semantic measurement that is simple in structure and reasonable in design: a large amount of sample image data is clustered to obtain clusters that pre-classify the sample image data, the clustered sample image data are analyzed with high effectiveness and correctness, and relevance measurement is performed on the label semantics, so that irrelevant labels are filtered automatically and the generalization and robustness of a deep network can be improved.
In order to solve the above technical problems, the invention adopts the following technical solution: an irrelevant label filtering method based on depth feature clustering and semantic measurement, characterized by comprising the following steps:
step one: the sensor acquires an image set X and stores it in a storage unit, X = {x_1, ..., x_i, ..., x_n}, where x_i represents the i-th sample image data, 1 ≤ i ≤ n, and n is a positive integer;
step two: establishing a label set corresponding to the image set X in a storage unit;
step three, extracting depth features of the images in the image set: extract depth features from the sample image data x_i in the image set X, obtaining depth features φ(x_i);
Step four, clustering depth features to obtain cluster clusters: using a predetermined number k as the cluster number to the depth feature phi (x) i ) The clustering is carried out and the cluster is obtained, obtaining a cluster set A, A = { A = { (A) 1 ,...,A f ,...A k F is more than or equal to 1 and less than or equal to k, and k is a positive integer;
step five, constructing the related semantic label set of the clusters: according to the original category label set U in step two, obtain the semantic label of the cluster center of each cluster A_f and take it as the related semantic label y_f of cluster A_f, obtaining the related semantic label set Y = {y_1, ..., y_f, ..., y_k} corresponding to the cluster set A;
Step six, constructing a label set to be measured of the clustering cluster: obtaining a label set P to be measured, P = { P = { (P) 1 ,...,P f ,...P k },P f Representing a cluster A f Obtaining each cluster A according to the original category label set U in the step two by the corresponding label set to be measured f The other category labels except the clustering center are added, and the cluster A is clustered f Adding labels of other classes except the clustering center into a label set P to be measured f
Figure GDA0004093034470000031
t is a positive integer;
step seven, generating semantic vectors: take the related semantic label set Y and the label set to be measured P as input, and obtain all semantic vectors H_f of the related semantic label set Y and all semantic vectors K_fg of each P_f in the label set to be measured P;
Step eight, calculating the correlation degree of the semantic vector: computer according to formula
Figure GDA0004093034470000041
Calculating related semantic label set Y and f-th clustering cluster label set P to be measured f Correlation of each label in fg In which H f Representing Guan Yuyi tag Y in the set of related semantic tags Y f Semantic vector of (2), K fg Representing P in a set P of labels to be measured f The semantic vector of the g-th label;
and step nine, filtering irrelevant labels according to the correlation degree: the labels in cluster A_f whose correlation degree Sim_fg is below the threshold η are deleted.
The above irrelevant label filtering method based on depth feature clustering and semantic measurement is further characterized in that: in step three, the depth features are extracted from the sample image data x_i in the image set X by a deep convolutional residual neural network model pre-trained on the large-scale image dataset ImageNet, and the network model consists of convolutional layers, residual layers and fully connected layers.
The above irrelevant label filtering method based on depth feature clustering and semantic measurement is further characterized in that: in step four, the depth features φ(x_i) are clustered by a spectral clustering algorithm, which specifically comprises the following steps:
step 401: construct the similarity matrix W of the depth features φ(x_i), where W is composed of the elements s_ij,

s_ij = exp(−‖φ(x_i) − φ(x_j)‖² / (2σ²)),

σ denoting the standard deviation of the Gaussian kernel;
step 402: compute the diagonal matrix D, where

d_i = Σ_{j=1}^{n} w_ij,

and w_ij denotes the element in the i-th row and j-th column of the similarity matrix W;
step 403: obtain the Laplacian matrix L of the depth features φ(x_i) from L = D − W;
step 404: perform eigenvalue decomposition on the Laplacian matrix L to construct an eigenvector space, and cluster the eigenvectors in the eigenvector space with a clustering algorithm to obtain the cluster set A = {A_1, ..., A_f, ..., A_k}.
The above irrelevant label filtering method based on depth feature clustering and semantic measurement is further characterized in that: in step seven, the semantic vectors are generated using the near-synonym network model Synonyms.
The above irrelevant label filtering method based on depth feature clustering and semantic measurement is further characterized in that: in step nine, the correlation threshold η is set according to the difference between the cosine distances from the semantic vectors of the original category label set U to the semantic vectors of the related semantic label set Y and the cosine distances from the semantic vectors of the mislabeled label set V to the semantic vectors of the related semantic label set Y.
Compared with the prior art, the invention has the following advantages:
1. the invention has simple structure, reasonable design and convenient realization, use and operation.
2. The method extracts depth features with a deep convolutional residual neural network model pre-trained on the large-scale image dataset ImageNet, combining the strong learning ability of deep convolutional neural networks with the good convergence of residual learning; feature extraction and selection are therefore more robust, the integrity of the image information is protected, and the performance of the result is improved.
3. The invention clusters a huge amount of sample image data to obtain clusters that pre-classify the sample image data, which reduces the time required for manual classification and avoids the inconsistent classification results caused by subjective differences; by analyzing the clustered sample image data, the image set can be screened better, with higher effectiveness and correctness.
4. The invention obtains the semantic vector of each related semantic label y_f in the related semantic label set Y and the semantic vector of the g-th label in each label set to be measured P_f, computes the average of the cosine distances between the g-th label in P_f and each related semantic label in Y, and uses this average as the correlation degree between the g-th label in P_f and the image set X for relevance screening; by measuring the relevance of the label semantics in this way, irrelevant labels are filtered automatically.
In conclusion, the invention is simple in structure and reasonable in design; clusters obtained by clustering a huge amount of sample image data pre-classify the sample image data, the clustered sample image data are analyzed with high effectiveness and correctness, and relevance measurement is performed on the label semantics, so that irrelevant labels are filtered automatically and the generalization and robustness of a deep network can be improved.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The method of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments of the invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For ease of description, spatially relative terms such as "on", "above", "over" and "upper" may be used herein to describe the spatial positional relationship of one device or feature to another device or feature as shown in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above" may include both an orientation of "above" and one of "below". The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
As shown in fig. 1, the present invention comprises the steps of:
Step one: the sensor acquires an image set X and stores the image set X in a storage unit, where X = {x_1, ..., x_i, ..., x_n}, x_i represents the i-th sample image data, 1 ≤ i ≤ n, and n is a positive integer.
In actual use, different types of sample image data are acquired through the sensor; for different objects, the sample image data acquired by the sensor also differ.
Step two: a label set corresponding to the image set X is established in the storage unit. In specific implementation, the data set includes the image set X and the label set, where the label set includes an original category label set U and a mislabeled label set V, U = {u_1, ..., u_p, ..., u_h}, V = {v_1, ..., v_q, ..., v_l}, with 1 ≤ p ≤ h, 1 ≤ q ≤ l, h and l positive integers, and l + h = n. Each sample image data x_i in the image set X corresponds to one label in the original category label set U. The label set V is used to store the finally screened irrelevant labels.
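For illustration only (not part of the patent text), the data established in steps one and two can be held in a simple in-memory structure such as the following Python sketch; the class and field names are hypothetical.

# Minimal sketch (illustrative, not the patent's API) of the data held after
# steps one and two: the image set X, the per-image labels drawn from the
# original category label set U, and the set V that will collect the
# irrelevant labels screened out in step nine.
from dataclasses import dataclass, field
from typing import List

import numpy as np


@dataclass
class LabeledImageSet:
    images: List[np.ndarray]                              # X = {x_1, ..., x_n}, one array per sample image
    labels: List[str]                                     # label of each image, taken from U
    irrelevant: List[str] = field(default_factory=list)   # V: irrelevant labels found in step nine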
Step three, obtaining image depth features: the computer extracts depth features from the sample image data x_i in the image set X, obtaining depth features φ(x_i).
In specific implementation, in order to improve the clustering effect, the sample image data x_i in the image set X need to be converted into an appropriate representation. The computer uses a deep convolutional residual neural network pre-trained on the large-scale image dataset ImageNet and extracts the depth features φ(x_i) of the sample image data x_i acquired by the sensor. This combines the strong learning ability of deep convolutional neural networks with the good convergence of residual learning, makes feature extraction and selection robust, alleviates the lack of image feature detail, protects the integrity of the image information and improves the performance of the result.
Specifically: each layer of the deep convolutional neural network model performs convolution operations on the sample image data x_i with its convolution kernels to extract the features of each x_i; the residual network model adds skip connections to the deep convolutional neural network, so that the initial features of x_i are passed directly to later layers to improve the performance of the result; the extracted features are then fed into the fully connected layers of the model and concatenated to obtain the depth feature φ(x_i) of x_i. The depth feature φ(x_i) carries the feature information of the sample image data x_i, and therefore the quality of φ(x_i) affects the final classification and filtering effect.
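The following Python sketch illustrates one way to realize step three; the choice of a torchvision ResNet-50 and of its pooled penultimate-layer output as the depth feature φ(x_i) is an assumption, since the patent only requires a deep convolutional residual network pre-trained on ImageNet.

# Sketch of step three: extract depth features phi(x_i) with an ImageNet-pretrained
# residual network. ResNet-50 and the 2048-d global-pooling output are assumptions.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()   # drop the classifier; keep the pooled feature
backbone.eval()


@torch.no_grad()
def depth_feature(image_path: str) -> torch.Tensor:
    """Return phi(x_i) for one sample image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    return backbone(x).squeeze(0)   # shape: (2048,)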
Step four, obtaining clusters: cluster the depth features φ(x_i) using a preset number k as the number of clusters, obtaining a cluster set A = {A_1, ..., A_f, ..., A_k}, where 1 ≤ f ≤ k and k is a positive integer.
the spectral clustering algorithm is based on the similarity matrix, converts the common clustering problem into the graph partitioning problem, is established on the basis of spectrogram theory, is not limited by the space shape of a sample during clustering, and is superior to the traditional clustering algorithm. The spectral clustering algorithm starts from the global state when solving, has the advantage of converging to the global optimal solution, cannot fall into the local optimal solution, can ensure the minimum similarity among different classes and the maximum similarity in the same class, and has better performance and application scene than the traditional clustering algorithm. Therefore, the application preferably adopts a spectral clustering algorithm to perform depth feature phi (x) i ) And (6) clustering.
The specific process of clustering with the spectral clustering algorithm is as follows: construct the similarity matrix W of the image set X, where W is composed of the elements s_ij,

s_ij = exp(−‖φ(x_i) − φ(x_j)‖² / (2σ²)),

and σ denotes the standard deviation. The similarity matrix W is denoted W = (w_ij), i, j = 1, ..., n. The computer computes the diagonal matrix D = {d_1, ..., d_i, ..., d_n} according to the formula

d_i = Σ_{j=1}^{n} w_ij.

The Laplacian matrix L of the depth features φ(x_i) is obtained from L = D − W; eigenvalue decomposition is performed on L to construct an eigenvector space, and the eigenvectors in the eigenvector space are clustered by a clustering algorithm to obtain the cluster set A = {A_1, ..., A_f, ..., A_k}.
By clustering the huge amount of sample image data, the clusters pre-classify the sample image data, which reduces the time required for manual classification and avoids the inconsistent classification results caused by subjective differences; analyzing the clustered sample image data allows the image set to be screened better, with higher effectiveness and correctness, and provides a more reliable basis for filtering irrelevant labels.
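A minimal Python sketch of the spectral clustering procedure described above is given here; the unnormalized Laplacian L = D − W follows the text, while the use of k-means on the eigenvector space is an assumption, since the patent only requires "a clustering algorithm" at that stage.

# Sketch of step four: spectral clustering of depth features phi(x_i).
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans


def spectral_clusters(features: np.ndarray, k: int, sigma: float = 1.0) -> np.ndarray:
    """features: (n, d) array of depth features; returns a cluster index per sample."""
    # Gaussian similarity matrix: s_ij = exp(-||phi(x_i) - phi(x_j)||^2 / (2 sigma^2))
    sq_dists = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    D = np.diag(W.sum(axis=1))            # d_i = sum_j w_ij
    L = D - W                             # Laplacian matrix
    # Eigenvector space spanned by the k smallest eigenvalues of L
    _, vecs = eigh(L, subset_by_index=[0, k - 1])
    return KMeans(n_clusters=k, n_init=10).fit_predict(vecs)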
Step five, constructing the related semantic label set: according to the original category label set U in step two, obtain the semantic label of the cluster center of each cluster A_f and take it as the related semantic label y_f of cluster A_f, obtaining the related semantic label set Y = {y_1, ..., y_f, ..., y_k}.
Since each sample image data x_i in the image set X corresponds to a label in the original category label set U, the cluster center of cluster A_f also corresponds to a semantic label in the original category label set U.
Step six, constructing the label set to be measured: obtain the label set to be measured P = {P_1, ..., P_f, ..., P_k}, where P_f represents the label set to be measured corresponding to cluster A_f; the labels corresponding to the cluster elements of A_f other than the cluster center are combined to form the label set to be measured P_f = {P_f1, ..., P_fg, ..., P_ft}, where 1 ≤ g ≤ t and t is a positive integer.
Similarly, since each sample image data x_i in the image set X corresponds to a label in the original category label set U, the cluster elements of A_f other than the cluster center also each correspond to a semantic label in the original category label set U.
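The following Python sketch illustrates steps five and six; representing the cluster center by its nearest member, so that it carries a label from U, is an assumption made for the sketch.

# Sketch of steps five and six: build the related semantic label set Y and the
# per-cluster label sets to be measured P_f.
import numpy as np


def build_label_sets(features, labels, assignments, k):
    """features: (n, d); labels: length-n list from U; assignments: cluster id per sample."""
    Y, P = [], []
    for f in range(k):
        idx = np.where(assignments == f)[0]
        centre = features[idx].mean(axis=0)
        # nearest member stands in for the cluster centre so that it has a label from U
        centre_sample = idx[np.argmin(np.linalg.norm(features[idx] - centre, axis=1))]
        y_f = labels[centre_sample]                        # related semantic label of cluster A_f
        P_f = sorted({labels[i] for i in idx} - {y_f})     # remaining category labels in A_f
        Y.append(y_f)
        P.append(P_f)
    return Y, P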
Step seven, generating semantic vectors: take the related semantic label set Y and the label set to be measured P as input, and obtain all semantic vectors H_f of the related semantic label set Y and all semantic vectors K_fg of each P_f in the label set to be measured P.
It should be noted that this application preferably uses the near-synonym network model Synonyms. Synonyms is a trained word2vec model; word2vec is trained on a large amount of data using context information and maps words into a low-dimensional space, and algorithmically it is based on distance rather than exact matching and on semantics rather than surface form. As a trained word2vec model, Synonyms can map each word to a vector, can be used to represent the relationship between words, and has the ability to measure the degree of correlation between words.
A word is input into the near-synonym network model Synonyms for prediction, the model outputs hidden-layer variables, and the semantic vector corresponding to the word is computed from these hidden-layer variables. In other words, the near-synonym network model Synonyms outputs the mathematical representation of a word, namely its semantic vector, for a given input word.
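A Python sketch of step seven is given below; it assumes a generic word2vec-style lookup (here a gensim KeyedVectors model loaded from a hypothetical path) as a stand-in for the Synonyms toolkit's word-vector interface.

# Sketch of step seven: map each label word to a semantic vector with a trained
# word2vec model. The file path and the use of gensim are assumptions; the
# Synonyms toolkit's own word vectors could back the same lookup.
from gensim.models import KeyedVectors

wv = KeyedVectors.load("label_word2vec.kv")   # hypothetical path to a trained model


def semantic_vector(label: str):
    """Return the semantic vector of a label word (an H_f or K_fg)."""
    return wv[label]


def semantic_vectors(Y, P):
    H = [semantic_vector(y_f) for y_f in Y]                  # vectors H_f for Y
    K = [[semantic_vector(p) for p in P_f] for P_f in P]     # vectors K_fg for each P_f
    return H, K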
Step eight, calculating the correlation degree of the semantic vectors: the computer calculates, according to the formula

Sim_fg = (1/k) · Σ_{f=1}^{k} (H_f · K_fg) / (‖H_f‖ · ‖K_fg‖),

the correlation degree Sim_fg between the related semantic label set Y and each label in the label set to be measured P_f of the f-th cluster, where H_f represents the semantic vector of the related semantic label y_f in the related semantic label set Y and K_fg represents the semantic vector of the g-th label in P_f.
Sim_fg represents the average of the cosine distances between the g-th label in the label set to be measured P_f and each related semantic label in the related semantic label set Y; Sim_fg is therefore taken as the correlation degree between the g-th label in P_f and the image set X, and used as the basis for deciding whether the g-th label in P_f should be filtered.
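A minimal Python sketch of step eight, computing Sim_fg as the average cosine similarity between K_fg and all semantic vectors of Y, consistent with the formula above:

# Sketch of step eight: Sim_fg = average cosine similarity between K_fg and the
# semantic vectors of all related semantic labels in Y.
import numpy as np


def correlation(H, K_fg):
    """H: list of k vectors for Y; K_fg: vector of the g-th label in P_f."""
    sims = [np.dot(h, K_fg) / (np.linalg.norm(h) * np.linalg.norm(K_fg)) for h in H]
    return float(np.mean(sims))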
Step nine, filtering irrelevant labels according to the correlation degree: in the original category label set U, the labels in cluster A_f whose correlation degree Sim_fg is below the threshold η are deleted; in the image set X, the sample image data x_i corresponding to the labels in cluster A_f whose correlation degree Sim_fg is below the threshold η are deleted, yielding a trainable data set. This reduces the time needed to fit the parameters of the deep learning model, improves fitting efficiency, and works well in practice.
In specific implementation, the correlation threshold η is set according to the difference between the cosine distances from the semantic vectors of the original category label set U to the semantic vectors of the related semantic label set Y and the cosine distances from the semantic vectors of the mislabeled label set V to the semantic vectors of the related semantic label set Y.
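The following Python sketch illustrates step nine together with one way to set η from the description above; taking the midpoint between the two average cosine similarities is an assumption, since the patent only states that η is set from their difference.

# Sketch of step nine: choose eta between the typical similarity of original-category
# labels (U vs. Y) and of mislabeled labels (V vs. Y), then delete labels in P_f whose
# correlation Sim_fg falls below eta. The midpoint rule for eta is an assumption.
import numpy as np


def set_threshold(sim_U_to_Y, sim_V_to_Y):
    """Each argument: list of cosine similarities to the related semantic labels."""
    return 0.5 * (float(np.mean(sim_U_to_Y)) + float(np.mean(sim_V_to_Y)))


def filter_irrelevant(P_f, sims_f, eta):
    """Split the labels of cluster A_f into kept and removed by the threshold eta."""
    kept = [p for p, s in zip(P_f, sims_f) if s >= eta]
    removed = [p for p, s in zip(P_f, sims_f) if s < eta]
    return kept, removed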
The above embodiments are only examples of the present invention, and are not intended to limit the present invention, and all simple modifications, changes and equivalent structural changes made to the above embodiments according to the technical spirit of the present invention still fall within the protection scope of the technical solution of the present invention.

Claims (5)

1. An irrelevant label filtering method based on depth feature clustering and semantic measurement, characterized by comprising the following steps:
step one: the sensor acquires an image set X and stores it in a storage unit, X = {x_1, ..., x_i, ..., x_n}, where x_i represents the i-th sample image data, 1 ≤ i ≤ n, and n is a positive integer;
step two: establishing a label set corresponding to the image set X in a storage unit, the label set comprising an original category label set U and a mislabeled label set V;
step three, extracting depth features of the images in the image set: extract depth features from the sample image data x_i in the image set X, obtaining depth features φ(x_i);
Step four, clustering depth features to obtain clusters: using a predetermined number k as the cluster number to the depth feature phi (x) i ) Clustering is carried out to obtain a cluster set A, A = { A = } 1 ,...,A f ,...A k F is more than or equal to 1 and less than or equal to k, and k is a positive integer;
step five, constructing the related semantic label set of the clusters: according to the original category label set U in step two, obtain the semantic label of the cluster center of each cluster A_f and take it as the related semantic label y_f of cluster A_f, obtaining the related semantic label set Y = {y_1, ..., y_f, ..., y_k} corresponding to the cluster set A;
step six, constructing the label set to be measured of the clusters: obtain the label set to be measured P = {P_1, ..., P_f, ..., P_k}, where P_f represents the label set to be measured corresponding to cluster A_f; according to the original category label set U in step two, obtain the category labels of each cluster A_f other than that of the cluster center and add them to the label set to be measured P_f corresponding to cluster A_f, P_f = {P_f1, ..., P_fg, ..., P_ft}, where 1 ≤ g ≤ t and t is a positive integer;
step seven, generating semantic vectors: take the related semantic label set Y and the label set to be measured P as input, and obtain all semantic vectors H_f of the related semantic label set Y and all semantic vectors K_fg of each P_f in the label set to be measured P;
Step eight, calculating the correlation degree of the semantic vector: computer according to formula
Figure FDA0004093034450000022
Calculating related semantic label set Y and f-th clustering cluster label set P to be measured f Correlation of each label in fg Wherein H f | | represents the Guan Yuyi label Y in the relevant semantic label set Y f Of the semantic vector, | K fg I represents P in label set P to be measured f The semantic vector of the g-th label;
and step nine, filtering irrelevant labels according to the correlation degree: the labels in cluster A_f whose correlation degree Sim_fg is below the threshold η are deleted.
2. The irrelevant label filtering method based on depth feature clustering and semantic measurement according to claim 1, characterized in that: in step three, the depth features are extracted from the sample image data x_i in the image set X by a deep convolutional residual neural network model pre-trained on the large-scale image dataset ImageNet, and the network model consists of convolutional layers, residual layers and fully connected layers.
3. The irrelevant label filtering method based on depth feature clustering and semantic measurement according to claim 1, characterized in that: in step four, the depth features φ(x_i) are clustered by a spectral clustering algorithm, which specifically comprises the following steps:
step 401: construct the similarity matrix W of the depth features φ(x_i), where W is composed of the elements s_ij,

s_ij = exp(−‖φ(x_i) − φ(x_j)‖² / (2σ²)),

σ representing the standard deviation;
step 402: compute the diagonal matrix D = {d_1, ..., d_i, ..., d_n}, where

d_i = Σ_{j=1}^{n} w_ij;
step 403: obtain the Laplacian matrix L of the depth features φ(x_i) from L = D − W;
step 404: perform eigenvalue decomposition on the Laplacian matrix L to construct an eigenvector space, and cluster the eigenvectors in the eigenvector space with a clustering algorithm to obtain the cluster set A = {A_1, ..., A_f, ..., A_k}.
4. The irrelevant label filtering method based on depth feature clustering and semantic measurement according to claim 1, characterized in that: in step seven, the semantic vectors are generated using the near-synonym network model Synonyms.
5. The irrelevant label filtering method based on depth feature clustering and semantic measurement according to claim 1, characterized in that: in step nine, the correlation threshold η is set according to the difference between the cosine distances from the semantic vectors of the original category label set U to the semantic vectors of the related semantic label set Y and the cosine distances from the semantic vectors of the mislabeled label set V to the semantic vectors of the related semantic label set Y.
CN202010992837.4A 2020-09-21 2020-09-21 Irrelevant label filtering method based on depth feature clustering and semantic measurement Active CN112232374B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010992837.4A CN112232374B (en) 2020-09-21 2020-09-21 Irrelevant label filtering method based on depth feature clustering and semantic measurement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010992837.4A CN112232374B (en) 2020-09-21 2020-09-21 Irrelevant label filtering method based on depth feature clustering and semantic measurement

Publications (2)

Publication Number Publication Date
CN112232374A (en) 2021-01-15
CN112232374B (en) 2023-04-07

Family

ID=74108089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010992837.4A Active CN112232374B (en) 2020-09-21 2020-09-21 Irrelevant label filtering method based on depth feature clustering and semantic measurement

Country Status (1)

Country Link
CN (1) CN112232374B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111180B (en) * 2021-03-22 2022-01-25 杭州祺鲸科技有限公司 Chinese medical synonym clustering method based on deep pre-training neural network
CN113435308B (en) * 2021-06-24 2023-05-30 平安国际智慧城市科技股份有限公司 Text multi-label classification method, device, equipment and storage medium
CN114998634B (en) * 2022-08-03 2022-11-15 广州此声网络科技有限公司 Image processing method, image processing device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017084267A1 (en) * 2015-11-18 2017-05-26 乐视控股(北京)有限公司 Method and device for keyphrase extraction
CN111080551A (en) * 2019-12-13 2020-04-28 太原科技大学 Multi-label image completion method based on depth convolution characteristics and semantic neighbor

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120158686A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Image Tag Refinement
CN103092911B (en) * 2012-11-20 2016-02-03 北京航空航天大学 A kind of mosaic society label similarity is based on the Collaborative Filtering Recommendation System of k nearest neighbor
US9082047B2 (en) * 2013-08-20 2015-07-14 Xerox Corporation Learning beautiful and ugly visual attributes
US20150347895A1 (en) * 2014-06-02 2015-12-03 Qualcomm Incorporated Deriving relationships from overlapping location data
US20180300315A1 (en) * 2017-04-14 2018-10-18 Novabase Business Solutions, S.A. Systems and methods for document processing using machine learning
US10482323B2 (en) * 2017-08-22 2019-11-19 Autonom8, Inc. System and method for semantic textual information recognition
CN107563444A (en) * 2017-09-05 2018-01-09 浙江大学 A zero-shot image classification method and system
RU2711125C2 (en) * 2017-12-07 2020-01-15 Общество С Ограниченной Ответственностью "Яндекс" System and method of forming training set for machine learning algorithm
US11194842B2 (en) * 2018-01-18 2021-12-07 Samsung Electronics Company, Ltd. Methods and systems for interacting with mobile device
CN111177444A (en) * 2020-01-02 2020-05-19 杭州创匠信息科技有限公司 Image marking method and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017084267A1 (en) * 2015-11-18 2017-05-26 乐视控股(北京)有限公司 Method and device for keyphrase extraction
CN111080551A (en) * 2019-12-13 2020-04-28 太原科技大学 Multi-label image completion method based on depth convolution characteristics and semantic neighbor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Yan (李艳); Jia Junzhi (贾君枝). Research on a label tree construction method based on the vector space model. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2014, (03), full text. *

Also Published As

Publication number Publication date
CN112232374A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN112232374B (en) Irrelevant label filtering method based on depth feature clustering and semantic measurement
Kodirov et al. Semantic autoencoder for zero-shot learning
Zhang et al. Detecting densely distributed graph patterns for fine-grained image categorization
Ott et al. A deep learning approach to identifying source code in images and video
Wan et al. BlastNeuron for automated comparison, retrieval and clustering of 3D neuron morphologies
CA3066029A1 (en) Image feature acquisition
CN109829467A (en) Image labeling method, electronic device and non-transient computer-readable storage medium
Yu et al. A computational model for object-based visual saliency: Spreading attention along gestalt cues
WO2017151759A1 (en) Category discovery and image auto-annotation via looped pseudo-task optimization
CN113887661B (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN110189305B (en) Automatic analysis method for multitasking tongue picture
CN109284668B (en) Pedestrian re-identification method based on distance regularization projection and dictionary learning
CN109255289A (en) A kind of across aging face identification method generating model based on unified formula
Sun et al. Global-local label correlation for partial multi-label learning
Wu et al. Plant leaf identification based on shape and convolutional features
CN111882000A (en) Network structure and method applied to small sample fine-grained learning
CN108805181B (en) Image classification device and method based on multi-classification model
Pierce et al. Reducing annotation times: Semantic segmentation of coral reef survey images
CN114610924A (en) Commodity picture similarity matching search method and system based on multi-layer classification recognition model
Yu et al. An image-based automatic recognition method for the flowering stage of maize
CN115392474A (en) Local perception map representation learning method based on iterative optimization
CN115376178A (en) Unknown domain pedestrian re-identification method and system based on domain style filtering
CN110941994B (en) Pedestrian re-identification integration method based on meta-class-based learner
Mao et al. A Transfer Learning Method with Multi-feature Calibration for Building Identification
Kumar et al. Image classification in python using Keras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant