CN113420821A - Multi-label learning method based on local correlation of labels and features - Google Patents

Multi-label learning method based on local correlation of labels and features

Info

Publication number
CN113420821A
CN113420821A (application CN202110734886.2A)
Authority
CN
China
Prior art keywords
matrix
correlation
data
local
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110734886.2A
Other languages
Chinese (zh)
Inventor
程倩倩
黄�俊
张辉宜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Technology AHUT
Original Assignee
Anhui University of Technology AHUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Technology (AHUT)
Priority to CN202110734886.2A
Publication of CN113420821A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/245 Classification techniques relating to the decision surface
    • G06F 18/2451 Classification techniques relating to the decision surface linear, e.g. hyperplane

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-label learning method based on the local correlation of labels and features, belonging to the technical field of machine learning. While considering and exploiting local label correlation for multi-label classification, the method additionally incorporates local feature correlation. The training data are decomposed into several data subsets by clustering; within each data subset, the local label correlation and the local feature correlation are obtained through label manifold regularization and feature manifold regularization; a model coefficient matrix is learned separately for each group of instances; and a new regularization term is added to account for the relations among all clustered data subsets. Feature correlation helps remove redundant features, and since label correlation and feature correlation are not globally shared, assigning different label and feature correlations to different instances through local label correlation and local feature correlation improves the classification performance of the multi-label classification model and better performs the multi-label learning task.

Description

Multi-label learning method based on local correlation of labels and features
Technical Field
The invention relates to the technical field of machine learning, in particular to a multi-label learning method based on local correlation of labels and features.
Background
In traditional supervised learning, an instance usually corresponds to a single category label, but in real life an object often carries multiple semantics. For example, a news report may cover several subjects at once and can be assigned, from different aspects, to categories such as politics, economy, and sports; a single film may belong to the "action", "fantasy", and "adventure" genres at the same time; and a published academic paper generally lists several keywords to improve its retrieval efficiency in a retrieval system. Multi-label learning is used to solve such classification problems, in which each object is annotated with multiple labels. With growing attention from scholars, multi-label learning has developed rapidly and has been successfully applied in research fields such as information retrieval, image classification, clinical data analysis, and biomedical classification.
Multi-label learning is an important research topic in machine learning and data mining; its goal is to train a classification model and use the trained model to predict all relevant labels for unseen instances. Many algorithms address the multi-label learning problem, and existing methods can be grouped, according to how they exploit label correlations, into first-order, second-order, and high-order strategies. First-order strategies assume that the labels are mutually independent, train each label separately during multi-label classification, and ignore the interactions between labels, as in the BR algorithm. Second-order strategies consider correlations between pairs of labels, as in CLR and LPLC; they exploit the relationship between two labels, but one label may depend on several labels simultaneously, which can make classification inaccurate. High-order strategies consider the interactions within random subsets of labels or the association between each label and all remaining labels, as in CC and LLSF-DL; although high-order strategies mine strong label correlations, they lead to more complex computation.
In a big-data environment, the semantics of the data are more complex and the number of labels in a data set is larger, which makes the labeling process difficult and poses great challenges for multi-label learning. In multi-label learning, certain correlations exist among labels and among features: learning label correlations helps improve the generalization performance of the model, and learning feature correlations helps remove redundant features. In practical applications, however, a dependency among labels or features may be shared by only a subset of the instances rather than all of them, so imposing global label and feature correlations adds unnecessary constraints on instances that do not contain those dependencies and thereby harms the performance of the model.
Current multi-label learning methods do not consider combining the local correlations of labels and features and therefore cannot learn accurate correlations for given data. Scholars have proposed many multi-label learning methods that consider feature and label correlations. For example, the multi-label learning method with global and local label correlations (GLOCAL) integrates global and local label correlations through label manifold regularization and handles both the full-label and missing-label cases; its advantage is that, to avoid the influence of missing labels on the label correlations, it learns the Laplacian matrix directly instead of specifying any correlation metric or label correlation matrix, and it controls the label correlations on the output. However, that method considers only label correlations and cannot improve model performance well. The dual-graph robust multi-label feature selection method (DRMFS) uses feature-graph regularization and label-graph regularization to obtain label and feature correlations, but it considers only the global correlations of labels and features, not the combination of their local correlations. The missing-label feature selection method based on label compression and local feature correlation (FSLCLC) obtains the local correlation of features through manifold regularization, but when applying feature manifold regularization it solves the same coefficient matrix for all subsets; considering that the feature and label correlations in each subset may differ, the corresponding coefficient matrices should also differ, and learning a single shared coefficient matrix degrades model performance.
The invention aims to explore the local correlations of features and labels while learning different model coefficients for different data subsets, which assigns more accurate correlations to different subsets and further improves the classification performance of the model.
A search found Chinese patent application No. ZL201911306128.X, filed on December 18, 2019 and entitled "Latent category discovery and classification method in multi-label classification". That application integrates known-label classification with latent-label discovery and classification in a unified framework: a non-negative matrix factorization technique decomposes the feature matrix into an approximate solution of the complete category label matrix and a coefficient matrix, the known part of the approximate solution is constrained to be consistent with the true values, and at the same time a classification model from sample features to complete labels is built to discover latent label types. Through latent label discovery, valuable implicit information in the data is mined; using the correlations between known and latent labels, strongly correlated classes are constrained to have similar classification model coefficients, yielding approximate classification predictions. Known-label classification and latent-label classification guide and promote each other, finally improving the classification performance on both and better performing the multi-label learning task.
Disclosure of Invention
1. Technical problem to be solved by the invention
The invention provides a multi-label learning method based on the local correlation of labels and features, which assigns different correlations to different data subsets, removes redundant features, and solves a model coefficient matrix for each data subset, so that multi-label classification becomes more accurate.
2. Technical scheme
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
The invention discloses a multi-label learning method based on the local correlation of labels and features, which comprises the following steps:
S1, extracting features from the training data to obtain the feature representation matrix X of the data, and class-labeling the training data to establish the known class label matrix Y of the data;
S2, clustering the feature representation matrix X into g data subsets $\{X_1, \dots, X_g\}$ with the k-means method, dividing the known class label matrix Y into the g corresponding label subsets $\{Y_1, \dots, Y_g\}$ according to the data subsets of the feature representation matrix X, and obtaining the cluster center matrix C at the same time;
S3, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each of the g data subsets, calculating the Laplacian matrices $L_B^m$ and $L_S^m$ of the corresponding matrices, and calculating the similarity between the data subsets from the cluster center matrix C obtained in S2;
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the class labels $Y_m$ as the classifier;
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn: the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar;
S6, constraining the model coefficients obtained for similar subsets to be similar through a regularization term, i.e. adding a regularization constraint that similar groups have similar model coefficients, to obtain the final target model to be solved;
S7, obtaining g final classification models through the learning of steps S1-S6; given a test sample t, computing the distances between t and the g cluster centers obtained in step S2, selecting the classification models corresponding to the r nearest cluster subsets, feeding t into these r classification models, and fusing their outputs into the final result of the test sample categories (a minimal training-side sketch in Python follows this list).
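To make the flow of S1-S7 concrete, the following minimal training-side sketch in Python is offered purely as an illustration; it is not the patent's reference implementation, and the function name fit_baseline, the toy data, and all parameter defaults are our own assumptions. It performs S2 with scikit-learn's KMeans and fits each subset's model with the plain ridge solution of objective (4), omitting the manifold and inter-subset terms of formulas (5)-(7); those terms, and the prediction step S7, are sketched in the embodiment below.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_baseline(X, Y, g=4, lam=1e-3, seed=0):
    """S2: cluster X and split Y; S4: one ridge-regularized linear model per subset.
    Sketch only: the manifold terms of formulas (5)-(7) are omitted here."""
    km = KMeans(n_clusters=g, n_init=10, random_state=seed).fit(X)
    C = km.cluster_centers_                     # g x d cluster center matrix
    d = X.shape[1]
    W = []
    for m in range(g):
        Xm, Ym = X[km.labels_ == m], Y[km.labels_ == m]
        # closed-form minimizer of ||Xm Wm - Ym||_F^2 + lam ||Wm||_F^2
        W.append(np.linalg.solve(Xm.T @ Xm + lam * np.eye(d), Xm.T @ Ym))
    return C, W

# toy usage (synthetic data, for shape-checking only): 200 samples, 10 features, 5 labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = (rng.random((200, 5)) > 0.7).astype(float)
C, W = fit_baseline(X, Y)
print(C.shape, W[0].shape)   # (4, 10) (10, 5)
```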
3. Advantageous effects
Compared with the prior art, the technical scheme provided by the invention has the following remarkable effects:
(1) Most existing multi-label learning methods consider only the correlations of labels when performing multi-label classification, and rarely the correlations between features. In practical applications, feature correlations and label correlations may not hold globally but only over a subset of the data, so using global label and feature correlations can harm the accuracy of class label classification. The method fuses the local correlations of labels and features and the classification into a unified framework: the feature matrix is clustered into several data subsets, and feature and label manifold regularization is used to learn each data subset's own feature and label correlations, making the learned correlations more accurate.
(2) In the multi-label learning method based on the local correlation of labels and features, learning with label correlations helps improve the generalization performance of the model, while feature correlations help remove redundancy among the features and obtain a more compact feature space, further improving the accuracy of class label classification.
(3) The multi-label learning method based on the local correlation of labels and features combines the local correlations of features and labels, learns a classification model coefficient matrix for each data subset, and constrains similar data subsets to have similar model coefficients, thereby taking the relations between data subsets into account and better performing the multi-label learning task.
Drawings
FIG. 1 is a diagram of the model framework of the method of the present invention, incorporating the local correlations of labels and features.
Detailed Description
For a further understanding of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples.
Example 1
With reference to FIG. 1, the multi-label learning method based on the local correlation between labels and features in this embodiment comprises two stages, model construction and training, and label prediction. The specific steps are as follows:
(1) model construction and training:
S1, performing feature extraction and class labeling on the training data to obtain the feature representation matrix X of the data and establish the known class label matrix Y. Specifically:
Assume that the training data features are represented as a real matrix $X \in \mathbb{R}^{n \times d}$, where n denotes the number of samples, d denotes the number of features, and $\mathbb{R}$ denotes the real number field. $Y \in \{0, 1\}^{n \times q}$ is the class label matrix of the known classes of the training data, and q denotes the number of known class labels; $y_{ij}$ denotes the element in row i and column j of the matrix Y, $y_{ij} = 1$ means that the i-th sample belongs to the j-th class label, and otherwise $y_{ij} = 0$; i is a positive integer between 1 and n, and j is a positive integer between 1 and q.
S2, according to the feature representation matrix X of the training data, dividing the training data into g data subsets $\{X_1, \dots, X_g\}$ by k-means clustering and obtaining the g corresponding label subsets $\{Y_1, \dots, Y_g\}$. Specifically:
The feature representation matrix X of the training data is divided by k-means clustering into g data subsets $\{X_1, \dots, X_g\}$, each group's feature matrix being $X_m \in \mathbb{R}^{n_m \times d}$, where $n_m$ denotes the number of samples in the m-th subset $X_m$; Y is divided according to the resulting data subsets into g label subsets $\{Y_1, \dots, Y_g\}$ with $Y_m \in \{0, 1\}^{n_m \times q}$, where $Y_m$ is the label subset of Y corresponding to $X_m$ and g denotes the number of clustered data subsets. Clustering the training data X also yields the cluster center matrix $C \in \mathbb{R}^{g \times d}$, whose i-th row $c_i \in \mathbb{R}^{1 \times d}$ is the cluster center vector of the i-th data subset $X_i$.
S3, according to the g data subsets obtained in step S2, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each subset, calculating from them the corresponding Laplacian matrices $L_B^m$ and $L_S^m$, and calculating the similarity among the g data subsets from the cluster centers obtained in step S2 (the similarity describes how far apart the clustered data subsets lie in the space). Specifically:
Let $B^m \in \mathbb{R}^{q \times q}$ be the label correlation matrix corresponding to the m-th data subset, with $b_{ij}^m$ the element in any row i and column j. The label correlation matrix is obtained by computing cosine similarity via formula (1), where $y_{hi}^m$ denotes the element in row h and column i of $Y_m$ and $y_{hj}^m$ denotes the element in row h and column j of $Y_m$:

$$b_{ij}^m = \frac{\sum_{h=1}^{n_m} y_{hi}^m\, y_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(y_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(y_{hj}^m\right)^2}} \qquad (1)$$

$L_B^m \in \mathbb{R}^{q \times q}$ is the Laplacian matrix of the label correlation matrix $B^m$:

$$L_B^m = \operatorname{diag}\!\left(\operatorname{sum}(B^m)\right) - B^m$$

where $\operatorname{sum}(B^m)$ sums each row of $B^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(B^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(B^m)$, whose diagonal elements correspond to $\operatorname{sum}(B^m)$ and whose other elements are all 0.
Let $S^m \in \mathbb{R}^{d \times d}$ be the feature correlation matrix corresponding to the m-th data subset, with $s_{ij}^m$ the element in any row i and column j. The feature correlation matrix is obtained by computing cosine similarity via formula (2), where $x_{hi}^m$ denotes the element in row h and column i of $X_m$ and $x_{hj}^m$ denotes the element in row h and column j of $X_m$:

$$s_{ij}^m = \frac{\sum_{h=1}^{n_m} x_{hi}^m\, x_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(x_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(x_{hj}^m\right)^2}} \qquad (2)$$

$L_S^m \in \mathbb{R}^{d \times d}$ is the Laplacian matrix of the feature correlation matrix $S^m$:

$$L_S^m = \operatorname{diag}\!\left(\operatorname{sum}(S^m)\right) - S^m$$

where $\operatorname{sum}(S^m)$ sums each row of $S^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(S^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(S^m)$, whose diagonal elements correspond to $\operatorname{sum}(S^m)$ and whose other elements are all 0.
Let $A \in \mathbb{R}^{g \times g}$ be the similarity matrix among the g data subsets, with $a_{ij}$ the element in row i and column j. The similarity matrix between the data subsets is obtained by computing cosine similarity via formula (3), where $c_{ih}$ denotes the element in row i and column h of C and $c_{jh}$ denotes the element in row j and column h of C:

$$a_{ij} = \frac{\sum_{h=1}^{d} c_{ih}\, c_{jh}}{\sqrt{\sum_{h=1}^{d} c_{ih}^2}\ \sqrt{\sum_{h=1}^{d} c_{jh}^2}} \qquad (3)$$
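As an illustration of formulas (1)-(3), the sketch below (our own code; the helper names are assumptions) builds $B^m$, $S^m$, their Laplacians, and the subset-similarity matrix A with scikit-learn's cosine_similarity, which computes exactly the cosine measure above when applied to the columns of $Y_m$ and $X_m$ (i.e. the rows of their transposes) and to the rows of C. An all-zero column (a label absent from a subset) simply yields zero similarity under scikit-learn's convention.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def laplacian(M):
    """L = diag(sum(M)) - M, as in the definitions of L_B^m and L_S^m."""
    return np.diag(M.sum(axis=1)) - M

def subset_correlations(Xm, Ym):
    Bm = cosine_similarity(Ym.T)   # formula (1): q x q label correlations
    Sm = cosine_similarity(Xm.T)   # formula (2): d x d feature correlations
    return Bm, laplacian(Bm), Sm, laplacian(Sm)

def subset_similarity(C):
    return cosine_similarity(C)    # formula (3): g x g, rows of C are centers

# toy usage
rng = np.random.default_rng(1)
Xm = rng.normal(size=(50, 8))
Ym = (rng.random((50, 4)) > 0.6).astype(float)
Bm, LBm, Sm, LSm = subset_correlations(Xm, Ym)
print(LBm.shape, LSm.shape)   # (4, 4) (8, 8)
```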
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the label subset $Y_m$ as the classifier. Specifically:
A multiple linear regression model is used as the classifier, building the linear classification model $W_m$. Based on each subset's feature representation $X_m$, a linear classification model $f(X_m, W_m) = X_m W_m$ mapping to the class label matrix $Y_m$ is learned for each of the g data subsets, and an F-norm regularization constraint is imposed on the model parameters $W_m \in \mathbb{R}^{d \times q}$ to control the complexity of the model, yielding the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 \right) \qquad (4)$$

In formula (4), the matrices $W_m$ are the model parameters to be solved, and λ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{0}, 10^{1}\}$. The F-norm here measures the error between the classifier output $X_m W_m$ and $Y_m$; since the data are divided into g subsets, the objective takes the form of a minimized sum.
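Because objective (4) decouples across the g subsets and is a standard ridge problem, each $W_m$ has the closed form $W_m = (X_m^{\top} X_m + \lambda I)^{-1} X_m^{\top} Y_m$; this is a known property of such objectives, not stated explicitly in the patent. The short check below (our own sketch) verifies numerically that the gradient $2X_m^{\top}(X_m W_m - Y_m) + 2\lambda W_m$ vanishes at this solution:

```python
import numpy as np

rng = np.random.default_rng(2)
Xm = rng.normal(size=(60, 8))
Ym = (rng.random((60, 4)) > 0.6).astype(float)
lam = 1e-2

# closed-form minimizer of formula (4) for one subset
Wm = np.linalg.solve(Xm.T @ Xm + lam * np.eye(8), Xm.T @ Ym)
# gradient of formula (4) at Wm should be (numerically) zero
grad = 2 * Xm.T @ (Xm @ Wm - Ym) + 2 * lam * Wm
print(np.allclose(grad, 0, atol=1e-6))   # True
```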
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn; the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar:
(1) First, the local label correlation is modeled: in the m-th data subset, assume that the more correlated any two labels $y_{\cdot i}^m$ and $y_{\cdot j}^m$ of the label subset $Y_m$ are, the more similar the outputs $X_m w_{\cdot i}^m$ and $X_m w_{\cdot j}^m$ of the linear models corresponding to these two labels ($1 \le i, j \le q$), and the less correlated, the less similar; the degree of correlation is determined by the label correlation matrix $B^m$ calculated by formula (1). Here $w_{\cdot i}^m$ and $w_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $W_m$, and $y_{\cdot i}^m$ and $y_{\cdot j}^m$ are likewise column vectors denoting the i-th and j-th columns of $Y_m$. The local label correlation is controlled on the output and constrained by manifold regularization. Specifically:
The correlation between labels is modeled, and manifold regularization is used to constrain the outputs $X_m W_m$ of the models corresponding to correlated labels, giving the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) \right) \qquad (5)$$

In formula (5), the matrix $F_m \in \mathbb{R}^{n_m \times q}$ with $F_m = X_m W_m$; $W_m$ are the model parameters to be solved; α is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}\}$; $\operatorname{tr}(\cdot)$ denotes the matrix trace; and $L_B^m \in \mathbb{R}^{q \times q}$ is the Laplacian matrix of the class label correlation matrix $B^m$.
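The trace form in formula (5) rests on the standard graph-Laplacian identity $\operatorname{tr}(F L_B F^{\top}) = \tfrac{1}{2} \sum_{i,j} b_{ij} \| f_{\cdot i} - f_{\cdot j} \|^2$, where $f_{\cdot i}$ is the i-th column of $F = X_m W_m$; this is what makes the outputs of correlated labels similar. A quick numerical confirmation (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.normal(size=(30, 5))               # stands in for F_m = X_m W_m
B = rng.random((5, 5)); B = (B + B.T) / 2  # symmetric label correlation matrix
L = np.diag(B.sum(axis=1)) - B             # Laplacian, as defined above

lhs = np.trace(F @ L @ F.T)
rhs = 0.5 * sum(B[i, j] * np.linalg.norm(F[:, i] - F[:, j])**2
                for i in range(5) for j in range(5))
print(np.isclose(lhs, rhs))                # True
```

The same identity, applied to the rows of $W_m$ with $L_S^m$, underlies the feature-manifold term $\operatorname{tr}(W_m^{\top} L_S^m W_m)$ of formula (6) below.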
(2) The local feature correlation is modeled: in the m-th data subset, assume that the more similar any two features $x_{\cdot k}^m$ and $x_{\cdot l}^m$ of the data subset $X_m$ are, the more similar the model coefficients $w_{k \cdot}^m$ and $w_{l \cdot}^m$ corresponding to these two features ($1 \le k, l \le d$), and the less similar, the less so; the degree of similarity is determined by the feature correlation matrix $S^m$ calculated by formula (2). Here $x_{\cdot k}^m$ and $x_{\cdot l}^m$ are column vectors denoting the k-th and l-th columns of $X_m$, and $w_{k \cdot}^m$ and $w_{l \cdot}^m$ are row vectors denoting the k-th and l-th rows of $W_m$. The local feature correlation is controlled on the model coefficients and constrained by manifold regularization. Specifically:
The correlation between features is modeled, and manifold regularization is used to constrain the model coefficients $W_m$ corresponding to correlated features, giving the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) \right) \qquad (6)$$

In formula (6), $W_m$ are the model coefficients to be solved; β is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}\}$; $\operatorname{tr}(\cdot)$ denotes the matrix trace; and $L_S^m \in \mathbb{R}^{d \times d}$ is the Laplacian matrix of the feature correlation matrix $S^m$.
S6, considering the influence on the model coefficients of the similarity among the g data subsets $\{X_1, \dots, X_g\}$ obtained in step S2, a regularization term constrains similar data subsets to yield similar model coefficients: if $X_i$ and $X_j$ are similar, then $W_i$ and $W_j$ are also similar, and otherwise $W_i$ and $W_j$ are not similar, where $X_i$ and $X_j$ denote the i-th and j-th data subsets and $W_i$ and $W_j$ denote the model coefficient matrices corresponding to them. Specifically:
The similarity among the data subsets is modeled, and a regularization constraint that similar groups have similar model coefficients is added, giving the final objective to be solved:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) + \gamma \sum_{j=1}^{g} a_{mj} \left\| W_m - W_j \right\|_F^2 \right) \qquad (7)$$

In formula (7), $W_m$ and $W_j$ both denote model coefficients: $W_m$ is the coefficient matrix corresponding to the m-th subset $X_m$ and $W_j$ that of the j-th subset $X_j$, with m and j positive integers between 1 and g; γ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}\}$; and $a_{mj}$, the element in row m and column j of the matrix $A \in \mathbb{R}^{g \times g}$, denotes the similarity between the data subsets $X_m$ and $X_j$.
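Objective (7) is smooth in $W_1, \dots, W_g$, so plain gradient descent suffices for illustration. The following sketch is our own (the patent does not prescribe an optimizer, and the step size, iteration count, and all names are assumptions), with hand-derived gradients: the data-fit, F-norm, label-manifold, and feature-manifold terms contribute $2X_m^{\top}(X_m W_m - Y_m)$, $2\lambda W_m$, $2\alpha X_m^{\top} X_m W_m L_B^m$, and $2\beta L_S^m W_m$ respectively, and, since A is symmetric, the inter-subset term contributes $4\gamma \sum_j a_{mj} (W_m - W_j)$ to the gradient of $W_m$:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def laplacian(M):
    return np.diag(M.sum(axis=1)) - M

def fit_local_correlation(X, Y, g=3, lam=1e-3, alpha=1e-3, beta=1e-3,
                          gamma=1e-3, lr=1e-3, iters=500, seed=0):
    """Gradient-descent sketch for objective (7); optimizer choices are ours."""
    km = KMeans(n_clusters=g, n_init=10, random_state=seed).fit(X)
    C = km.cluster_centers_
    A = cosine_similarity(C)                                      # formula (3)
    subs = [(X[km.labels_ == m], Y[km.labels_ == m]) for m in range(g)]
    LB = [laplacian(cosine_similarity(Ym.T)) for _, Ym in subs]   # L_B^m
    LS = [laplacian(cosine_similarity(Xm.T)) for Xm, _ in subs]   # L_S^m
    d, q = X.shape[1], Y.shape[1]
    W = [np.zeros((d, q)) for _ in range(g)]
    for _ in range(iters):
        grads = []
        for m, (Xm, Ym) in enumerate(subs):
            Gm = (2 * Xm.T @ (Xm @ W[m] - Ym)             # data-fit term
                  + 2 * lam * W[m]                        # F-norm term
                  + 2 * alpha * Xm.T @ Xm @ W[m] @ LB[m]  # label manifold
                  + 2 * beta * LS[m] @ W[m]               # feature manifold
                  + 4 * gamma * sum(A[m, j] * (W[m] - W[j])
                                    for j in range(g)))   # subset similarity
            grads.append(Gm)
        W = [Wm - lr * Gm for Wm, Gm in zip(W, grads)]
    return C, W
```

In practice, each $W_m$ can be warm-started from the ridge solution of formula (4) and the step size chosen by line search; the fixed values here are for illustration only.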
(2) Label prediction:
S7, label prediction. Through the learning of steps S1-S6, the g final classification models are obtained. Given a test sample t, the distances between t and the g cluster centers obtained in step S2 are solved, the classification models corresponding to the r nearest cluster subsets are selected, the test sample t is fed into these r classification models, and their outputs are fused into the final result of the test sample categories. Specifically:
Given the feature representation $x_t \in \mathbb{R}^{1 \times d}$ of the test sample t, the Euclidean distances $f_t \in \mathbb{R}^{1 \times g}$ between $x_t$ and the g cluster centers C are calculated by formula (8):

$$f_t(i) = \sqrt{\sum_{j=1}^{d} \left( x_t(j) - c_i(j) \right)^2}, \quad 1 \le i \le g \qquad (8)$$

where $c_i(j)$ denotes the j-th element of the cluster center vector $c_i$ of the i-th data subset $X_i$, $x_t(j)$ denotes the j-th element of the vector $x_t$, and $f_t(i)$, the i-th element of the vector $f_t$, is the Euclidean distance between $x_t$ and $c_i$ ($1 \le i \le g$).
According to the calculated distances $f_t$, the first r data subsets $\{X_1, \dots, X_r\}$ closest to the test sample are selected, the model coefficients $\{W_1, \dots, W_r\}$ obtained in the training phase are used to test the test data t respectively, and the results are averaged to obtain the predicted value $z_t$:

$$z_t = \frac{1}{r} \sum_{i=1}^{r} x_t W_i \qquad (9)$$

In formula (9), r is a weight parameter whose value range is {1, 2, 3}, and $W_1, \dots, W_r$ are the model coefficients corresponding to the first r data subsets $X_1, \dots, X_r$.
According to the obtained predicted value of the test sample t and the set threshold τ, the final output label vector $y_t \in \{0, 1\}^{1 \times q}$ of the test sample over the q categories is calculated, where $[\cdot]$ is the indicator function. According to the result of formula (10), 1 is returned when the condition is met, indicating that the test sample belongs to the i-th class; otherwise 0 is returned, indicating that it does not:

$$y_t(i) = \left[ (z_t)_i > \tau \right], \quad 1 \le i \le q \qquad (10)$$
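The prediction stage of formulas (8)-(10) then reduces to a few lines; the sketch below is our own code, continuing the names from the training sketches above, with r and τ standing for the patent's weight parameter and threshold:

```python
import numpy as np

def predict(x_t, C, W, r=2, tau=0.5):
    """Formulas (8)-(10): average the r nearest cluster models, then threshold."""
    f_t = np.linalg.norm(C - x_t, axis=1)                  # (8) distances to centers
    nearest = np.argsort(f_t)[:r]                          # r closest cluster subsets
    z_t = np.mean([x_t @ W[i] for i in nearest], axis=0)   # (9) averaged prediction
    return (z_t > tau).astype(int)                         # (10) indicator threshold
```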
In this embodiment, the local correlations of labels and features and the classification are fused in a unified framework: the feature matrix is clustered into several data subsets, and feature and label manifold regularization is used to learn different feature and label correlations for each data subset, making the learned correlations more accurate while removing redundancy among the features. In constructing the model, a classification model coefficient matrix is learned for each data subset, and similar subsets are constrained to have similar model coefficients, so that the relations between the data subsets are taken into account, the multi-label learning task is better performed, and the accuracy of class label classification is improved.
The present invention and its embodiments have been described above schematically and without limitation, and what is shown in the drawings is only one of the embodiments of the present invention; the actual structure is not limited thereto. Therefore, similar structures and embodiments that a person skilled in the art, having received this teaching, designs without inventive effort and without departing from the spirit of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A multi-label learning method based on local correlation of labels and features, characterized by comprising the following steps:
S1, extracting features from the training data to obtain the feature representation matrix X of the data, and class-labeling the training data to establish the known class label matrix Y of the data;
S2, clustering the feature representation matrix X into g data subsets $\{X_1, \dots, X_g\}$ with the k-means method, dividing the known class label matrix Y into the g corresponding label subsets $\{Y_1, \dots, Y_g\}$ according to the data subsets of the feature representation matrix X, and obtaining the cluster center matrix C at the same time;
S3, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each of the g data subsets, calculating the Laplacian matrices $L_B^m$ and $L_S^m$ of the corresponding matrices, and calculating the similarity between the data subsets from the cluster center matrix C obtained in S2;
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the class labels $Y_m$ as the classifier;
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn: the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar;
S6, constraining the model coefficients obtained for similar subsets to be similar through a regularization term, i.e. adding a regularization constraint that similar groups have similar model coefficients, to obtain the final target model to be solved;
S7, obtaining g final classification models through the learning of steps S1-S6; given a test sample t, computing the distances between t and the g cluster centers obtained in step S2, selecting the classification models corresponding to the r nearest cluster subsets, feeding t into these r classification models, and fusing their outputs into the final result of the test sample categories.
2. The multi-label learning method based on local correlation of labels and features as claimed in claim 1, wherein: in step S1, the data feature representation matrix X is a real matrix, $X \in \mathbb{R}^{n \times d}$, where n denotes the number of samples, d denotes the number of features, and $\mathbb{R}$ denotes the real number field; the known class label matrix $Y \in \{0, 1\}^{n \times q}$, where q denotes the number of known class labels, $y_{ij}$ denotes the element in row i and column j of the matrix Y, $y_{ij} = 1$ means that the i-th sample belongs to the j-th class label and otherwise $y_{ij} = 0$, i is a positive integer between 1 and n, and j is a positive integer between 1 and q.
3. The multi-label learning method based on local correlation of labels and features as claimed in claim 2, wherein: in step S2, g denotes the number of clustered data subsets; the feature representation matrix X of the training data is divided by k-means clustering into g data subsets $\{X_1, \dots, X_g\}$, each group's feature matrix being $X_m \in \mathbb{R}^{n_m \times d}$, where $n_m$ denotes the number of samples in the m-th data subset $X_m$; Y is divided according to the resulting data subsets into g label subsets $\{Y_1, \dots, Y_g\}$ with $Y_m \in \{0, 1\}^{n_m \times q}$, where $Y_m$ is the label subset of Y corresponding to $X_m$ and $1 \le m \le g$; clustering the training data X yields the cluster center matrix $C \in \mathbb{R}^{g \times d}$, whose i-th row $c_i \in \mathbb{R}^{1 \times d}$ is the cluster center vector of the i-th data subset $X_i$.
4. The multi-label learning method based on local correlation of labels and features as claimed in claim 3, wherein: in step S3, let $B^m \in \mathbb{R}^{q \times q}$ be the label correlation matrix corresponding to the m-th data subset, with element $b_{ij}^m$ in row i and column j; the label correlation matrix is obtained by computing cosine similarity via formula (1):

$$b_{ij}^m = \frac{\sum_{h=1}^{n_m} y_{hi}^m\, y_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(y_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(y_{hj}^m\right)^2}} \qquad (1)$$

in formula (1), $y_{hi}^m$ denotes the element in row h and column i of $Y_m$, and $y_{hj}^m$ denotes the element in row h and column j of $Y_m$;
the Laplacian matrix corresponding to the label correlation matrix $B^m$ is:

$$L_B^m = \operatorname{diag}\!\left(\operatorname{sum}(B^m)\right) - B^m$$

where $\operatorname{sum}(B^m)$ sums each row of $B^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(B^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(B^m)$, whose diagonal elements correspond to $\operatorname{sum}(B^m)$ and whose other elements are all 0;
let $S^m \in \mathbb{R}^{d \times d}$ be the feature correlation matrix corresponding to the m-th data subset, with element $s_{ij}^m$ in row i and column j; the feature correlation matrix is obtained by computing cosine similarity via formula (2):

$$s_{ij}^m = \frac{\sum_{h=1}^{n_m} x_{hi}^m\, x_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(x_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(x_{hj}^m\right)^2}} \qquad (2)$$

in formula (2), $x_{hi}^m$ denotes the element in row h and column i of $X_m$, and $x_{hj}^m$ denotes the element in row h and column j of $X_m$;
the Laplacian matrix corresponding to the feature correlation matrix $S^m$ is:

$$L_S^m = \operatorname{diag}\!\left(\operatorname{sum}(S^m)\right) - S^m$$

where $\operatorname{sum}(S^m)$ sums each row of $S^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(S^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(S^m)$, whose diagonal elements correspond to $\operatorname{sum}(S^m)$ and whose other elements are all 0;
let $A \in \mathbb{R}^{g \times g}$ be the similarity matrix among the g data subsets, where any element $a_{ij}$ in row i and column j denotes the similarity between the i-th and j-th data subsets; the similarity matrix between the data subsets is obtained by computing cosine similarity via formula (3):

$$a_{ij} = \frac{\sum_{h=1}^{d} c_{ih}\, c_{jh}}{\sqrt{\sum_{h=1}^{d} c_{ih}^2}\ \sqrt{\sum_{h=1}^{d} c_{jh}^2}} \qquad (3)$$

in formula (3), $c_{ih}$ denotes the element in row i and column h of C, and $c_{jh}$ denotes the element in row j and column h of C.
5. The multi-label learning method based on local correlation of labels and features as claimed in claim 4, wherein: in step S4, a multiple linear regression model is used as the classifier to build the linear classification model $W_m$; based on each subset's feature representation $X_m$, a linear classification model $f(X_m, W_m) = X_m W_m$ mapping to the class label matrix $Y_m$ is learned for each of the g data subsets, and an F-norm regularization constraint is imposed on the model parameters $W_m \in \mathbb{R}^{d \times q}$ to control the complexity of the model, obtaining the updated minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 \right) \qquad (4)$$

in formula (4), the matrices $W_m$ are the model parameters to be solved, and λ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{0}, 10^{1}\}$.
6. The multi-label learning method based on local correlation of labels and features as claimed in claim 5, wherein: in step S5, the local label correlation is modeled, and the label correlation matrix $B^m$ is calculated by formula (1), each element of which denotes the degree to which any two labels in the label subset are correlated; the more correlated any two labels $y_{\cdot i}^m$ and $y_{\cdot j}^m$ of the label subset $Y_m$ are, the more similar the outputs $X_m w_{\cdot i}^m$ and $X_m w_{\cdot j}^m$ of the linear models corresponding to the two labels; here the local label correlation is controlled on the output and constrained by label manifold regularization, obtaining the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) \right) \qquad (5)$$

in formula (5), the matrix $F_m \in \mathbb{R}^{n_m \times q}$ with $F_m = X_m W_m$, α is a non-negative weight coefficient, and $\operatorname{tr}(\cdot)$ denotes the matrix trace; $w_{\cdot i}^m$ and $w_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $W_m$, and $y_{\cdot i}^m$ and $y_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $Y_m$.
7. The multi-label learning method based on local correlation of labels and features as claimed in claim 6, wherein: in step S5, the local feature correlation is modeled, and the feature correlation matrix $S^m$ is calculated by formula (2), each element of which denotes the degree to which any two features in the data subset $X_m$ are similar; the more similar any two features $x_{\cdot k}^m$ and $x_{\cdot l}^m$ of $X_m$ are, the more similar the model coefficients $w_{k \cdot}^m$ and $w_{l \cdot}^m$ corresponding to the two features; here the local feature correlation is controlled on the model coefficients and constrained by feature manifold regularization, obtaining the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) \right) \qquad (6)$$

in formula (6), β is a non-negative weight coefficient and $\operatorname{tr}(\cdot)$ denotes the matrix trace; $x_{\cdot k}^m$ and $x_{\cdot l}^m$ are column vectors denoting the k-th and l-th columns of $X_m$, and $w_{k \cdot}^m$ and $w_{l \cdot}^m$ are row vectors denoting the k-th and l-th rows of $W_m$.
8. The multi-label learning method based on local correlation of labels and features as claimed in claim 7, wherein: in step S6, the similarity between the data subsets is modeled, and a regularization constraint is added using the similarity of the model coefficients corresponding to similar data subsets, i.e. if $X_i$ and $X_j$ are similar then $W_i$ and $W_j$ are also similar, where $X_i$ and $X_j$ denote the i-th and j-th data subsets and $W_i$ and $W_j$ denote the model coefficient matrices corresponding to them, obtaining the final objective to be solved:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) + \gamma \sum_{j=1}^{g} a_{mj} \left\| W_m - W_j \right\|_F^2 \right) \qquad (7)$$

in formula (7), γ is a non-negative weight coefficient, and $a_{mj}$, the element in row m and column j of A, denotes the similarity between the m-th data subset $X_m$ and the j-th data subset $X_j$, calculated by formula (3).
9. A multi-label learning method based on local correlation of labels and features according to any of claims 1-8, wherein: in step S7, given the feature representation $x_t \in \mathbb{R}^{1 \times d}$ of a test sample t, the Euclidean distances $f_t \in \mathbb{R}^{1 \times g}$ between $x_t$ and the g cluster centers C are calculated by formula (8):

$$f_t(i) = \sqrt{\sum_{j=1}^{d} \left( x_t(j) - c_i(j) \right)^2}, \quad 1 \le i \le g \qquad (8)$$

where $c_i(j)$ denotes the j-th element of the cluster center vector $c_i$ of the i-th data subset $X_i$, $x_t(j)$ denotes the j-th element of the vector $x_t$, and $f_t(i)$, the i-th element of the vector $f_t$, is the Euclidean distance between $x_t$ and $c_i$ ($1 \le i \le g$);
according to the calculated distances $f_t$, the first r data subsets $\{X_1, \dots, X_r\}$ closest to the test sample are selected, the model coefficients $\{W_1, \dots, W_r\}$ obtained in the training phase are used to test the test data t respectively, and the results are averaged to obtain the predicted value $z_t$:

$$z_t = \frac{1}{r} \sum_{i=1}^{r} x_t W_i \qquad (9)$$

in formula (9), r is a non-negative weight coefficient, and $W_1, \dots, W_r$ are the model coefficients corresponding to the first r data subsets $X_1, \dots, X_r$.
10. The multi-label learning method based on local correlation of labels and features as claimed in claim 9, wherein: in step S7, according to the obtained predicted value of the test sample t and the set threshold τ, the final output label vector $y_t \in \{0, 1\}^{1 \times q}$ of the test sample over the q categories is calculated:

$$y_t(i) = \left[ (z_t)_i > \tau \right], \quad 1 \le i \le q \qquad (10)$$

when the condition is met, 1 is returned, indicating that the test sample belongs to the i-th class; otherwise 0 is returned, indicating that it does not.
CN202110734886.2A 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features Pending CN113420821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110734886.2A CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110734886.2A CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Publications (1)

Publication Number Publication Date
CN113420821A true CN113420821A (en) 2021-09-21

Family

ID=77717903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110734886.2A Pending CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Country Status (1)

Country Link
CN (1) CN113420821A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454154A (en) * 2023-12-22 2024-01-26 江西农业大学 Robust feature selection method for bias marker data


Similar Documents

Publication Publication Date Title
CN110689081B (en) Weak supervision target classification and positioning method based on bifurcation learning
CN111460249B (en) Personalized learning resource recommendation method based on learner preference modeling
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
Wang et al. Discriminative feature and dictionary learning with part-aware model for vehicle re-identification
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN113190699A (en) Remote sensing image retrieval method and device based on category-level semantic hash
CN113360701B (en) Sketch processing method and system based on knowledge distillation
CN113821670B (en) Image retrieval method, device, equipment and computer readable storage medium
CN113887661B (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN110647904A (en) Cross-modal retrieval method and system based on unmarked data migration
Park et al. Fast and scalable approximate spectral matching for higher order graph matching
CN112132186A (en) Multi-label classification method with partial deletion and unknown class labels
SG171858A1 (en) A method for updating a 2 dimensional linear discriminant analysis (2dlda) classifier engine
CN113065409A (en) Unsupervised pedestrian re-identification method based on camera distribution difference alignment constraint
CN115827954A (en) Dynamically weighted cross-modal fusion network retrieval method, system and electronic equipment
CN113764034B (en) Method, device, equipment and medium for predicting potential BGC in genome sequence
CN109857892B (en) Semi-supervised cross-modal Hash retrieval method based on class label transfer
CN114579794A (en) Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion
CN114373093A (en) Fine-grained image classification method based on direct-push type semi-supervised deep learning
CN113420821A (en) Multi-label learning method based on local correlation of labels and features
CN112183580B (en) Small sample classification method based on dynamic knowledge path learning
CN116343915B (en) Construction method of biological sequence integrated classifier and biological sequence prediction classification method
CN107885854A (en) A kind of semi-supervised cross-media retrieval method of feature based selection and virtual data generation
CN115861902B (en) Unsupervised action migration and discovery method, system, device and medium
CN116383441A (en) Community detection method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210921)