CN113420821A - Multi-label learning method based on local correlation of labels and features - Google Patents

Multi-label learning method based on local correlation of labels and features

Info

Publication number
CN113420821A
CN113420821A (application CN202110734886.2A)
Authority
CN
China
Prior art keywords
matrix
correlation
data
local
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110734886.2A
Other languages
Chinese (zh)
Inventor
程倩倩
黄�俊
张辉宜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Technology AHUT
Original Assignee
Anhui University of Technology AHUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Technology (AHUT)
Priority to CN202110734886.2A
Publication of CN113420821A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/245 Classification techniques relating to the decision surface
    • G06F 18/2451 Classification techniques relating to the decision surface linear, e.g. hyperplane

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-label learning method based on the local correlation of labels and features, belonging to the technical field of machine learning. While considering and exploiting local label correlation for multi-label classification, the method additionally incorporates local feature correlation. The training data are decomposed into several data subsets by clustering; within each data subset, the local label correlation and the local feature correlation are obtained through label manifold regularization and feature manifold regularization; a model coefficient matrix is learned separately for each group of instances; and a new regularization term is added to account for the relations among all clustered data subsets. Feature correlation helps remove redundant features, and since label correlation and feature correlation are not globally shared, assigning different label and feature correlations to different instances through local label correlation and local feature correlation improves the classification performance of the multi-label classification model and better performs the multi-label learning task.

Description

Multi-label learning method based on local correlation of labels and features
Technical Field
The invention relates to the technical field of machine learning, in particular to a multi-label learning method based on local correlation of labels and features.
Background
In traditional supervised learning, an instance usually corresponds to a single category label, but in real life an object often carries multiple semantics. For example, a news report may cover several subjects at once and can be assigned, from different aspects, to categories such as politics, economy, and sports; a single film may belong to the "action", "fantasy", and "adventure" genres at the same time; and a published academic paper generally lists several keywords to improve its retrieval efficiency in a retrieval system. Multi-label learning is used to solve such classification problems, in which each object is annotated with multiple labels. With growing attention from scholars, multi-label learning has developed rapidly and has been successfully applied in research fields such as information retrieval, image classification, clinical data analysis, and biomedical classification.
Multi-label learning is an important research topic in machine learning and data mining; its goal is to train a classification model and use the trained model to predict all relevant labels for unseen instances. Many algorithms address the multi-label learning problem, and existing methods can be grouped, according to how they exploit label correlations, into first-order, second-order, and high-order strategies. First-order strategies assume that the labels are mutually independent, train each label separately during multi-label classification, and ignore the interactions between labels, as in the BR algorithm. Second-order strategies consider correlations between pairs of labels, as in CLR and LPLC; they exploit the relationship between two labels, but one label may depend on several labels simultaneously, which can make classification inaccurate. High-order strategies consider the interactions within random subsets of labels or the association between each label and all remaining labels, as in CC and LLSF-DL; although high-order strategies mine strong label correlations, they lead to more complex computation.
In a big-data environment, the semantics of the data are more complex and the number of labels in a data set is larger, which makes the labeling process difficult and poses great challenges for multi-label learning. In multi-label learning, certain correlations exist among labels and among features: learning label correlations helps improve the generalization performance of the model, and learning feature correlations helps remove redundant features. In practical applications, however, a dependency among labels or features may be shared by only a subset of the instances rather than all of them, so imposing global label and feature correlations adds unnecessary constraints on instances that do not contain those dependencies and thereby harms the performance of the model.
Current multi-label learning methods do not consider combining the local correlations of labels and features and therefore cannot learn accurate correlations for given data. Scholars have proposed many multi-label learning methods that consider feature and label correlations. For example, the multi-label learning method with global and local label correlations (GLOCAL) integrates global and local label correlations through label manifold regularization and handles both the full-label and missing-label cases; its advantage is that, to avoid the influence of missing labels on the label correlations, it learns the Laplacian matrix directly instead of specifying any correlation metric or label correlation matrix, and it controls the label correlations on the output. However, that method considers only label correlations and cannot improve model performance well. The dual-graph robust multi-label feature selection method (DRMFS) uses feature-graph regularization and label-graph regularization to obtain label and feature correlations, but it considers only the global correlations of labels and features, not the combination of their local correlations. The missing-label feature selection method based on label compression and local feature correlation (FSLCLC) obtains the local correlation of features through manifold regularization, but when applying feature manifold regularization it solves the same coefficient matrix for all subsets; considering that the feature and label correlations in each subset may differ, the corresponding coefficient matrices should also differ, and learning a single shared coefficient matrix degrades model performance.
The invention aims to explore the local correlations of features and labels while learning different model coefficients for different data subsets, which assigns more accurate correlations to different subsets and further improves the classification performance of the model.
A search found Chinese patent application No. ZL201911306128.X, filed on December 18, 2019 and entitled "Latent category discovery and classification method in multi-label classification". That application integrates known-label classification with latent-label discovery and classification in a unified framework: a non-negative matrix factorization technique decomposes the feature matrix into an approximate solution of the complete category label matrix and a coefficient matrix, the known part of the approximate solution is constrained to be consistent with the true values, and at the same time a classification model from sample features to complete labels is built to discover latent label types. Through latent label discovery, valuable implicit information in the data is mined; using the correlations between known and latent labels, strongly correlated classes are constrained to have similar classification model coefficients, yielding approximate classification predictions. Known-label classification and latent-label classification guide and promote each other, finally improving the classification performance on both and better performing the multi-label learning task.
Disclosure of Invention
1. Technical problem to be solved by the invention
The invention provides a multi-label learning method based on the local correlation of labels and features, which assigns different correlations to different data subsets, removes redundant features, and solves a model coefficient matrix for each data subset, so that multi-label classification becomes more accurate.
2. Technical scheme
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
The invention discloses a multi-label learning method based on the local correlation of labels and features, which comprises the following steps:
S1, extracting features from the training data to obtain the feature representation matrix X of the data, and class-labeling the training data to establish the known class label matrix Y of the data;
S2, clustering the feature representation matrix X into g data subsets $\{X_1, \dots, X_g\}$ with the k-means method, dividing the known class label matrix Y into the g corresponding label subsets $\{Y_1, \dots, Y_g\}$ according to the data subsets of the feature representation matrix X, and obtaining the cluster center matrix C at the same time;
S3, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each of the g data subsets, calculating the Laplacian matrices $L_B^m$ and $L_S^m$ of the corresponding matrices, and calculating the similarity between the data subsets from the cluster center matrix C obtained in S2;
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the class labels $Y_m$ as the classifier;
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn: the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar;
S6, constraining the model coefficients obtained for similar subsets to be similar through a regularization term, i.e. adding a regularization constraint that similar groups have similar model coefficients, to obtain the final target model to be solved;
S7, obtaining g final classification models through the learning of steps S1-S6; given a test sample t, computing the distances between t and the g cluster centers obtained in step S2, selecting the classification models corresponding to the r nearest cluster subsets, feeding t into these r classification models, and fusing their outputs into the final result of the test sample categories (a minimal training-side sketch in Python follows this list).
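To make the flow of S1-S7 concrete, the following minimal training-side sketch in Python is offered purely as an illustration; it is not the patent's reference implementation, and the function name fit_baseline, the toy data, and all parameter defaults are our own assumptions. It performs S2 with scikit-learn's KMeans and fits each subset's model with the plain ridge solution of objective (4), omitting the manifold and inter-subset terms of formulas (5)-(7); those terms, and the prediction step S7, are sketched in the embodiment below.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_baseline(X, Y, g=4, lam=1e-3, seed=0):
    """S2: cluster X and split Y; S4: one ridge-regularized linear model per subset.
    Sketch only: the manifold terms of formulas (5)-(7) are omitted here."""
    km = KMeans(n_clusters=g, n_init=10, random_state=seed).fit(X)
    C = km.cluster_centers_                     # g x d cluster center matrix
    d = X.shape[1]
    W = []
    for m in range(g):
        Xm, Ym = X[km.labels_ == m], Y[km.labels_ == m]
        # closed-form minimizer of ||Xm Wm - Ym||_F^2 + lam ||Wm||_F^2
        W.append(np.linalg.solve(Xm.T @ Xm + lam * np.eye(d), Xm.T @ Ym))
    return C, W

# toy usage (synthetic data, for shape-checking only): 200 samples, 10 features, 5 labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = (rng.random((200, 5)) > 0.7).astype(float)
C, W = fit_baseline(X, Y)
print(C.shape, W[0].shape)   # (4, 10) (10, 5)
```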
3. Advantageous effects
Compared with the prior art, the technical scheme provided by the invention has the following remarkable effects:
(1) Most existing multi-label learning methods consider only the correlations of labels when performing multi-label classification, and rarely the correlations between features. In practical applications, feature correlations and label correlations may not hold globally but only over a subset of the data, so using global label and feature correlations can harm the accuracy of class label classification. The method fuses the local correlations of labels and features and the classification into a unified framework: the feature matrix is clustered into several data subsets, and feature and label manifold regularization is used to learn each data subset's own feature and label correlations, making the learned correlations more accurate.
(2) In the multi-label learning method based on the local correlation of labels and features, learning with label correlations helps improve the generalization performance of the model, while feature correlations help remove redundancy among the features and obtain a more compact feature space, further improving the accuracy of class label classification.
(3) The multi-label learning method based on the local correlation of labels and features combines the local correlations of features and labels, learns a classification model coefficient matrix for each data subset, and constrains similar data subsets to have similar model coefficients, thereby taking the relations between data subsets into account and better performing the multi-label learning task.
Drawings
FIG. 1 is a diagram of the model framework of the method of the present invention, incorporating the local correlations of labels and features.
Detailed Description
For a further understanding of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples.
Example 1
With reference to FIG. 1, the multi-label learning method based on the local correlation between labels and features in this embodiment comprises two stages, model construction and training, and label prediction. The specific steps are as follows:
(1) model construction and training:
S1, performing feature extraction and class labeling on the training data to obtain the feature representation matrix X of the data and establish the known class label matrix Y. Specifically:
Assume that the training data features are represented as a real matrix $X \in \mathbb{R}^{n \times d}$, where n denotes the number of samples, d denotes the number of features, and $\mathbb{R}$ denotes the real number field. $Y \in \{0, 1\}^{n \times q}$ is the class label matrix of the known classes of the training data, and q denotes the number of known class labels; $y_{ij}$ denotes the element in row i and column j of the matrix Y, $y_{ij} = 1$ means that the i-th sample belongs to the j-th class label, and otherwise $y_{ij} = 0$; i is a positive integer between 1 and n, and j is a positive integer between 1 and q.
S2, according to the feature representation matrix X of the training data, dividing the training data into g data subsets $\{X_1, \dots, X_g\}$ by k-means clustering and obtaining the g corresponding label subsets $\{Y_1, \dots, Y_g\}$. Specifically:
The feature representation matrix X of the training data is divided by k-means clustering into g data subsets $\{X_1, \dots, X_g\}$, each group's feature matrix being $X_m \in \mathbb{R}^{n_m \times d}$, where $n_m$ denotes the number of samples in the m-th subset $X_m$; Y is divided according to the resulting data subsets into g label subsets $\{Y_1, \dots, Y_g\}$ with $Y_m \in \{0, 1\}^{n_m \times q}$, where $Y_m$ is the label subset of Y corresponding to $X_m$ and g denotes the number of clustered data subsets. Clustering the training data X also yields the cluster center matrix $C \in \mathbb{R}^{g \times d}$, whose i-th row $c_i \in \mathbb{R}^{1 \times d}$ is the cluster center vector of the i-th data subset $X_i$.
S3, according to the g data subsets obtained in step S2, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each subset, calculating from them the corresponding Laplacian matrices $L_B^m$ and $L_S^m$, and calculating the similarity among the g data subsets from the cluster centers obtained in step S2 (the similarity describes how far apart the clustered data subsets lie in the space). Specifically:
Let $B^m \in \mathbb{R}^{q \times q}$ be the label correlation matrix corresponding to the m-th data subset, with $b_{ij}^m$ the element in any row i and column j. The label correlation matrix is obtained by computing cosine similarity via formula (1), where $y_{hi}^m$ denotes the element in row h and column i of $Y_m$ and $y_{hj}^m$ denotes the element in row h and column j of $Y_m$:

$$b_{ij}^m = \frac{\sum_{h=1}^{n_m} y_{hi}^m\, y_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(y_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(y_{hj}^m\right)^2}} \qquad (1)$$

$L_B^m \in \mathbb{R}^{q \times q}$ is the Laplacian matrix of the label correlation matrix $B^m$:

$$L_B^m = \operatorname{diag}\!\left(\operatorname{sum}(B^m)\right) - B^m$$

where $\operatorname{sum}(B^m)$ sums each row of $B^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(B^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(B^m)$, whose diagonal elements correspond to $\operatorname{sum}(B^m)$ and whose other elements are all 0.
Let $S^m \in \mathbb{R}^{d \times d}$ be the feature correlation matrix corresponding to the m-th data subset, with $s_{ij}^m$ the element in any row i and column j. The feature correlation matrix is obtained by computing cosine similarity via formula (2), where $x_{hi}^m$ denotes the element in row h and column i of $X_m$ and $x_{hj}^m$ denotes the element in row h and column j of $X_m$:

$$s_{ij}^m = \frac{\sum_{h=1}^{n_m} x_{hi}^m\, x_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(x_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(x_{hj}^m\right)^2}} \qquad (2)$$

$L_S^m \in \mathbb{R}^{d \times d}$ is the Laplacian matrix of the feature correlation matrix $S^m$:

$$L_S^m = \operatorname{diag}\!\left(\operatorname{sum}(S^m)\right) - S^m$$

where $\operatorname{sum}(S^m)$ sums each row of $S^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(S^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(S^m)$, whose diagonal elements correspond to $\operatorname{sum}(S^m)$ and whose other elements are all 0.
Let $A \in \mathbb{R}^{g \times g}$ be the similarity matrix among the g data subsets, with $a_{ij}$ the element in row i and column j. The similarity matrix between the data subsets is obtained by computing cosine similarity via formula (3), where $c_{ih}$ denotes the element in row i and column h of C and $c_{jh}$ denotes the element in row j and column h of C:

$$a_{ij} = \frac{\sum_{h=1}^{d} c_{ih}\, c_{jh}}{\sqrt{\sum_{h=1}^{d} c_{ih}^2}\ \sqrt{\sum_{h=1}^{d} c_{jh}^2}} \qquad (3)$$
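As an illustration of formulas (1)-(3), the sketch below (our own code; the helper names are assumptions) builds $B^m$, $S^m$, their Laplacians, and the subset-similarity matrix A with scikit-learn's cosine_similarity, which computes exactly the cosine measure above when applied to the columns of $Y_m$ and $X_m$ (i.e. the rows of their transposes) and to the rows of C. An all-zero column (a label absent from a subset) simply yields zero similarity under scikit-learn's convention.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def laplacian(M):
    """L = diag(sum(M)) - M, as in the definitions of L_B^m and L_S^m."""
    return np.diag(M.sum(axis=1)) - M

def subset_correlations(Xm, Ym):
    Bm = cosine_similarity(Ym.T)   # formula (1): q x q label correlations
    Sm = cosine_similarity(Xm.T)   # formula (2): d x d feature correlations
    return Bm, laplacian(Bm), Sm, laplacian(Sm)

def subset_similarity(C):
    return cosine_similarity(C)    # formula (3): g x g, rows of C are centers

# toy usage
rng = np.random.default_rng(1)
Xm = rng.normal(size=(50, 8))
Ym = (rng.random((50, 4)) > 0.6).astype(float)
Bm, LBm, Sm, LSm = subset_correlations(Xm, Ym)
print(LBm.shape, LSm.shape)   # (4, 4) (8, 8)
```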
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the label subset $Y_m$ as the classifier. Specifically:
A multiple linear regression model is used as the classifier, building the linear classification model $W_m$. Based on each subset's feature representation $X_m$, a linear classification model $f(X_m, W_m) = X_m W_m$ mapping to the class label matrix $Y_m$ is learned for each of the g data subsets, and an F-norm regularization constraint is imposed on the model parameters $W_m \in \mathbb{R}^{d \times q}$ to control the complexity of the model, yielding the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 \right) \qquad (4)$$

In formula (4), the matrices $W_m$ are the model parameters to be solved, and λ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{0}, 10^{1}\}$. The F-norm here measures the error between the classifier output $X_m W_m$ and $Y_m$; since the data are divided into g subsets, the objective takes the form of a minimized sum.
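Because objective (4) decouples across the g subsets and is a standard ridge problem, each $W_m$ has the closed form $W_m = (X_m^{\top} X_m + \lambda I)^{-1} X_m^{\top} Y_m$; this is a known property of such objectives, not stated explicitly in the patent. The short check below (our own sketch) verifies numerically that the gradient $2X_m^{\top}(X_m W_m - Y_m) + 2\lambda W_m$ vanishes at this solution:

```python
import numpy as np

rng = np.random.default_rng(2)
Xm = rng.normal(size=(60, 8))
Ym = (rng.random((60, 4)) > 0.6).astype(float)
lam = 1e-2

# closed-form minimizer of formula (4) for one subset
Wm = np.linalg.solve(Xm.T @ Xm + lam * np.eye(8), Xm.T @ Ym)
# gradient of formula (4) at Wm should be (numerically) zero
grad = 2 * Xm.T @ (Xm @ Wm - Ym) + 2 * lam * Wm
print(np.allclose(grad, 0, atol=1e-6))   # True
```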
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn; the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar:
(1) First, the local label correlation is modeled: in the m-th data subset, assume that the more correlated any two labels $y_{\cdot i}^m$ and $y_{\cdot j}^m$ of the label subset $Y_m$ are, the more similar the outputs $X_m w_{\cdot i}^m$ and $X_m w_{\cdot j}^m$ of the linear models corresponding to these two labels ($1 \le i, j \le q$), and the less correlated, the less similar; the degree of correlation is determined by the label correlation matrix $B^m$ calculated by formula (1). Here $w_{\cdot i}^m$ and $w_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $W_m$, and $y_{\cdot i}^m$ and $y_{\cdot j}^m$ are likewise column vectors denoting the i-th and j-th columns of $Y_m$. The local label correlation is controlled on the output and constrained by manifold regularization. Specifically:
The correlation between labels is modeled, and manifold regularization is used to constrain the outputs $X_m W_m$ of the models corresponding to correlated labels, giving the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) \right) \qquad (5)$$

In formula (5), the matrix $F_m \in \mathbb{R}^{n_m \times q}$ with $F_m = X_m W_m$; $W_m$ are the model parameters to be solved; α is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}\}$; $\operatorname{tr}(\cdot)$ denotes the matrix trace; and $L_B^m \in \mathbb{R}^{q \times q}$ is the Laplacian matrix of the class label correlation matrix $B^m$.
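The trace form in formula (5) rests on the standard graph-Laplacian identity $\operatorname{tr}(F L_B F^{\top}) = \tfrac{1}{2} \sum_{i,j} b_{ij} \| f_{\cdot i} - f_{\cdot j} \|^2$, where $f_{\cdot i}$ is the i-th column of $F = X_m W_m$; this is what makes the outputs of correlated labels similar. A quick numerical confirmation (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.normal(size=(30, 5))               # stands in for F_m = X_m W_m
B = rng.random((5, 5)); B = (B + B.T) / 2  # symmetric label correlation matrix
L = np.diag(B.sum(axis=1)) - B             # Laplacian, as defined above

lhs = np.trace(F @ L @ F.T)
rhs = 0.5 * sum(B[i, j] * np.linalg.norm(F[:, i] - F[:, j])**2
                for i in range(5) for j in range(5))
print(np.isclose(lhs, rhs))                # True
```

The same identity, applied to the rows of $W_m$ with $L_S^m$, underlies the feature-manifold term $\operatorname{tr}(W_m^{\top} L_S^m W_m)$ of formula (6) below.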
(2) The local feature correlation is modeled: in the m-th data subset, assume that the more similar any two features $x_{\cdot k}^m$ and $x_{\cdot l}^m$ of the data subset $X_m$ are, the more similar the model coefficients $w_{k \cdot}^m$ and $w_{l \cdot}^m$ corresponding to these two features ($1 \le k, l \le d$), and the less similar, the less so; the degree of similarity is determined by the feature correlation matrix $S^m$ calculated by formula (2). Here $x_{\cdot k}^m$ and $x_{\cdot l}^m$ are column vectors denoting the k-th and l-th columns of $X_m$, and $w_{k \cdot}^m$ and $w_{l \cdot}^m$ are row vectors denoting the k-th and l-th rows of $W_m$. The local feature correlation is controlled on the model coefficients and constrained by manifold regularization. Specifically:
The correlation between features is modeled, and manifold regularization is used to constrain the model coefficients $W_m$ corresponding to correlated features, giving the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) \right) \qquad (6)$$

In formula (6), $W_m$ are the model coefficients to be solved; β is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}\}$; $\operatorname{tr}(\cdot)$ denotes the matrix trace; and $L_S^m \in \mathbb{R}^{d \times d}$ is the Laplacian matrix of the feature correlation matrix $S^m$.
S6, considering the influence on the model coefficients of the similarity among the g data subsets $\{X_1, \dots, X_g\}$ obtained in step S2, a regularization term constrains similar data subsets to yield similar model coefficients: if $X_i$ and $X_j$ are similar, then $W_i$ and $W_j$ are also similar, and otherwise $W_i$ and $W_j$ are not similar, where $X_i$ and $X_j$ denote the i-th and j-th data subsets and $W_i$ and $W_j$ denote the model coefficient matrices corresponding to them. Specifically:
The similarity among the data subsets is modeled, and a regularization constraint that similar groups have similar model coefficients is added, giving the final objective to be solved:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) + \gamma \sum_{j=1}^{g} a_{mj} \left\| W_m - W_j \right\|_F^2 \right) \qquad (7)$$

In formula (7), $W_m$ and $W_j$ both denote model coefficients: $W_m$ is the coefficient matrix corresponding to the m-th subset $X_m$ and $W_j$ that of the j-th subset $X_j$, with m and j positive integers between 1 and g; γ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}\}$; and $a_{mj}$, the element in row m and column j of the matrix $A \in \mathbb{R}^{g \times g}$, denotes the similarity between the data subsets $X_m$ and $X_j$.
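Objective (7) is smooth in $W_1, \dots, W_g$, so plain gradient descent suffices for illustration. The following sketch is our own (the patent does not prescribe an optimizer, and the step size, iteration count, and all names are assumptions), with hand-derived gradients: the data-fit, F-norm, label-manifold, and feature-manifold terms contribute $2X_m^{\top}(X_m W_m - Y_m)$, $2\lambda W_m$, $2\alpha X_m^{\top} X_m W_m L_B^m$, and $2\beta L_S^m W_m$ respectively, and, since A is symmetric, the inter-subset term contributes $4\gamma \sum_j a_{mj} (W_m - W_j)$ to the gradient of $W_m$:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def laplacian(M):
    return np.diag(M.sum(axis=1)) - M

def fit_local_correlation(X, Y, g=3, lam=1e-3, alpha=1e-3, beta=1e-3,
                          gamma=1e-3, lr=1e-3, iters=500, seed=0):
    """Gradient-descent sketch for objective (7); optimizer choices are ours."""
    km = KMeans(n_clusters=g, n_init=10, random_state=seed).fit(X)
    C = km.cluster_centers_
    A = cosine_similarity(C)                                      # formula (3)
    subs = [(X[km.labels_ == m], Y[km.labels_ == m]) for m in range(g)]
    LB = [laplacian(cosine_similarity(Ym.T)) for _, Ym in subs]   # L_B^m
    LS = [laplacian(cosine_similarity(Xm.T)) for Xm, _ in subs]   # L_S^m
    d, q = X.shape[1], Y.shape[1]
    W = [np.zeros((d, q)) for _ in range(g)]
    for _ in range(iters):
        grads = []
        for m, (Xm, Ym) in enumerate(subs):
            Gm = (2 * Xm.T @ (Xm @ W[m] - Ym)             # data-fit term
                  + 2 * lam * W[m]                        # F-norm term
                  + 2 * alpha * Xm.T @ Xm @ W[m] @ LB[m]  # label manifold
                  + 2 * beta * LS[m] @ W[m]               # feature manifold
                  + 4 * gamma * sum(A[m, j] * (W[m] - W[j])
                                    for j in range(g)))   # subset similarity
            grads.append(Gm)
        W = [Wm - lr * Gm for Wm, Gm in zip(W, grads)]
    return C, W
```

In practice, each $W_m$ can be warm-started from the ridge solution of formula (4) and the step size chosen by line search; the fixed values here are for illustration only.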
(2) Label prediction:
S7, label prediction. Through the learning of steps S1-S6, the g final classification models are obtained. Given a test sample t, the distances between t and the g cluster centers obtained in step S2 are solved, the classification models corresponding to the r nearest cluster subsets are selected, the test sample t is fed into these r classification models, and their outputs are fused into the final result of the test sample categories. Specifically:
Given the feature representation $x_t \in \mathbb{R}^{1 \times d}$ of the test sample t, the Euclidean distances $f_t \in \mathbb{R}^{1 \times g}$ between $x_t$ and the g cluster centers C are calculated by formula (8):

$$f_t(i) = \sqrt{\sum_{j=1}^{d} \left( x_t(j) - c_i(j) \right)^2}, \quad 1 \le i \le g \qquad (8)$$

where $c_i(j)$ denotes the j-th element of the cluster center vector $c_i$ of the i-th data subset $X_i$, $x_t(j)$ denotes the j-th element of the vector $x_t$, and $f_t(i)$, the i-th element of the vector $f_t$, is the Euclidean distance between $x_t$ and $c_i$ ($1 \le i \le g$).
According to the calculated distances $f_t$, the first r data subsets $\{X_1, \dots, X_r\}$ closest to the test sample are selected, the model coefficients $\{W_1, \dots, W_r\}$ obtained in the training phase are used to test the test data t respectively, and the results are averaged to obtain the predicted value $z_t$:

$$z_t = \frac{1}{r} \sum_{i=1}^{r} x_t W_i \qquad (9)$$

In formula (9), r is a weight parameter whose value range is {1, 2, 3}, and $W_1, \dots, W_r$ are the model coefficients corresponding to the first r data subsets $X_1, \dots, X_r$.
According to the obtained predicted value of the test sample t and the set threshold τ, the final output label vector $y_t \in \{0, 1\}^{1 \times q}$ of the test sample over the q categories is calculated, where $[\cdot]$ is the indicator function. According to the result of formula (10), 1 is returned when the condition is met, indicating that the test sample belongs to the i-th class; otherwise 0 is returned, indicating that it does not:

$$y_t(i) = \left[ (z_t)_i > \tau \right], \quad 1 \le i \le q \qquad (10)$$
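The prediction stage of formulas (8)-(10) then reduces to a few lines; the sketch below is our own code, continuing the names from the training sketches above, with r and τ standing for the patent's weight parameter and threshold:

```python
import numpy as np

def predict(x_t, C, W, r=2, tau=0.5):
    """Formulas (8)-(10): average the r nearest cluster models, then threshold."""
    f_t = np.linalg.norm(C - x_t, axis=1)                  # (8) distances to centers
    nearest = np.argsort(f_t)[:r]                          # r closest cluster subsets
    z_t = np.mean([x_t @ W[i] for i in nearest], axis=0)   # (9) averaged prediction
    return (z_t > tau).astype(int)                         # (10) indicator threshold
```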
In this embodiment, the local correlations of labels and features and the classification are fused in a unified framework: the feature matrix is clustered into several data subsets, and feature and label manifold regularization is used to learn different feature and label correlations for each data subset, making the learned correlations more accurate while removing redundancy among the features. In constructing the model, a classification model coefficient matrix is learned for each data subset, and similar subsets are constrained to have similar model coefficients, so that the relations between the data subsets are taken into account, the multi-label learning task is better performed, and the accuracy of class label classification is improved.
The present invention and its embodiments have been described above schematically and without limitation, and what is shown in the drawings is only one of the embodiments of the present invention; the actual structure is not limited thereto. Therefore, similar structures and embodiments that a person skilled in the art, having received this teaching, designs without inventive effort and without departing from the spirit of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A multi-label learning method based on local correlation of labels and features, characterized by comprising the following steps:
S1, extracting features from the training data to obtain the feature representation matrix X of the data, and class-labeling the training data to establish the known class label matrix Y of the data;
S2, clustering the feature representation matrix X into g data subsets $\{X_1, \dots, X_g\}$ with the k-means method, dividing the known class label matrix Y into the g corresponding label subsets $\{Y_1, \dots, Y_g\}$ according to the data subsets of the feature representation matrix X, and obtaining the cluster center matrix C at the same time;
S3, calculating the label correlation matrix $B^m$ and the feature correlation matrix $S^m$ corresponding to each of the g data subsets, calculating the Laplacian matrices $L_B^m$ and $L_S^m$ of the corresponding matrices, and calculating the similarity between the data subsets from the cluster center matrix C obtained in S2;
S4, constructing the linear model $W_m$ that maps the data subset $X_m$ to the class labels $Y_m$ as the classifier;
S5, modeling the local label correlation and the local feature correlation in the m-th data subset in turn: the local label correlation is controlled on the output, using label manifold regularization to constrain the model outputs corresponding to correlated labels in the label subset $Y_m$ to be similar; the local feature correlation is controlled on the model coefficients, using feature manifold regularization to constrain the model coefficients corresponding to similar features in the data subset $X_m$ to be similar;
S6, constraining the model coefficients obtained for similar subsets to be similar through a regularization term, i.e. adding a regularization constraint that similar groups have similar model coefficients, to obtain the final target model to be solved;
S7, obtaining g final classification models through the learning of steps S1-S6; given a test sample t, computing the distances between t and the g cluster centers obtained in step S2, selecting the classification models corresponding to the r nearest cluster subsets, feeding t into these r classification models, and fusing their outputs into the final result of the test sample categories.
2. The multi-label learning method based on local correlation of labels and features as claimed in claim 1, wherein: in step S1, the data feature representation matrix X is a real matrix, $X \in \mathbb{R}^{n \times d}$, where n denotes the number of samples, d denotes the number of features, and $\mathbb{R}$ denotes the real number field; the known class label matrix $Y \in \{0, 1\}^{n \times q}$, where q denotes the number of known class labels, $y_{ij}$ denotes the element in row i and column j of the matrix Y, $y_{ij} = 1$ means that the i-th sample belongs to the j-th class label and otherwise $y_{ij} = 0$, i is a positive integer between 1 and n, and j is a positive integer between 1 and q.
3. The multi-label learning method based on local correlation of labels and features as claimed in claim 2, wherein: in step S2, g denotes the number of clustered data subsets; the feature representation matrix X of the training data is divided by k-means clustering into g data subsets $\{X_1, \dots, X_g\}$, each group's feature matrix being $X_m \in \mathbb{R}^{n_m \times d}$, where $n_m$ denotes the number of samples in the m-th data subset $X_m$; Y is divided according to the resulting data subsets into g label subsets $\{Y_1, \dots, Y_g\}$ with $Y_m \in \{0, 1\}^{n_m \times q}$, where $Y_m$ is the label subset of Y corresponding to $X_m$ and $1 \le m \le g$; clustering the training data X yields the cluster center matrix $C \in \mathbb{R}^{g \times d}$, whose i-th row $c_i \in \mathbb{R}^{1 \times d}$ is the cluster center vector of the i-th data subset $X_i$.
4. The multi-label learning method based on local correlation of labels and features as claimed in claim 3, wherein: in step S3, let $B^m \in \mathbb{R}^{q \times q}$ be the label correlation matrix corresponding to the m-th data subset, with element $b_{ij}^m$ in row i and column j; the label correlation matrix is obtained by computing cosine similarity via formula (1):

$$b_{ij}^m = \frac{\sum_{h=1}^{n_m} y_{hi}^m\, y_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(y_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(y_{hj}^m\right)^2}} \qquad (1)$$

in formula (1), $y_{hi}^m$ denotes the element in row h and column i of $Y_m$, and $y_{hj}^m$ denotes the element in row h and column j of $Y_m$;
the Laplacian matrix corresponding to the label correlation matrix $B^m$ is:

$$L_B^m = \operatorname{diag}\!\left(\operatorname{sum}(B^m)\right) - B^m$$

where $\operatorname{sum}(B^m)$ sums each row of $B^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(B^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(B^m)$, whose diagonal elements correspond to $\operatorname{sum}(B^m)$ and whose other elements are all 0;
let $S^m \in \mathbb{R}^{d \times d}$ be the feature correlation matrix corresponding to the m-th data subset, with element $s_{ij}^m$ in row i and column j; the feature correlation matrix is obtained by computing cosine similarity via formula (2):

$$s_{ij}^m = \frac{\sum_{h=1}^{n_m} x_{hi}^m\, x_{hj}^m}{\sqrt{\sum_{h=1}^{n_m} \left(x_{hi}^m\right)^2}\ \sqrt{\sum_{h=1}^{n_m} \left(x_{hj}^m\right)^2}} \qquad (2)$$

in formula (2), $x_{hi}^m$ denotes the element in row h and column i of $X_m$, and $x_{hj}^m$ denotes the element in row h and column j of $X_m$;
the Laplacian matrix corresponding to the feature correlation matrix $S^m$ is:

$$L_S^m = \operatorname{diag}\!\left(\operatorname{sum}(S^m)\right) - S^m$$

where $\operatorname{sum}(S^m)$ sums each row of $S^m$ and returns a column vector, and $\operatorname{diag}(\operatorname{sum}(S^m))$ returns a diagonal matrix with the same number of rows as $\operatorname{sum}(S^m)$, whose diagonal elements correspond to $\operatorname{sum}(S^m)$ and whose other elements are all 0;
let $A \in \mathbb{R}^{g \times g}$ be the similarity matrix among the g data subsets, where any element $a_{ij}$ in row i and column j denotes the similarity between the i-th and j-th data subsets; the similarity matrix between the data subsets is obtained by computing cosine similarity via formula (3):

$$a_{ij} = \frac{\sum_{h=1}^{d} c_{ih}\, c_{jh}}{\sqrt{\sum_{h=1}^{d} c_{ih}^2}\ \sqrt{\sum_{h=1}^{d} c_{jh}^2}} \qquad (3)$$

in formula (3), $c_{ih}$ denotes the element in row i and column h of C, and $c_{jh}$ denotes the element in row j and column h of C.
5. The multi-label learning method based on local correlation of labels and features as claimed in claim 4, wherein: in step S4, a multiple linear regression model is used as the classifier to build the linear classification model $W_m$; based on each subset's feature representation $X_m$, a linear classification model $f(X_m, W_m) = X_m W_m$ mapping to the class label matrix $Y_m$ is learned for each of the g data subsets, and an F-norm regularization constraint is imposed on the model parameters $W_m \in \mathbb{R}^{d \times q}$ to control the complexity of the model, obtaining the updated minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 \right) \qquad (4)$$

in formula (4), the matrices $W_m$ are the model parameters to be solved, and λ is a non-negative weight coefficient whose value range is $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{0}, 10^{1}\}$.
6. The multi-label learning method based on local correlation of labels and features as claimed in claim 5, wherein: in step S5, the local label correlation is modeled, and the label correlation matrix $B^m$ is calculated by formula (1), each element of which denotes the degree to which any two labels in the label subset are correlated; the more correlated any two labels $y_{\cdot i}^m$ and $y_{\cdot j}^m$ of the label subset $Y_m$ are, the more similar the outputs $X_m w_{\cdot i}^m$ and $X_m w_{\cdot j}^m$ of the linear models corresponding to the two labels; here the local label correlation is controlled on the output and constrained by label manifold regularization, obtaining the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) \right) \qquad (5)$$

in formula (5), the matrix $F_m \in \mathbb{R}^{n_m \times q}$ with $F_m = X_m W_m$, α is a non-negative weight coefficient, and $\operatorname{tr}(\cdot)$ denotes the matrix trace; $w_{\cdot i}^m$ and $w_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $W_m$, and $y_{\cdot i}^m$ and $y_{\cdot j}^m$ are column vectors denoting the i-th and j-th columns of $Y_m$.
7. The multi-label learning method based on local correlation of labels and features as claimed in claim 6, wherein: in step S5, the local feature correlation is modeled, and the feature correlation matrix $S^m$ is calculated by formula (2), each element of which denotes the degree to which any two features in the data subset $X_m$ are similar; the more similar any two features $x_{\cdot k}^m$ and $x_{\cdot l}^m$ of $X_m$ are, the more similar the model coefficients $w_{k \cdot}^m$ and $w_{l \cdot}^m$ corresponding to the two features; here the local feature correlation is controlled on the model coefficients and constrained by feature manifold regularization, obtaining the minimization objective:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) \right) \qquad (6)$$

in formula (6), β is a non-negative weight coefficient and $\operatorname{tr}(\cdot)$ denotes the matrix trace; $x_{\cdot k}^m$ and $x_{\cdot l}^m$ are column vectors denoting the k-th and l-th columns of $X_m$, and $w_{k \cdot}^m$ and $w_{l \cdot}^m$ are row vectors denoting the k-th and l-th rows of $W_m$.
8. The multi-label learning method based on local correlation of labels and features as claimed in claim 7, wherein: in step S6, the similarity between the data subsets is modeled, and a regularization constraint is added using the similarity of the model coefficients corresponding to similar data subsets, i.e. if $X_i$ and $X_j$ are similar then $W_i$ and $W_j$ are also similar, where $X_i$ and $X_j$ denote the i-th and j-th data subsets and $W_i$ and $W_j$ denote the model coefficient matrices corresponding to them, obtaining the final objective to be solved:

$$\min_{W_1, \dots, W_g} \sum_{m=1}^{g} \left( \left\| X_m W_m - Y_m \right\|_F^2 + \lambda \left\| W_m \right\|_F^2 + \alpha \operatorname{tr}\!\left( F_m L_B^m F_m^{\top} \right) + \beta \operatorname{tr}\!\left( W_m^{\top} L_S^m W_m \right) + \gamma \sum_{j=1}^{g} a_{mj} \left\| W_m - W_j \right\|_F^2 \right) \qquad (7)$$

in formula (7), γ is a non-negative weight coefficient, and $a_{mj}$, the element in row m and column j of A, denotes the similarity between the m-th data subset $X_m$ and the j-th data subset $X_j$, calculated by formula (3).
9. A multi-label learning method based on local correlation of labels and features according to any of claims 1-8, wherein: in step S7, given the feature representation $x_t \in \mathbb{R}^{1 \times d}$ of a test sample t, the Euclidean distances $f_t \in \mathbb{R}^{1 \times g}$ between $x_t$ and the g cluster centers C are calculated by formula (8):

$$f_t(i) = \sqrt{\sum_{j=1}^{d} \left( x_t(j) - c_i(j) \right)^2}, \quad 1 \le i \le g \qquad (8)$$

where $c_i(j)$ denotes the j-th element of the cluster center vector $c_i$ of the i-th data subset $X_i$, $x_t(j)$ denotes the j-th element of the vector $x_t$, and $f_t(i)$, the i-th element of the vector $f_t$, is the Euclidean distance between $x_t$ and $c_i$ ($1 \le i \le g$);
according to the calculated distances $f_t$, the first r data subsets $\{X_1, \dots, X_r\}$ closest to the test sample are selected, the model coefficients $\{W_1, \dots, W_r\}$ obtained in the training phase are used to test the test data t respectively, and the results are averaged to obtain the predicted value $z_t$:

$$z_t = \frac{1}{r} \sum_{i=1}^{r} x_t W_i \qquad (9)$$

in formula (9), r is a non-negative weight coefficient, and $W_1, \dots, W_r$ are the model coefficients corresponding to the first r data subsets $X_1, \dots, X_r$.
10. The multi-label learning method based on local correlation of labels and features as claimed in claim 9, wherein: in step S7, according to the obtained predicted value of the test sample t and the set threshold τ, the final output label vector $y_t \in \{0, 1\}^{1 \times q}$ of the test sample over the q categories is calculated:

$$y_t(i) = \left[ (z_t)_i > \tau \right], \quad 1 \le i \le q \qquad (10)$$

when the condition is met, 1 is returned, indicating that the test sample belongs to the i-th class; otherwise 0 is returned, indicating that it does not.
CN202110734886.2A 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features Pending CN113420821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110734886.2A CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110734886.2A CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Publications (1)

Publication Number Publication Date
CN113420821A true CN113420821A (en) 2021-09-21

Family

ID=77717903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110734886.2A Pending CN113420821A (en) 2021-06-30 2021-06-30 Multi-label learning method based on local correlation of labels and features

Country Status (1)

Country Link
CN (1) CN113420821A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454154A (en) * 2023-12-22 2024-01-26 江西农业大学 Robust feature selection method for bias marker data


Similar Documents

Publication Publication Date Title
CN110689081B (en) Weak supervision target classification and positioning method based on bifurcation learning
CN111460249B (en) Personalized learning resource recommendation method based on learner preference modeling
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
Wang et al. Discriminative feature and dictionary learning with part-aware model for vehicle re-identification
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN113190699A (en) Remote sensing image retrieval method and device based on category-level semantic hash
CN113360701B (en) Sketch processing method and system based on knowledge distillation
CN113821670B (en) Image retrieval method, device, equipment and computer readable storage medium
CN113887661B (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN110647904A (en) Cross-modal retrieval method and system based on unmarked data migration
Park et al. Fast and scalable approximate spectral matching for higher order graph matching
CN112132186A (en) Multi-label classification method with partial deletion and unknown class labels
SG171858A1 (en) A method for updating a 2 dimensional linear discriminant analysis (2dlda) classifier engine
CN113065409A (en) Unsupervised pedestrian re-identification method based on camera distribution difference alignment constraint
CN115827954A (en) Dynamically weighted cross-modal fusion network retrieval method, system and electronic equipment
CN113764034B (en) Method, device, equipment and medium for predicting potential BGC in genome sequence
CN109857892B (en) Semi-supervised cross-modal Hash retrieval method based on class label transfer
CN114579794A (en) Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion
CN114373093A (en) Fine-grained image classification method based on direct-push type semi-supervised deep learning
CN113420821A (en) Multi-label learning method based on local correlation of labels and features
CN112183580B (en) Small sample classification method based on dynamic knowledge path learning
CN116343915B (en) Construction method of biological sequence integrated classifier and biological sequence prediction classification method
CN107885854A (en) A kind of semi-supervised cross-media retrieval method of feature based selection and virtual data generation
CN115861902B (en) Unsupervised action migration and discovery method, system, device and medium
CN116383441A (en) Community detection method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210921)