CN116777006A - Sample missing label enhancement-based multi-label learning method, device and equipment - Google Patents

Sample missing label enhancement-based multi-label learning method, device and equipment

Info

Publication number
CN116777006A
Authority
CN
China
Prior art keywords
label
tag
missing
sample
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310813822.0A
Other languages
Chinese (zh)
Inventor
房小兆
吕炜俊
曾峙翔
胡曦
刘源源
周郭许
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202310813822.0A
Publication of CN116777006A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a multi-label learning method, device and equipment based on sample missing label enhancement. The method comprises: obtaining a training data set of missing-label samples; preprocessing the training data set to obtain a processed training set in which the real labels are recovered; performing learning and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning; taking the classifier as a label prediction model; and inputting a sample to be predicted into the label prediction model to obtain the label corresponding to that sample. By obtaining a processed training set with recovered real labels, the method enhances the label information. By inducing a classifier from the processed training set with an algorithm adaptation strategy, it obtains a classifier that takes the class imbalance in the processed training set into account. Because the label prediction model is built on this classifier, the multi-label class imbalance problem is addressed and the precision and accuracy of the predicted labels are improved.

Description

Sample missing label enhancement-based multi-label learning method, device and equipment
Technical Field
The application relates to the technical field of labels, in particular to a multi-label learning method, device and equipment based on sample missing label enhancement.
Background
Humans produce gigabytes of data every day, which creates a growing need for new approaches to the challenges of multi-label learning on big data. Multi-label learning, also called multi-label classification, assigns multiple labels simultaneously to each instance (for example, each image), and it is critical in fields ranging from protein function classification and document classification to automatic image classification. However, collecting fully supervised data for multi-label learning, that is, data whose labels are entirely correct and complete, is usually difficult, expensive and time-consuming. How to model label dependencies under limited supervision and cope with incomplete supervision, so as to achieve efficient and accurate multi-label classification, is therefore a key bottleneck in real-world classification tasks.
Currently, there are two main families of methods for multi-label learning with missing labels: methods based on the low-rank assumption and methods based on graphs. The presence of label dependencies generally implies that the output space is low rank, and this assumption has been widely used to fill in missing entries in label completion tasks, because it serves two key goals of missing-label multi-label learning (MLML): extracting label correlations and completing missing labels. In one line of work, multi-label classification with missing labels is treated as a low-rank matrix completion problem with side information, namely the features: the label matrix is regarded as the output of a low-rank-constrained classifier matrix applied to the side-information matrix, and the problem is generalized to a flexible empirical risk minimization framework. The classifier is often decomposed into two low-rank matrices, and an alternating optimization method is used to handle large-scale problems efficiently. In this setting, however, the presence of tail labels may break the low-rank property with some probability. Tail labels are therefore treated as outliers: the output label matrix is decomposed into a low-rank label matrix and a sparse label matrix, and an optimization problem is then solved to obtain the desired classifier. The low-rank assumption can be exploited in various ways. Building on it, embedding methods, which project the labels into a low-dimensional space and then train the classifier, are also widely used. However, training sets for multi-label learning generally suffer from class imbalance, which tends to reduce both the training efficiency and the generalization performance of the classifier, and existing methods based on the low-rank assumption do not handle class imbalance well.
Existing multi-label learning for missing labels can also address the missing-label problem with a graph-based model. A weighted graph is denoted G = (V, E, W), where V is the set of vertices, E is the set of edges, and W is the weight matrix. Given such a graph, the most typical strategy is to add a manifold regularization term to the empirical risk minimization framework: a label-specific graph is constructed for each label from the feature-induced similarity graph, and a manifold regularization term is then added for each label distribution. Manifold regularization terms generally need to satisfy three assumptions: label consistency, sample-level smoothness, and label-level smoothness. Constraining these three conditions brings the result closer to the real label matrix. The graph information is mainly used to disambiguate incomplete supervision, and different techniques are involved in exploiting label correlations. Many graph-based methods consider only the sample-level smoothness principle, although some work also addresses label-level smoothness. In addition, deep learning models have been used to disambiguate missing labels: a fully connected graph whose vertices are the labels is built, and a graph neural network (GNN) is then trained to model label dependencies. The input of the GNN is the image feature vector extracted by a convolutional neural network, and the output is the predicted labels. A partial binary cross-entropy loss is used, whose normalization factor is adjusted according to the label proportion, and a curriculum learning strategy that learns from the model's own progress is adopted to fill in the missing entries.
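For reference, a common form of a graph-based manifold regularizer is sketched below. The symbols F (a matrix whose i-th row f_i is the prediction for the i-th vertex), Δ (the diagonal degree matrix) and L (the graph Laplacian) are introduced here only for illustration, and W is assumed to be symmetric; the regularizers used in the prior work summarized above may differ in detail.

```latex
\Omega(F) = \frac{1}{2}\sum_{i,j} W_{ij}\,\lVert f_i - f_j \rVert_2^2
          = \operatorname{tr}\!\left(F^{\top} L F\right),
\qquad L = \Delta - W, \quad \Delta_{ii} = \sum_{j} W_{ij}.
```

Minimizing this term pushes vertices joined by a heavy edge toward similar predictions, which is the smoothness idea described in the paragraph above.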
In summary, these two families of approaches have so far dominated multi-label learning with missing labels. Although some recent work attempts to improve performance with deep models, it mainly involves simple convolutional networks and autoencoders. Because machine learning data sets provide only logical labels, these methods train the classifier directly on the logical-form labels and the feature information, and then evaluate the model by classifying unknown samples. However, information in logical form usually captures only whether a label is associated with a sample; it carries no information about the correlations among labels that belong to the same instance, nor about the correlations among labels that do not belong to the instance.
Disclosure of Invention
The embodiments of the application provide a multi-label learning method, device and equipment based on sample missing label enhancement, which are used to solve the technical problem that the predicted labels output by existing multi-label learning with missing labels are inaccurate because of class imbalance.
In order to achieve the above object, the embodiment of the present application provides the following technical solutions:
In one aspect, a multi-label learning method based on sample missing label enhancement is provided, comprising the following steps:
acquiring a training data set of missing-label samples;
preprocessing the training data set to obtain a processed training set in which the real labels are recovered;
performing classification and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning;
taking the classifier as a label prediction model, and inputting a sample to be predicted into the label prediction model to obtain the label corresponding to the sample to be predicted.
Preferably, preprocessing the training data set to obtain the processed training set in which the real labels are recovered includes:
optimizing the training data set using tensor singular value decomposition with a low-rank constraint to obtain correlation data of the missing-label samples;
processing the training data set with a mapping model to obtain mapping data;
constructing an optimization model from the mapping data, the correlation data and the training data set;
iteratively updating the optimization model until convergence using a first-order gradient quasi-Newton method to obtain the optimal mapping parameters of the label distribution; and thresholding the mapping data according to the optimal mapping parameters to obtain the processed training set in which the real labels are recovered.
Preferably, optimizing the training data set using tensor singular value decomposition with a low-rank constraint to obtain the correlation data of the missing-label samples includes:
constructing an augmented Lagrangian equation from the feature matrix and the label matrix of the training data set and from the tensor nuclear norm based on tensor singular value decomposition;
iterating the updates of the augmented Lagrangian equation to obtain the converged correlation data;
the augmented Lagrangian equation is:
In the equation, $A_1$ and $A_2$, together with a third multiplier, are Lagrange multipliers; $X$ is the feature matrix of the training data set; $Y$ is the label matrix of the training data set; $E$ is the corruption of the missing-label samples; $\|\cdot\|_F$ is the Frobenius norm of a matrix; and $\lambda_2$, $\mu$ and $p$ are balance coefficients with different values. The equation further involves an auxiliary tensor variable, the correlation data, and third-order tensors constructed separately from the corresponding matrices.
Preferably, the optimization model is:
where $n$ is the total number of missing-label samples in the training data set; $y_i$ is the q-dimensional logical binary label vector of the i-th missing-label sample; $\theta$ is the weight matrix; $b$ is the bias vector; $Y$ is the label matrix of the training data set; $I$ is the identity matrix; the superscript $T$ denotes matrix transposition; $\Xi = [\xi(x_1), \ldots, \xi(x_n)]$, where $\xi(x_i)$ embeds the d-dimensional real-valued vector of the i-th missing-label sample into a high-dimensional space by means of a Gaussian kernel function; $\|\cdot\|_F$ is the Frobenius norm of a matrix; and $\lambda_1$ is a balance parameter. The model further involves the correlation data and the label distribution matrix.
Preferably, thresholding the mapping data according to the optimal mapping parameters to obtain the processed training set in which the real labels are recovered includes:
mapping the mapping data according to the optimal mapping parameters to obtain a mapped label distribution;
normalizing the mapped label distribution to obtain the label distribution and the label distribution matrix;
processing the labels of the label distribution with a thresholding formula to obtain the real labels;
constructing the processed training set from the n missing-label samples and their corresponding real labels;
the thresholding formula is:
thr ∈ [0, 1]
where $y_l$ is the label of the l-th class; $Y_j$ is the set of all labels in the j-th column of the label matrix; and $y_{l^*}$ is the label class with the highest label distribution value. The formula further involves the real label of the j-th column and the label distribution value of the l-th class in the j-th column.
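The thresholding formula itself is not reproduced in this text, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that normalization shifts the mapped scores to be non-negative and scales them to sum to 1, and that a label is recovered as relevant when its distribution value is at least thr times the highest value (so the top label is always kept). The function names `normalize` and `recover_labels` are hypothetical and do not come from the source.

```python
def normalize(scores):
    """Assumed normalization: shift to non-negative values, scale to sum to 1."""
    low = min(scores)
    shifted = [s - low for s in scores]
    total = sum(shifted)
    if total == 0:
        # All scores are equal: fall back to a uniform distribution.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in shifted]


def recover_labels(distribution, thr):
    """Assumed rule: a label is relevant if its value >= thr * (top value)."""
    if not 0.0 <= thr <= 1.0:
        raise ValueError("thr must lie in [0, 1]")
    top = max(distribution)
    return [1 if value >= thr * top else 0 for value in distribution]


# Hypothetical mapped scores for one sample with q = 3 class labels.
dist = normalize([0.1, 0.9, 0.7])
loose = recover_labels(dist, thr=0.5)   # keeps the two strongest labels
strict = recover_labels(dist, thr=0.9)  # keeps only the top label
```

A relative threshold is used here because normalized distribution values shrink as the number of labels grows; an absolute threshold would be an equally valid assumption.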
Preferably, performing classification and aggregation on the processed training set using an algorithm adaptation strategy to obtain the classifier for multi-label learning includes:
for each class of label, dividing the processed training set into positive missing-label samples and negative missing-label samples to obtain a classified data set;
selecting a real label of the l-th class and a real label of the k-th class from the classified data set, and cross-coupling the k-th class real label with the l-th class real label to obtain a coupled data set of the labels, where l and k index the q label classes and l ≠ k;
learning all the coupled data sets with K multi-class imbalance learners and then combining the results to obtain the classifier for multi-label learning.
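The cross-coupling step can be sketched as follows. This is one plausible reading rather than the claimed procedure: each label l is paired with K randomly chosen other labels k, and every pair yields a three-class target per sample. The tri-class encoding (0 for neither label, 1 for label k only, 2 for label l present) and the function names are assumptions made for illustration.

```python
import random


def coupled_dataset(X, Y, l, k):
    """Pair label l with label k and emit one tri-class target per sample.

    Assumed encoding: 0 -> neither label, 1 -> only label k,
    2 -> label l is present (with or without label k).
    """
    targets = []
    for row in Y:
        if row[l] == 1:
            targets.append(2)
        elif row[k] == 1:
            targets.append(1)
        else:
            targets.append(0)
    return X, targets


def build_couplings(X, Y, K, seed=0):
    """For every label l, couple it with up to K randomly chosen other labels."""
    rng = random.Random(seed)
    q = len(Y[0])
    couplings = {}
    for l in range(q):
        others = [k for k in range(q) if k != l]
        partners = rng.sample(others, min(K, len(others)))
        couplings[l] = [(k, coupled_dataset(X, Y, l, k)[1]) for k in partners]
    return couplings


# Hypothetical toy data: 3 samples, 2 features, q = 3 labels.
X_toy = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
Y_toy = [[1, 0, 1], [0, 1, 0], [0, 0, 0]]
pairs = build_couplings(X_toy, Y_toy, K=2)
```

Each coupled data set would then be handed to a multi-class imbalance learner; the choice of learner is left open here because the text does not name one.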
Preferably, the multi-label learning method based on sample missing label enhancement includes: acquiring a threshold constant; processing each class of real label with the real-valued mapping function of the classifier to obtain the prediction confidence corresponding to each class of real label; and determining the type of the missing-label sample corresponding to each class of real label according to whether the prediction confidence is greater than the threshold constant.
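A minimal Python sketch of this decision rule is given below. It assumes that the confidence for a label is obtained by summing the scores that the coupled learners assign to that label; the aggregation rule, the function names and the example numbers are illustrative assumptions, not details taken from the source.

```python
def aggregate_confidence(pair_scores):
    """Assumed aggregation: sum the scores from the learners coupled with a label."""
    return sum(pair_scores)


def predict_labels(confidences, threshold):
    """Mark a label as relevant when its confidence exceeds the threshold constant."""
    return [1 if c > threshold else 0 for c in confidences]


# Hypothetical scores: 3 labels, each scored by 2 coupled learners.
scores_per_label = [[0.6, 0.7], [0.1, 0.2], [0.4, 0.5]]
confidences = [aggregate_confidence(s) for s in scores_per_label]
predicted = predict_labels(confidences, threshold=0.8)
```

The comparison is strict (greater than), matching the wording "whether the prediction confidence is greater than the threshold constant".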
In another aspect, a multi-label learning device based on sample missing label enhancement is provided, comprising a data acquisition module, a preprocessing module, a learning aggregation module and a prediction output module;
the data acquisition module is configured to acquire a training data set of missing-label samples;
the preprocessing module is configured to preprocess the training data set to obtain a processed training set in which the real labels are recovered;
the learning aggregation module is configured to perform classification and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning;
the prediction output module is configured to take the classifier as a label prediction model and to input a sample to be predicted into the label prediction model to obtain the label corresponding to the sample to be predicted.
Preferably, the preprocessing module is further configured to: optimize the training data set using tensor singular value decomposition with a low-rank constraint to obtain correlation data of the missing-label samples; process the training data set with a mapping model to obtain mapping data; construct an optimization model from the mapping data, the correlation data and the training data set; iteratively update the optimization model until convergence using a first-order gradient quasi-Newton method to obtain the optimal mapping parameters of the label distribution; and threshold the mapping data according to the optimal mapping parameters to obtain the processed training set in which the real labels are recovered.
In yet another aspect, a terminal device is provided that includes a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the above multi-label learning method based on sample missing label enhancement according to the instructions in the program code.
As can be seen from the above technical solutions, the embodiments of the application have the following advantages. In the multi-label learning method, device and equipment based on sample missing label enhancement, the method comprises: obtaining a training data set of missing-label samples; preprocessing the training data set to obtain a processed training set in which the real labels are recovered; performing learning and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning; taking the classifier as a label prediction model; and inputting a sample to be predicted into the label prediction model to obtain the label corresponding to that sample. The method enhances the label information by obtaining a processed training set in which the real labels are recovered. It induces a classifier from the processed training set with an algorithm adaptation strategy, so that the classifier takes the class imbalance in the processed training set into account. Building the label prediction model on this classifier addresses the multi-label class imbalance problem, improves the precision and accuracy of the predicted labels, and solves the technical problem that the predicted labels output by existing multi-label learning with missing labels are inaccurate because of class imbalance.
Drawings
In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of the steps of the sample missing label enhancement-based multi-label learning method according to an embodiment of the application;
FIG. 2 is a schematic diagram of the label prediction model in the sample missing label enhancement-based multi-label learning method according to an embodiment of the application;
FIG. 3 is a block diagram of the sample missing label enhancement-based multi-label learning device according to an embodiment of the application.
Detailed Description
In order to make the objects, features and advantages of the present application more comprehensible, the technical solutions in the embodiments of the present application are described in detail below with reference to the accompanying drawings, and it is apparent that the embodiments described below are only some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the description of embodiments of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the embodiments of the present application, the meaning of "plurality" is two or more, unless explicitly defined otherwise.
In the embodiments of the present application, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured" and the like are to be construed broadly and include, for example, either permanently connected, removably connected, or integrally formed; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communicated with the inside of two elements or the interaction relationship of the two elements. The specific meaning of the above terms in the embodiments of the present application will be understood by those of ordinary skill in the art according to specific circumstances.
The embodiments of the application provide a multi-label learning method, device and equipment based on sample missing label enhancement. The approach works in two stages, preprocessing and induction of the prediction classifier, so that the problem to be solved is decomposed and handled step by step at a finer granularity. Because the obtained label distribution is used to recover the real labels, the approach offers better interpretability and reliability, and the algorithm adaptation strategy used to induce the prediction classifier takes class imbalance into account while exploiting label correlations. The approach addresses the technical problem that the predicted labels output by existing multi-label learning with missing labels are inaccurate because of class imbalance.
Embodiment one:
FIG. 1 is a flowchart of the steps of the sample missing label enhancement-based multi-label learning method according to an embodiment of the application, and FIG. 2 is a schematic diagram of the label prediction model in that method.
As shown in FIG. 1, an embodiment of the application provides a multi-label learning method based on sample missing label enhancement, including the following steps:
S1, acquiring a training data set of missing-label samples.
In step S1, a training data set consisting of pairs $(x, y)$ is acquired. Here $x$ is the real-valued feature vector obtained by extracting features from a missing-label sample (usually an image) with a neural network and vectorizing them; it can be understood as all the features of the image. The label vector $y$ records, for all existing class labels, whether each label is relevant to the image: labels that belong to the image are set to 1 and labels that do not are set to 0. Each missing-label sample (each image) corresponds to one label vector, so the label matrix reflects the multi-label classification of the missing-label samples. The missing-label sample $x_i$ of the i-th instance is represented by a d-dimensional real-valued feature vector, and the instance corresponds to a q-dimensional logical binary label vector $y_i$; that is, there are n missing-label samples in total and q class labels. Formally, $X = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$ denotes the feature matrix and $Y = [y_1, y_2, \ldots, y_n]^T \in \mathbb{R}^{n \times q}$ denotes the label matrix. $Y_{ij}$ is the entry in the i-th row and j-th column of the label matrix, with $Y_{ij} \in \{1, 0\}$. $Y_{ij} = 1$ means that the i-th instance is related to the j-th label; $Y_{ij} = 0$ means either that the i-th instance is unrelated to the j-th label or that the relation is uncertain, that is, the label is missing. The multi-label learning method based on sample missing label enhancement recovers a real label matrix from the missing-label matrix $Y$, learns a classifier from the recovered sample information, and predicts the label matrix of unknown instances.
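The data layout described above can be illustrated with a small Python sketch. The toy values, the helper name `mask_labels` and the random masking procedure are hypothetical; they only show how a 0 in the observed label matrix can stand for either "irrelevant" or "missing".

```python
import random


def mask_labels(full_labels, missing_rate, seed=0):
    """Return a copy of the label matrix in which some 1-entries are set to 0.

    A 0 in the result therefore means "irrelevant OR missing".
    """
    rng = random.Random(seed)
    observed = []
    for row in full_labels:
        observed.append([
            0 if (v == 1 and rng.random() < missing_rate) else v
            for v in row
        ])
    return observed


# Hypothetical toy data: n = 4 samples, d = 3 features, q = 3 class labels.
X = [[0.2, 0.7, 0.1],
     [0.9, 0.1, 0.4],
     [0.3, 0.3, 0.8],
     [0.5, 0.6, 0.2]]
Y_full = [[1, 0, 1],
          [0, 1, 0],
          [1, 1, 0],
          [0, 0, 1]]
Y = mask_labels(Y_full, missing_rate=0.5)

n, d, q = len(X), len(X[0]), len(Y[0])
```

Only positive entries are masked, so every 1 that survives in `Y` is trustworthy while every 0 is ambiguous, which is exactly the ambiguity the recovery step has to resolve.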
S2, preprocessing the training data set to obtain a processed training set in which the real labels are recovered.
It should be noted that step S2 processes the training data set obtained in step S1 to obtain a processed training set restored to the real label matrix, which provides the data for the classifier obtained subsequently. By obtaining the processed training set, the method brings the label distribution close to the real label information. After the label information is enhanced, the confidence of each label for each sample is available. Confidence, as a representation of the amount of information, can express what a single logical value cannot: it expresses how credible a label is for an instance and reflects the relative relevance of the label to the instance, and the real labels can be recovered from these confidences by threshold screening.
S3, performing learning and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning.
It should be noted that step S3 processes the data in the processed training set obtained in step S2 to obtain a classifier suited to multi-label learning. For the label prediction process, the method uses an algorithm adaptation strategy, as distinct from a general problem transformation strategy. The strategy induces multiple class-imbalance learners by randomly coupling each label with other labels, each learner corresponding to one random label pair, and the learners' predictions on a sample are aggregated to determine whether the corresponding label is relevant to the sample. The resulting classifier takes class imbalance into account while exploiting label correlations.
S4, taking the classifier as a label prediction model, and inputting a sample to be predicted into the label prediction model to obtain the label corresponding to the sample to be predicted.
In step S4, the classifier obtained in step S3, which balances the multiple label classes, is used to build the label prediction model. The method then uses the label prediction model to predict the labels of the sample to be predicted and obtains the predicted labels corresponding to that sample.
The application provides a multi-label learning method based on sample missing label enhancement, which comprises: obtaining a training data set of missing-label samples; preprocessing the training data set to obtain a processed training set in which the real labels are recovered; performing learning and aggregation on the processed training set using an algorithm adaptation strategy to obtain a classifier for multi-label learning; taking the classifier as a label prediction model; and inputting a sample to be predicted into the label prediction model to obtain the label corresponding to that sample. The method enhances the label information by obtaining a processed training set in which the real labels are recovered. It induces a classifier from the processed training set with an algorithm adaptation strategy, so that the classifier takes the class imbalance in the processed training set into account. Building the label prediction model on this classifier addresses the multi-label class imbalance problem, improves the precision and accuracy of the predicted labels, and solves the technical problem that the predicted labels output by existing multi-label learning with missing labels are inaccurate because of class imbalance.
In one embodiment of the application, preprocessing the training data set to obtain the processed training set in which the real labels are recovered includes:
optimizing the training data set using tensor singular value decomposition with a low-rank constraint to obtain correlation data of the missing-label samples;
processing the training data set with a mapping model to obtain mapping data;
constructing an optimization model from the mapping data, the correlation data and the training data set;
iteratively updating the optimization model until convergence using a first-order gradient quasi-Newton method to obtain the optimal mapping parameters of the label distribution; and thresholding the mapping data according to the optimal mapping parameters to obtain the processed training set in which the real labels are recovered;
the optimization model is as follows:
where $n$ is the total number of missing-label samples in the training data set; $y_i$ is the q-dimensional logical binary label vector of the i-th missing-label sample; $\theta$ is the weight matrix; $b$ is the bias vector; $Y$ is the label matrix of the training data set; $I$ is the identity matrix; the superscript $T$ denotes matrix transposition; $\Xi = [\xi(x_1), \ldots, \xi(x_n)]$, where $\xi(x_i)$ embeds the d-dimensional real-valued vector of the i-th missing-label sample into a high-dimensional space by means of a Gaussian kernel function; $\|\cdot\|_F$ is the Frobenius norm of a matrix; and $\lambda_1$ is a balance parameter. The model further involves the correlation data and the label distribution matrix.
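The Gaussian kernel mentioned above can be sketched in Python as follows. The feature map of a Gaussian kernel is implicit, so this sketch uses the kernel values between a sample and all training samples as an explicit finite-dimensional representation; that choice, the bandwidth parameter `sigma` and the function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """k(u, v) = exp(-||u - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_features(samples, sigma=1.0):
    """Row i holds the kernel values between sample i and every training sample."""
    return [[gaussian_kernel(xi, xj, sigma) for xj in samples] for xi in samples]


# Hypothetical toy samples with d = 2 features.
toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
Xi = kernel_features(toy, sigma=1.0)
```

Each row of `Xi` is an n-dimensional representation of one sample, so a weight matrix of shape q by n would map it to q label scores.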
The multi-label learning method based on sample missing label enhancement preprocesses the training data set using tensor singular value decomposition with a low-rank constraint, and then uses the existing logical label information and feature information of the samples to supervise the recovery of the label distribution, thereby obtaining the processed training set in which the real labels are recovered. The efficient first-order gradient quasi-Newton method, also known as the limited-memory BFGS (L-BFGS) method, is an existing method, as is the solution of the low-rank representation, and is therefore not described in detail here.
In this embodiment, optimizing the training data set using tensor singular value decomposition with a low-rank constraint to obtain the correlation data of the missing-label samples includes:
constructing an augmented Lagrangian equation from the feature matrix and the label matrix of the training data set and from the tensor nuclear norm based on tensor singular value decomposition;
iterating the updates of the augmented Lagrangian equation to obtain the converged correlation data;
the augmented Lagrangian equation is:
In the equation, $A_1$ and $A_2$, together with a third multiplier, are Lagrange multipliers; $X$ is the feature matrix of the training data set; $Y$ is the label matrix of the training data set; $E$ is the corruption of the missing-label samples; $\|\cdot\|_F$ is the Frobenius norm of a matrix; and $\lambda_2$, $\mu$ and $p$ are balance coefficients with different values. The equation further involves an auxiliary tensor variable, the correlation data, and third-order tensors constructed separately from the corresponding matrices.
It should be noted that the multi-label learning method based on sample missing label enhancement can be illustrated with a missing-label sample x_i. The training data set is first converted into mapping data by a mapping model. The mapping data may be expressed by a first formula, which is:
In the first formula, to obtain the optimal solution, the first formula may be optimized through an objective function:
where the first term is a loss function, the second term mines the underlying information of the sample correlations, and λ_1 is a balance parameter. In detail, the loss function measures the discrepancy between the logical labels and the label distribution: the information in the recovered label distribution should be close to the existing logical label information. For example, if the logical label of a missing-label sample is {0, 0, 1} and the recovered label distribution is {d_1, d_2, d_3}, it is reasonable to infer that d_1 and d_2 are both close to 0 and that d_3 is close to 1. Since the prior information of the actual label distribution is implicit, the loss function can be taken as a least-squares (LS) loss:
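As a minimal sketch of the least-squares idea described above, the following Python snippet scores two hypothetical recovered distributions against a three-class logical label vector. The function name and the numeric values are illustrative assumptions and are not taken from the application.

```python
def least_squares_loss(logical_labels, label_distribution):
    """Sum of squared differences between a 0/1 label vector and a recovered distribution."""
    if len(logical_labels) != len(label_distribution):
        raise ValueError("both vectors must have the same length")
    return sum((d - y) ** 2 for y, d in zip(logical_labels, label_distribution))


logical = [0, 0, 1]
close_distribution = [0.05, 0.10, 0.85]   # d_1, d_2 near 0 and d_3 near 1
far_distribution = [0.40, 0.35, 0.25]     # disagrees with the logical labels

loss_close = least_squares_loss(logical, close_distribution)
loss_far = least_squares_loss(logical, far_distribution)
```

The distribution that agrees with the logical labels receives the smaller loss, which is the behaviour the recovery step relies on.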
The sample-correlation term, in turn, is an important part of the label distribution recovery process. Specifically, the global sample correlation is expressed by a second formula, which is:
Since a missing-label sample can be represented by a linear combination of correlated samples, the global sample correlation in the feature space can be explored by minimizing a low-rank representation (LRR). The sample correlation is obtained by applying a low-rank representation on the feature space, that is, by seeking the LRR of the feature matrix X in order to mine the global structure of the feature space. Therefore, assuming that X = XC + E, a rank minimization problem is obtained, which is expressed by a third formula:
where E is the corruption of the missing-label samples and λ_2 is a trade-off factor that balances the low-rank term against the corruption term. Using the l_{2,1} norm to handle the corruption of the missing-label samples gives:
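The l_{2,1} norm used for the corruption term can be sketched as follows. This assumes the common convention in which each column of E corresponds to one sample, so the norm is the sum of the Euclidean norms of the columns; the application does not spell out the convention, and the function name is hypothetical.

```python
import math


def l21_norm(matrix):
    """Sum over columns of the Euclidean norm of each column (assumed column-per-sample layout)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return sum(
        math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_rows)))
        for j in range(n_cols)
    )


# Only the first sample (column) is corrupted, with column norm sqrt(3^2 + 4^2) = 5.
sparse_corruption = [[3.0, 0.0, 0.0],
                     [4.0, 0.0, 0.0]]
# Every entry is corrupted slightly; each column has norm sqrt(2).
dense_corruption = [[1.0, 1.0],
                    [1.0, 1.0]]

sparse_value = l21_norm(sparse_corruption)
dense_value = l21_norm(dense_corruption)
```

Because whole columns are penalized together, minimizing this norm tends to concentrate the corruption in a few samples rather than spreading it over all of them.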
Because the rank function in the third formula is non-convex, it is replaced by the nuclear norm to facilitate optimization, which converts the third formula into a fourth formula, as follows:
where ‖C‖_* is the nuclear norm of the matrix C, defined as the sum of the singular values of C.
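To make the nuclear-norm relaxation concrete, the following sketch computes the nuclear norm as the sum of singular values with NumPy and compares it with the rank of a small matrix. The example matrices are hypothetical and serve only to illustrate the definition.

```python
import numpy as np


def nuclear_norm(matrix):
    """Nuclear norm: the sum of the singular values of the matrix."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())


# A rank-1 matrix built as an outer product; its only nonzero singular value
# is the product of the two vector norms, sqrt(14) * sqrt(2) = sqrt(28).
low_rank = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 1.0])
full_rank = np.eye(3)

low_rank_value = nuclear_norm(low_rank)
full_rank_value = nuclear_norm(full_rank)
low_rank_rank = int(np.linalg.matrix_rank(low_rank))
```

Unlike the rank, the nuclear norm is a convex function of the matrix entries, which is why it is the usual surrogate in low-rank representation problems.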
In the embodiment of the application, when the augmented Lagrangian equation is constructed, the existing logical label information is used in addition to the underlying information of the sample features, which improves the performance and accuracy of the label enhancement process. The multi-label learning method based on sample missing label enhancement thereby fully mines the hidden information of the data samples through tensor singular value decomposition under a low-rank tensor constraint, and the fourth formula is converted into a fifth formula, as follows:
where 𝒞 and ℰ are third-order tensors constructed from {C^(i)}_{i=1,2} and {E^(i)}_{i=1,2}, respectively, and the tensor norm in the fifth formula is the tensor nuclear norm based on tensor singular value decomposition, which can be expressed as:
where the transformed tensor is obtained by applying the fast Fourier transform to the tensor along its third dimension, the sum runs over the p-th diagonal elements of the singular value tensor, f is the subscript of the tensor singular value decomposition, and the decomposition can be calculated by a sixth formula:
where U and V, together with the singular value tensor between them, are the factors of the tensor singular value decomposition.
By the unitary invariance of the matrix nuclear norm, the tensor nuclear norm can be rewritten as:
where the Fourier-domain tensor is expanded into block-diagonal form:
Since a block circulant matrix can be block-diagonalized directly by the Fourier transform, the tensor nuclear norm is expressed as:
Therefore, the multi-label learning method based on sample missing label enhancement uses tensor singular value decomposition under a low-rank tensor constraint to integrate the underlying information of the existing logical labels into the formation of the sample correlations, and constructs the augmented Lagrangian equation accordingly.
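The block-diagonalization argument above can be checked numerically. The sketch below, written with NumPy, builds the block circulant matrix of a small random third-order tensor and verifies that its nuclear norm equals the sum of the nuclear norms of the frontal slices after a Fourier transform along the third dimension. The function names and tensor sizes are assumptions for illustration, and some definitions of the tensor nuclear norm additionally divide by the size of the third dimension, which this sketch does not do.

```python
import numpy as np


def nuclear_norm(matrix):
    """Sum of singular values (works for real and complex matrices)."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())


def block_circulant(tensor):
    """Block circulant matrix whose (i, j) block is frontal slice (i - j) mod n3."""
    n3 = tensor.shape[2]
    blocks = [[tensor[:, :, (i - j) % n3] for j in range(n3)] for i in range(n3)]
    return np.block(blocks)


def fourier_slice_nuclear_norm(tensor):
    """Sum of nuclear norms of the frontal slices after an FFT along the third dimension."""
    transformed = np.fft.fft(tensor, axis=2)
    return sum(nuclear_norm(transformed[:, :, k]) for k in range(tensor.shape[2]))


rng = np.random.default_rng(0)
tensor = rng.standard_normal((4, 3, 5))   # hypothetical 4 x 3 x 5 tensor

circulant_value = nuclear_norm(block_circulant(tensor))
fourier_value = fourier_slice_nuclear_norm(tensor)
```

The two values agree up to floating-point error, so the large block circulant matrix never has to be formed in practice; only the small Fourier-domain slices need a singular value decomposition.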
In the embodiment of the application, processing the mapping data by thresholding according to the optimal mapping parameters to obtain the processing training set in which the real labels are recovered includes:
mapping the mapping data according to the optimal mapping parameters to obtain a mapped label distribution;
normalizing the mapped label distribution to obtain the label distribution and the label distribution matrix;
processing the label distribution with a thresholding formula to obtain the real labels;
constructing the processing training set from the n missing-label samples and their corresponding real labels;
The thresholding formula is:
thr∈[0,1]
where y_l is the label of the l-th class, the left-hand side of the formula is the real label of the j-th column, Y_j is the set of all labels in the j-th column of the label matrix, the distribution term is the label distribution value of the l-th class in the j-th column, and y_{l*} is the label class with the highest label distribution value.
It should be noted that the optimal mapping parameters of the mapping data are obtained by iteratively optimizing the optimization model. An unknown sample is then mapped with these parameters to obtain its corresponding label distribution D_i. Since a label distribution must satisfy the constraints, D_i is normalized by softmax normalization. The resulting label distribution and label distribution matrix satisfy the following constraints:
In this embodiment, the recovered label matrix is thresholded to obtain the labels whose distribution values exceed a given threshold thr ∈ [0,1]. To prevent the output from being an empty set, the thresholding formula also outputs the label corresponding to the maximum label distribution value. The processing training set pairs the i-th missing-label sample with the real labels of all classes recovered for that sample.
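The normalization and thresholding steps can be sketched in a few lines of Python. The snippet assumes that the mapped label distribution of one sample is a list of real-valued scores; the function names and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Normalize real-valued scores into a distribution that sums to 1."""
    shift = max(scores)                      # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def threshold_labels(distribution, thr):
    """Indices of labels above thr, always including the label with the highest value."""
    if not 0.0 <= thr <= 1.0:
        raise ValueError("thr must lie in [0, 1]")
    selected = {l for l, d in enumerate(distribution) if d > thr}
    best = max(range(len(distribution)), key=lambda l: distribution[l])
    selected.add(best)                       # guarantees a non-empty label set
    return sorted(selected)


mapped_scores = [2.0, 0.5, 1.8, -1.0]        # hypothetical mapped label distribution
distribution = softmax(mapped_scores)
labels_low_thr = threshold_labels(distribution, 0.3)
labels_high_thr = threshold_labels(distribution, 0.9)
```

With a threshold of 0.3 the two dominant labels are kept; with a threshold of 0.9 no value exceeds the threshold, and the label with the maximum distribution value is returned so that the output is not empty.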
In one embodiment of the present application, performing learning and aggregation with an algorithm adaptation strategy on the processing training set to obtain the classifier for multi-label learning includes:
dividing the processing training set, for each class of labels, into positive missing-label samples and negative missing-label samples to obtain a classified data set;
selecting a real label of the l-th class and a real label of the K-th class from the classified data set, and cross-coupling the real label of the K-th class with the real label of the l-th class to obtain a coupled data set of the labels, where l and K both belong to the q label classes and l ≠ K;
learning all the coupled data sets with K multi-class imbalance learners and then aggregating the results to obtain the classifier for multi-label learning.
It should be noted that the classified data set is:
where y_j is the j-th class of labels, and every missing-label sample is either a positive or a negative sample for the j-th class of labels. In this embodiment, the classified data set contains q classes of labels. One class of real labels in the classified data set is selected as the l-th class of real labels; after the data of the l-th class of real labels are removed from the classified data set, one class among all the remaining real labels is selected as the K-th class of real labels. The K-th class of real labels is then cross-coupled with the l-th class of real labels to obtain the coupled data set of the labels, namely:
where y_k is the K-th class of real labels. Before the coupled data set is obtained, the classified data set must be generalized, for a given label pair (y_j, y_k), into a multi-class training set, whose expression is:
In the embodiment of the application, the classifier for multi-label learning is built from K randomly coupled multi-class imbalance learners: applying a multi-class imbalance learner to the multi-class training set yields a multi-class classifier.
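One way to picture the cross-coupling of a label pair (y_j, y_k) is as a three-class encoding. The sketch below assumes the classes 0, 1 and 2, with class 2 reserved for samples that are positive for y_j regardless of y_k; this encoding is an assumption chosen to be consistent with the "+2" class in the confidence g_jk(+2|x) of the description, since the expression of the multi-class training set itself is not reproduced here. All names and data are hypothetical.

```python
def tri_class_target(y_j, y_k):
    """Assumed encoding: 2 if positive for y_j, else 1 if positive for y_k, else 0."""
    if y_j == 1:
        return 2
    return 1 if y_k == 1 else 0


def coupled_training_set(samples, label_matrix, j, k):
    """Pair every sample with its three-class target for the label pair (j, k)."""
    if j == k:
        raise ValueError("the two coupled labels must be different")
    return [
        (sample, tri_class_target(row[j], row[k]))
        for sample, row in zip(samples, label_matrix)
    ]


samples = ["x1", "x2", "x3", "x4"]           # placeholders for feature vectors
label_matrix = [[1, 0, 1],                   # recovered 0/1 labels, one row per sample
                [0, 1, 0],
                [0, 0, 0],
                [1, 1, 0]]

coupled = coupled_training_set(samples, label_matrix, j=0, k=1)
targets = [target for _, target in coupled]
```

Merging the negatives of y_j with the information from y_k spreads the majority class over two classes, which is one reason a coupling of this kind can ease class imbalance for the learner.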
In one embodiment of the application, the multi-label learning method based on sample missing label enhancement includes: acquiring a threshold constant, and processing each class of real labels with the real-valued mapping function of the classifier to obtain the prediction confidence corresponding to each class of real labels; and determining the type of the missing-label sample for each class of real labels according to whether the prediction confidence is greater than the threshold constant.
It should be noted that the real-valued mapping function is:
where g_jk(+2|x) is the prediction confidence that the missing-label sample x is a positive example of the j-th class label y_j (regardless of whether x is a positive or a negative example of y_k).
In this embodiment, for each class of labels, K classes of labels are randomly drawn and cross-coupled with it, and the real-valued mapping function f_j(·) then aggregates the prediction confidences of the K multi-class imbalance classifiers. To obtain the predicted logical labels, the threshold function is set to a constant function t_j(x) = a_j; that is, for a given class of label y_j, a sample whose prediction confidence is greater than the threshold constant a_j is regarded as a positive missing-label sample, and otherwise as a negative missing-label sample.
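The aggregation and constant-threshold decision can be sketched as follows. The sketch assumes that f_j aggregates the K confidences g_jk(+2|x) by summation, which is one natural choice; the exact aggregation formula is not reproduced here, and the numeric values are hypothetical.

```python
def aggregate_confidence(pairwise_confidences):
    """Assumed form of f_j(x): the sum of the K values g_jk(+2 | x)."""
    return sum(pairwise_confidences)


def is_positive(pairwise_confidences, a_j):
    """Positive for label y_j when the aggregated confidence exceeds the constant a_j."""
    return aggregate_confidence(pairwise_confidences) > a_j


a_j = 1.5                                    # hypothetical threshold constant
confident_sample = [0.7, 0.6, 0.8]           # outputs of K = 3 coupled classifiers
doubtful_sample = [0.2, 0.1, 0.3]

confident_is_positive = is_positive(confident_sample, a_j)
doubtful_is_positive = is_positive(doubtful_sample, a_j)
```

Because a_j is a constant per label rather than a function of x, it can be tuned once on the training set and reused for every sample to be predicted.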
In the embodiment of the application, the threshold constant a_j is chosen by means of the F1 measure, which is commonly used to evaluate the performance of a binary classifier, especially when the class distribution is skewed. Consider the F1 value achieved by applying {f_j, a} to the binary training set, namely:
The threshold constant a_j is obtained by maximizing the corresponding F1 value, namely:
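The selection of the threshold constant by maximizing F1 can be sketched as a simple search over candidate thresholds. The candidate set used here (the observed confidences plus one value below all of them) is an assumption made for illustration, as are the function names and data.

```python
def f1_score(y_true, y_pred):
    """F1 value of binary predictions; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(confidences, y_true):
    """Return the candidate threshold a with the highest F1, and that F1 value."""
    candidates = [min(confidences) - 1.0] + sorted(set(confidences))
    best_a, best_f1 = candidates[0], -1.0
    for a in candidates:
        predictions = [c > a for c in confidences]
        score = f1_score(y_true, predictions)
        if score > best_f1:                  # keep the first threshold reaching the maximum
            best_a, best_f1 = a, score
    return best_a, best_f1


# Hypothetical aggregated confidences f_j(x) and skewed ground truth for label y_j.
confidences = [0.2, 0.4, 0.9, 1.5, 2.1]
y_true = [0, 0, 1, 0, 1]

a_j, f1_at_a_j = best_threshold(confidences, y_true)
```

On this toy data the search selects a_j = 0.4, which keeps both positive samples while admitting only one false positive.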
Embodiment two:
Fig. 3 is a block diagram of a multi-label learning apparatus based on sample missing label enhancement according to an embodiment of the present application.
As shown in fig. 3, an embodiment of the present application provides a multi-tag learning apparatus based on sample missing tag enhancement, which includes a data acquisition module 10, a preprocessing module 20, a learning aggregation module 30, and a prediction output module 40;
a data acquisition module 10, configured to acquire a training data set of missing tag samples;
a preprocessing module 20, configured to preprocess the training data set to obtain a processed training set for recovering the real tag;
the learning aggregation module 30 is configured to perform learning and aggregation processing by adopting an algorithm adaptation strategy according to the processing training set, so as to obtain a classifier for multi-label learning;
A prediction output module 40, configured to use the classifier as a label prediction model; and inputting the sample to be predicted into a label prediction model to obtain a label corresponding to the sample to be predicted.
In the embodiment of the present application, the preprocessing module 20 is further configured to perform optimization processing on the training data set by adopting a tensor singular value decomposition manner with low rank constraint, so as to obtain correlation data of the missing tag sample; processing the training data set by adopting a mapping model to obtain mapping data; constructing an optimization model according to the mapping data, the correlation data and the training set data; performing iterative update convergence calculation on the optimization model by adopting a first-order gradient quasi-Newton method to obtain the mapping optimal parameters of the label distribution; and processing the mapping data by thresholding according to the mapping optimal parameters to obtain a processing training set for recovering the real labels.
It should be noted that the modules of the apparatus of embodiment two correspond to the steps of the method of embodiment one. Since the multi-label learning method based on sample missing label enhancement is described in detail in embodiment one, the modules of the apparatus are not described in detail again in embodiment two.
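The cooperation of the four modules can be pictured with a small Python sketch in which each module is a plain callable. The class name, the toy module bodies and the data are hypothetical placeholders; in particular, the toy preprocessing step does not perform label recovery and the toy learner is not a multi-class imbalance learner.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultiLabelDevice:
    acquire: Callable[[], list]                                   # data acquisition module 10
    preprocess: Callable[[list], list]                            # preprocessing module 20
    learn: Callable[[list], Callable[[List[float]], List[int]]]   # learning aggregation module 30

    def predict(self, sample):
        """Prediction output module 40: use the learned classifier as the label prediction model."""
        training_set = self.acquire()
        processed = self.preprocess(training_set)
        classifier = self.learn(processed)
        return classifier(sample)


def toy_acquire():
    return [([0.1, 0.9], [0, 1]), ([0.8, 0.2], [1, 0])]


def toy_preprocess(training_set):
    return list(training_set)        # placeholder for the label recovery step


def toy_learn(processed_set):
    def classifier(sample):
        return [1 if value > 0.5 else 0 for value in sample]
    return classifier


device = MultiLabelDevice(toy_acquire, toy_preprocess, toy_learn)
predicted_labels = device.predict([0.7, 0.3])
```

Keeping the modules as interchangeable callables mirrors the statement that the division into modules is a logical one and may be realized differently in practice.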
Embodiment three:
An embodiment of the application provides a terminal device, which comprises a processor and a memory;
the memory is configured to store program code and transmit the program code to the processor;
the processor is configured to execute the multi-label learning method based on sample missing label enhancement described above according to instructions in the program code.
It should be noted that the processor is configured to execute the steps in the above-described embodiment of the multi-tag learning method based on sample missing tag enhancement according to the instructions in the program code. In the alternative, the processor, when executing the computer program, performs the functions of the modules/units in the system/apparatus embodiments described above.
For example, a computer program may be split into one or more modules/units, which are stored in a memory and executed by a processor to perform the present application. One or more of the modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the terminal device.
The terminal device may be a computing device such as a desktop computer, a notebook computer, a handheld computer, or a cloud server. The terminal device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the terminal device is not limited to these components: it may include more or fewer components than those illustrated, combine certain components, or use different components. For example, the terminal device may also include input and output devices, network access devices, buses, and the like.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may be an internal storage unit of the terminal device, such as a hard disk or a memory of the terminal device. The memory may also be an external storage device of the terminal device, such as a plug-in hard disk provided on the terminal device, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like. Further, the memory may also include both an internal storage unit of the terminal device and an external storage device. The memory is used for storing computer programs and other programs and data required by the terminal device. The memory may also be used to temporarily store data that has been output or is to be output.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A multi-label learning method based on sample missing label enhancement, characterized by comprising the following steps:
acquiring a training data set of the missing label sample;
preprocessing the training data set to obtain a processing training set for recovering the real label;
classifying and aggregating by adopting an algorithm adaptation strategy according to the processing training set to obtain a classifier for multi-label learning;
taking the classifier as a label prediction model; and inputting the sample to be predicted into a label prediction model to obtain a label corresponding to the sample to be predicted.
2. The sample missing tag enhancement based multi-tag learning method of claim 1, wherein preprocessing the training dataset to obtain a processed training set that recovers real tags comprises:
Optimizing the training data set by adopting a tensor singular value decomposition mode with low rank constraint to obtain correlation data of the missing tag sample;
processing the training data set by adopting a mapping model to obtain mapping data;
constructing an optimization model according to the mapping data, the correlation data and the training set data;
performing iterative update convergence calculation on the optimization model by adopting a first-order gradient quasi-Newton method to obtain the mapping optimal parameters of the label distribution; and processing the mapping data by thresholding according to the mapping optimal parameters to obtain a processing training set for recovering the real labels.
3. The sample missing tag enhancement-based multi-tag learning method of claim 2, wherein optimizing the training data set in a tensor singular value decomposition manner with low rank constraint to obtain correlation data of missing tag samples comprises:
constructing an augmented Lagrangian equation from the feature matrix and the label matrix of the training data set and from the tensor singular value decomposition based on the tensor nuclear norm;
performing iterative update calculations on the augmented Lagrangian equation to obtain the converged correlation data;
the augmented Lagrangian equation is:
where the equation contains an auxiliary tensor variable; A_1, A_2 and a third multiplier are Lagrange multipliers; X is the feature matrix of the training data set; Y is the label matrix of the training data set; 𝒞 and ℰ are third-order tensors constructed from {C^(i)}_{i=1,2} and {E^(i)}_{i=1,2}, respectively; ‖·‖_F is the Frobenius norm of a matrix; λ_2, μ and p are balance coefficients with different values; the converged low-rank representation is the correlation data; and E is the corruption of the missing-label samples.
4. The sample missing tag enhancement based multi-tag learning method of claim 2, wherein the optimization model is:
where n is the total number of missing-label samples in the training data set; y_i is the q-dimensional logical (binary) label vector of the i-th missing-label sample; θ is the weight matrix and b is the bias vector; Y is the label matrix of the training data set; I is the identity matrix; the superscript T denotes the matrix transpose; ξ = [ξ(x_1), …, ξ(x_n)], where ξ(x_i) is the embedding of the d-dimensional real-valued feature vector of the i-th missing-label sample into a high-dimensional space by a Gaussian kernel function; ‖·‖_F is the Frobenius norm of a matrix; and λ_1 is a balance parameter; the correlation data is the sample-correlation matrix obtained from the low-rank step, and the label distribution matrix collects the recovered label distributions of all samples.
5. The sample missing tag enhancement based multi-tag learning method of claim 2, wherein processing the mapping data using thresholding according to the mapping optimal parameters to obtain a processing training set to recover real tags comprises:
Carrying out mapping processing on the mapping data according to the mapping optimal parameters to obtain mapping label distribution;
normalizing the mapping tag distribution to obtain tag distribution and a tag distribution matrix;
processing the label distribution with a thresholding formula to obtain the real labels;
constructing a processing training set by n missing label samples and real labels corresponding to the n missing label samples;
the thresholding formula is:
thr∈[0,1]
where y_l is the label of the l-th class, the left-hand side of the formula is the real label of the j-th column, Y_j is the set of all labels in the j-th column of the label matrix, the distribution term is the label distribution value of the l-th class in the j-th column, and y_{l*} is the label class with the highest label distribution value.
6. The sample missing tag enhancement-based multi-tag learning method of claim 1, wherein classifying and aggregating with an algorithm adaptation strategy according to the processing training set to obtain the classifier for multi-label learning comprises:
dividing the processing training set, for each class of labels, into positive missing-label samples and negative missing-label samples to obtain a classified data set;
selecting a real label of the l-th class and a real label of the K-th class from the classified data set, and cross-coupling the real label of the K-th class with the real label of the l-th class to obtain a coupled data set of the labels, where l and K both belong to the q label classes and l is not equal to K;
learning all the coupled data sets with K multi-class imbalance learners and then aggregating the results to obtain the classifier for multi-label learning.
7. The sample missing tag-based enhanced multi-tag learning method of claim 6, comprising: acquiring a threshold constant, and processing each type of real label by adopting a real value mapping function of a classifier to obtain a prediction confidence coefficient corresponding to each type of real label; and distinguishing the types of the missing label samples corresponding to each type of real labels according to whether the prediction confidence is larger than a threshold constant.
8. A multi-label learning device based on sample missing label enhancement, characterized by comprising a data acquisition module, a preprocessing module, a learning aggregation module and a prediction output module;
the data acquisition module is used for acquiring a training data set of the missing label sample;
the preprocessing module is used for preprocessing the training data set to obtain a processing training set for recovering the real label;
the learning aggregation module is used for performing classification and aggregation with an algorithm adaptation strategy according to the processing training set to obtain a classifier for multi-label learning;
The prediction output module is used for taking the classifier as a label prediction model; and inputting the sample to be predicted into a label prediction model to obtain a label corresponding to the sample to be predicted.
9. The multi-label learning device based on sample missing label enhancement according to claim 8, wherein the preprocessing module is further configured to perform optimization processing on the training data set by adopting a tensor singular value decomposition mode of low-rank constraint to obtain correlation data of missing label samples; processing the training data set by adopting a mapping model to obtain mapping data; constructing an optimization model according to the mapping data, the correlation data and the training set data; performing iterative update convergence calculation on the optimization model by adopting a first-order gradient quasi-Newton method to obtain the mapping optimal parameters of the label distribution; and processing the mapping data by thresholding according to the mapping optimal parameters to obtain a processing training set for recovering the real labels.
10. A terminal device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
The processor is configured to execute the multi-label learning method based on sample missing label enhancement according to any of claims 1-7 according to instructions in the program code.
CN202310813822.0A 2023-07-04 2023-07-04 Sample missing label enhancement-based multi-label learning method, device and equipment Pending CN116777006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310813822.0A CN116777006A (en) 2023-07-04 2023-07-04 Sample missing label enhancement-based multi-label learning method, device and equipment

Publications (1)

Publication Number Publication Date
CN116777006A true CN116777006A (en) 2023-09-19

Family

ID=87989393

Country Status (1)

Country Link
CN (1) CN116777006A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576012A (en) * 2023-11-10 2024-02-20 中国矿业大学 Disease prediction method based on unbalanced fundus image data
CN117576012B (en) * 2023-11-10 2024-05-07 中国矿业大学 Disease prediction method based on unbalanced fundus image data
CN117420809A (en) * 2023-12-18 2024-01-19 台山市南特金属科技有限公司 Crankshaft machining optimization decision method and system based on artificial intelligence
CN117420809B (en) * 2023-12-18 2024-03-01 台山市南特金属科技有限公司 Crankshaft machining optimization decision method and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination