CN110598733A

CN110598733A - Multi-label distance measurement learning method based on interactive modeling

Info

Publication number: CN110598733A
Application number: CN201910715600.9A
Authority: CN
Inventors: Not disclosed
Original assignee: Nanjing Zhigu Artificial Intelligence Research Institute Co Ltd
Current assignee: Nanjing Zhigu Artificial Intelligence Research Institute Co Ltd
Priority date: 2019-08-05
Filing date: 2019-08-05
Publication date: 2019-12-20
Also published as: WO2021022571A1

Abstract

The invention discloses a multi-label distance metric learning method based on interactive modeling. The method comprises: extracting training data from any multi-label application scenario and annotating it with multiple labels; preprocessing the extracted training samples to improve sample quality; representing the distance metric matrix to be learned in a combined distance metric form; defining a multi-label semantic similarity computed jointly from features and labels, and constructing a triplet constraint set; combining the combined distance metric with the triplet constraint set to construct a multi-label distance metric learning model, and solving that model by optimization; after the distance metric is learned, mapping the training data into the distance metric space and training an existing multi-label learning algorithm there to obtain a multi-label classifier based on distance metric learning; and inputting a sample to be predicted into the classifier to obtain a labeled sample. The invention can greatly reduce the time complexity of a multi-label learning system and make the multi-label learning framework more practical.

Description

Multi-label distance measurement learning method based on interactive modeling

Technical Field

The invention relates to a multi-label distance metric learning method based on interactive modeling, and in particular to one based on interactive modeling of the feature space and the label space. It is applicable to any multi-label learning scenario and belongs to the technical field of machine learning.

Background

In recent years, multi-label learning has attracted considerable attention from researchers and has produced a large body of work. However, because the label space is combinatorial, multi-label learning has high complexity and is difficult to apply in practical scenarios. Most existing multi-label learning methods model the correlations among labels in the label space and do comparatively little with the feature space. Analysis of multi-label data shows that the feature space contains redundancy. Constructing a suitable multi-label distance metric representation can therefore greatly improve the learning performance of a multi-label system, while combining the distance metric with a nearest-neighbor strategy reduces the complexity of the learning system and so promotes the practical application of multi-label learning. In view of the above problems, further research is needed, which motivates the learning method of the present invention.

Disclosure of Invention

Purpose of the invention: to overcome the shortcomings of the prior art, the invention provides a multi-label distance metric learning method based on interactive modeling that, working from the feature space, takes into account both the correlations between labels and the structural interaction between the feature space and the label space.

The invention adopts the following technical scheme:

The invention discloses a multi-label distance metric learning method based on interactive modeling, which comprises the following steps:

(1) extracting training data from any multi-label application scenario and annotating the training data with multiple labels;

(2) preprocessing the extracted training samples by filtering out samples whose label occupancy rate is below a set threshold, thereby improving sample quality;

(3) based on the Mahalanobis distance metric learning framework, and considering the structural interaction between the feature space and the label space in multi-label data, expressing the distance metric matrix to be learned in a combined distance metric form;

(4) defining a multi-label semantic similarity computed jointly from features and labels, and constructing a triplet constraint set;

(5) combining steps (3) and (4) to obtain a multi-label distance metric learning model, and solving the model by optimization;

(6) after the distance metric is learned, mapping the training data into the distance metric space and then training an existing multi-label learning algorithm (MLKNN) on it, thereby obtaining a multi-label classifier based on distance metric learning;

(7) inputting a sample to be predicted into the classifier to obtain a labeled sample;

(8) spot-checking the labeling results; if they are acceptable, the process ends; otherwise, returning to step (1) and continuing to extract samples to adjust and update the model.

The application scenarios in step (1) include, but are not limited to, image, text, and video scenarios.

Preferably, step (3) represents the distance metric to be learned in the following combined metric form:

$$M = \sum_{l=1}^{K} w_l \, b_l b_l^T$$

where $b_l \in R^d$ is a combined basis vector and $w_l \ge 0$ is a non-negative combination coefficient (weight) to be learned; $l$ is the summation index, running from 1 to $K$; $K$ is the upper bound of the summation and is a user-defined value; the method constructs component basis vectors for each label; and $T$ denotes the transpose.

More preferably, the basis vectors $b$ are constructed using the discriminative information of the labels, taking into account the structured interaction between labels and features. Specifically, for each label, the first $K$ eigenvectors obtained by Principal Component Analysis (PCA) are used as local metric vectors, so that there are $K \times q$ basis vectors $b$ in total and the combination coefficient $w$ is a vector of dimension $1 \times Kq$.
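As a short worked step, assuming the combined metric has the usual sum-of-rank-one form $M = \sum_{l} w_l b_l b_l^T$ and is used as a squared Mahalanobis distance (the symbol $d_M$ is introduced here only for illustration), the distance between two samples decomposes into weighted squared projections onto the basis vectors. This shows why requiring $w_l \ge 0$ is enough to keep the distance non-negative:

```latex
d_M^2(x, x') = (x - x')^T M (x - x')
             = \sum_{l=1}^{K} w_l \, (x - x')^T b_l b_l^T (x - x')
             = \sum_{l=1}^{K} w_l \, \bigl( b_l^T (x - x') \bigr)^2 \;\ge\; 0
```

Each weight $w_l$ therefore scales how much the direction $b_l$ contributes to the distance, and a weight of zero removes that direction entirely.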

Further preferably, step (4) comprises the following steps:

(41) obtaining a semantic similarity matrix $S$ by jointly computing over the discriminative information of the feature space and the label space:

$$s_{ij} = y_i^T G \, y_j$$

where $G = \alpha A + (1 - \alpha) C$

where $i$ and $j$ are the indices of samples $x_i$ and $x_j$; $A$ is a $q \times q$ matrix, $q$ being the number of labels, whose entries are given by the formula for $a_{lh}$, a summation whose upper bound $m$ is the number of samples; $a_{lh}$ corresponds to the correlation between labels $y_l$ and $y_h$, and since the correlation between labels tends to be asymmetric, $a_{lh} \ne a_{hl}$ in general; $C$ is a $q \times q$ matrix whose entries are given by the formula for $c_{lh}$, the cosine similarity of the corresponding combined basis vectors; $\alpha$ is a balance coefficient that balances the relative contributions of the feature space and the label space to the semantic similarity; and $T$ denotes the transpose;

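(42) setting a threshold parameter $\theta$ for the semantic similarity; according to the definition of the semantic similarity matrix, for a training sample $x_i$ the other samples can be divided into a similar set $Z_i$ and a dissimilar set $\bar{Z}_i$:

$$Z_i = \{ x_j \mid s_{ij} \ge \theta,\ j \ne i,\ 1 \le j \le m \}$$

$$\bar{Z}_i = \{ x_j \mid s_{ij} < \theta,\ j \ne i,\ 1 \le j \le m \}$$

(43) finding the $K$ samples nearest to $x_i$ in Euclidean distance in the similar set and in the dissimilar set, respectively, to form the sets $K_i$ and $\bar{K}_i$; the triplet constraint set $R$ is then expressed as follows:

$$R = \{ (x_i, x_j, x_k) \mid x_j \in K_i,\ x_k \in \bar{K}_i,\ 1 \le i \le m \}$$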

Still more preferably, the learning model of step (5) is as follows:

$$\zeta_i \ge 0, \quad w_i \ge 0 \quad (1 \le i \le m)$$

where the distance term of the model represents the distance between two samples in the embedding space, and the dissimilarity term indicates the dissimilarity of the two label vectors; $w$ is the non-negative combination coefficient (weight) to be learned from step (3), $C$ is a balance coefficient, $m$ is the number of samples, $\zeta$ is a slack variable representing the loss, and $R$ is the triplet constraint set.

Beneficial effects of the invention: compared with the prior art, the multi-label distance metric learning method based on interactive modeling provided by the invention makes full use of the correlations among labels and the interaction between the feature space and the label space, and can handle any multi-label learning scenario. By combining the distance metric with a k-nearest-neighbor strategy, it reduces the complexity of the multi-label learning system and promotes the practical application of the multi-label learning framework.

Drawings

FIG. 1 is a flowchart of a multi-label distance metric learning method based on interactive modeling according to the present invention.

Detailed Description

The invention is described in further detail below with reference to the figures and specific examples.

As shown in FIG. 1, the invention is a multi-label distance metric learning method based on feature space and label space interactive modeling, comprising the following steps:

(1) For any multi-label application scenario, such as images, videos, or text, samples are collected to obtain training data; the corresponding features are extracted and the samples are labeled manually, giving the training data $D = \{ (x_i, y_i) \mid 1 \le i \le m \}$, where $x_i \in \mathcal{X}$ is a d-dimensional feature vector and $y_i$ is the label set of example $x_i$.

(2) The extracted training samples are preprocessed: samples whose label occupancy rate is below a set threshold are filtered out, which improves sample quality.
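The filtering in step (2) could be sketched as follows. This is only an illustration under a stated assumption: the text does not define "label occupancy rate", so it is taken here to mean the fraction of the q labels that are relevant to a sample. The function name `filter_by_label_occupancy` and the toy data are hypothetical.

```python
import numpy as np

def filter_by_label_occupancy(X, Y, threshold):
    """Keep samples whose label occupancy is at least `threshold`.

    Assumption: "label occupancy" is the fraction of the q labels that
    are relevant (1) for a sample; the source does not define it.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    occupancy = Y.mean(axis=1)        # fraction of relevant labels per sample
    keep = occupancy >= threshold     # samples below the threshold are dropped
    return X[keep], Y[keep]

# Toy data: 4 samples, 2 features, 4 labels (binary indicator matrix).
X = np.arange(8.0).reshape(4, 2)
Y = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 1, 0]])
X_kept, Y_kept = filter_by_label_occupancy(X, Y, threshold=0.25)
```

With this threshold the unlabeled third sample is removed and the other three are kept.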

(3) Based on the Mahalanobis distance metric learning framework, and considering the structural interaction between the feature space and the label space in multi-label data, the distance metric matrix to be learned is expressed in a combined distance metric form.

Based on the characteristics of multi-label data, the distance metric is represented in the combined metric form:

$$M = \sum_{l=1}^{K} w_l \, b_l b_l^T$$

where $b_l \in R^d$ is a combined basis vector and $w_l \ge 0$ is the non-negative combination coefficient (weight) to be learned, which describes the importance of each basis vector. The basis vectors $b$ are constructed using the discriminative information of the labels, taking into account the structured interaction between labels and features. Specifically, for each label, the first $K$ eigenvectors obtained by Principal Component Analysis (PCA) are used as local metric vectors, so that there are $K \times q$ basis vectors $b$ in total and the combination coefficient $w$ is a vector of dimension $1 \times Kq$.
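One way the per-label basis vectors and the combined metric might be computed is sketched below. It assumes that PCA for a label is run on the samples carrying that label (the text does not say which samples are used) and that the metric is the weighted sum of outer products of the basis vectors. The names `label_basis_vectors` and `combined_metric`, and the random toy data, are hypothetical.

```python
import numpy as np

def label_basis_vectors(X, Y, k):
    """Return a (k*q, d) array holding the top-k principal directions per label.

    Assumption: PCA for label l is run on the samples that carry label l.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    bases = []
    for l in range(Y.shape[1]):
        X_l = X[Y[:, l] == 1]
        X_c = X_l - X_l.mean(axis=0)          # center before PCA
        # Rows of Vt are principal directions, sorted by explained variance.
        _, _, Vt = np.linalg.svd(X_c, full_matrices=False)
        bases.append(Vt[:k])
    return np.vstack(bases)

def combined_metric(B, w):
    """M = sum_l w_l * b_l b_l^T, with the rows of B as the basis vectors b_l."""
    return (B * w[:, None]).T @ B

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                  # 20 samples, d = 5
Y = np.zeros((20, 3), dtype=int)              # q = 3 labels
Y[:10, 0] = 1
Y[5:15, 1] = 1
Y[10:, 2] = 1

B = label_basis_vectors(X, Y, k=2)            # K * q = 6 basis vectors
w = np.full(B.shape[0], 0.5)                  # placeholder non-negative weights
M = combined_metric(B, w)
```

Because every weight is non-negative, the resulting `M` is symmetric positive semidefinite, which is what a Mahalanobis-type metric requires.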

(4) A multi-label semantic similarity computed jointly from features and labels is defined, and a triplet constraint set is constructed. The specific substeps are as follows:

(41) In the multi-label scenario, because labels of multiple classes can be combined, a semantic similarity matrix $S$ is obtained by jointly computing over the discriminative information of the feature space and the label space:

$$s_{ij} = y_i^T G \, y_j$$

where $G = \alpha A + (1 - \alpha) C$

where $a_{lh}$ corresponds to the correlation between labels $y_l$ and $y_h$; since the correlation between labels tends to be asymmetric, $a_{lh} \ne a_{hl}$ in general. $c_{lh}$ is the cosine similarity of the corresponding combined basis vectors, and $\alpha$ is a balance coefficient that balances the relative contributions of the feature space and the label space to the semantic similarity.
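The similarity computation in substep (41) can be illustrated with a small sketch. The formula for the label correlation is not written out in the text, so the helper `label_cooccurrence` uses an assumed asymmetric choice, the conditional co-occurrence frequency of label h given label l. The cosine similarity helper likewise assumes one representative vector per label. All function names and the toy data are hypothetical.

```python
import numpy as np

def label_cooccurrence(Y):
    """Assumed asymmetric label correlation: a_lh = P(y_h = 1 | y_l = 1)."""
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                             # counts[l, h] = samples with both labels
    totals = np.maximum(np.diag(counts), 1.0)    # samples with label l (avoid divide-by-zero)
    return counts / totals[:, None]

def cosine_similarity_matrix(V):
    """c_lh = cosine similarity between rows l and h of V (one vector per label)."""
    V = np.asarray(V, dtype=float)
    unit = V / np.linalg.norm(V, axis=1, keepdims=True)
    return unit @ unit.T

def semantic_similarity(Y, A, C, alpha):
    """S = Y G Y^T with G = alpha * A + (1 - alpha) * C, so s_ij = y_i^T G y_j."""
    Y = np.asarray(Y, dtype=float)
    G = alpha * A + (1.0 - alpha) * C
    return Y @ G @ Y.T

Y = np.array([[1, 0],
              [1, 1],
              [1, 1]])
A = label_cooccurrence(Y)                         # asymmetric: A[0, 1] != A[1, 0]
C = cosine_similarity_matrix(np.eye(2))           # orthogonal label vectors
S = semantic_similarity(Y, A, C, alpha=0.5)
```

Since `A` is asymmetric, `S` is in general asymmetric as well, so the similarity of sample j as seen from sample i can differ from the reverse.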

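(42) Based on the definition of the semantic similarity matrix, for each training sample $x_i$ the training set can be divided into a similar set

$$Z_i = \{ x_j \mid s_{ij} \ge \theta,\ j \ne i,\ 1 \le j \le m \}$$

and a dissimilar set

$$\bar{Z}_i = \{ x_j \mid s_{ij} < \theta,\ j \ne i,\ 1 \le j \le m \}$$

where $\theta$ is a threshold parameter of the semantic similarity.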
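The thresholding in substep (42) amounts to a pair of boolean masks over the similarity matrix. The sketch below assumes that the dissimilar set is the complement of the similar set (similarity strictly below the threshold, excluding the sample itself). The function name `split_by_similarity` and the toy matrix are hypothetical.

```python
import numpy as np

def split_by_similarity(S, theta):
    """Return boolean masks (similar, dissimilar), each of shape (m, m).

    similar[i, j] is True when s_ij >= theta and j != i (x_j is in Z_i).
    dissimilar[i, j] is True when s_ij < theta and j != i (assumed complement).
    """
    S = np.asarray(S, dtype=float)
    off_diagonal = ~np.eye(S.shape[0], dtype=bool)
    similar = (S >= theta) & off_diagonal
    dissimilar = (S < theta) & off_diagonal
    return similar, dissimilar

S = np.array([[2.0, 1.5, 0.2],
              [1.5, 2.0, 0.4],
              [0.2, 0.4, 1.0]])
similar, dissimilar = split_by_similarity(S, theta=1.0)
Z_0 = np.flatnonzero(similar[0])          # indices of samples similar to x_0
Z_0_bar = np.flatnonzero(dissimilar[0])   # indices of samples dissimilar to x_0
```

Row i of each mask lists the members of the corresponding set for sample i, which is the form the neighbor search in the next substep needs.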

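(43) The $K$ samples nearest to $x_i$ in Euclidean distance are found in the similar set and in the dissimilar set, respectively, forming the sets $K_i$ and $\bar{K}_i$. The triplet constraint set $R$ is then expressed as follows:

$$R = \{ (x_i, x_j, x_k) \mid x_j \in K_i,\ x_k \in \bar{K}_i,\ 1 \le i \le m \}$$

where $R$ comprises the triplets formed in this way for all $m$ training samples.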
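A possible construction of the triplets in substep (43) is sketched below. It assumes each triplet pairs a sample with one of its nearest similar neighbors and one of its nearest dissimilar neighbors, and it represents triplets as index tuples. The function name `build_triplets`, the group-based masks, and the toy data are hypothetical.

```python
import numpy as np

def build_triplets(X, similar, dissimilar, k):
    """Return a list of (i, j, h) index triplets.

    For each sample i, j ranges over its k nearest similar samples and
    h over its k nearest dissimilar samples (Euclidean distance).
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (m, m).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    triplets = []
    for i in range(X.shape[0]):
        sim_idx = np.flatnonzero(similar[i])
        dis_idx = np.flatnonzero(dissimilar[i])
        near_sim = sim_idx[np.argsort(dist[i, sim_idx])[:k]]
        near_dis = dis_idx[np.argsort(dist[i, dis_idx])[:k]]
        triplets.extend((i, int(j), int(h)) for j in near_sim for h in near_dis)
    return triplets

# Toy data: two groups stand in for the similar / dissimilar partition.
X = np.array([[0.0], [1.0], [3.0], [10.0], [12.0]])
group = np.array([0, 0, 0, 1, 1])
same = group[:, None] == group[None, :]
similar = same & ~np.eye(5, dtype=bool)
dissimilar = ~same

triplets = build_triplets(X, similar, dissimilar, k=1)
```

Each triplet is later read as a constraint that sample i should end up closer to j than to h under the learned metric.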

(5) Combining steps (3) and (4) yields the following multi-label distance metric learning model:

$$\zeta_i \ge 0, \quad w_i \ge 0 \quad (1 \le i \le m)$$

where the distance term of the model represents the distance between two samples in the embedding space, and the dissimilarity term indicates the dissimilarity of the two label vectors.

This optimization problem can be solved for the combination weights $w$ using the FISTA iterative optimization algorithm.
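To make the solver concrete, here is a generic FISTA sketch for a non-negative weight vector. It is not the patent's model: the objective is not written out in the text, so a simple quadratic stands in for the smooth part, and the L1 weight `lam` is a hypothetical sparsity parameter. The function name `fista_nonneg_l1` is likewise an assumption.

```python
import numpy as np

def fista_nonneg_l1(grad, lipschitz, lam, w0, n_iter=200):
    """FISTA for min_w f(w) + lam * ||w||_1 subject to w >= 0.

    grad:      function returning the gradient of the smooth part f
    lipschitz: Lipschitz constant of grad (the step size is 1 / lipschitz)
    """
    w = np.asarray(w0, dtype=float).copy()
    z, t = w.copy(), 1.0
    for _ in range(n_iter):
        # Proximal step: for w >= 0 the L1 prox reduces to shift-and-clip.
        w_next = np.maximum(z - (grad(z) + lam) / lipschitz, 0.0)
        # Nesterov momentum update.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w

# Stand-in smooth objective f(w) = 0.5 * ||w - b||^2, whose gradient is w - b.
b = np.array([3.0, -1.0, 0.5])
w_hat = fista_nonneg_l1(grad=lambda w: w - b, lipschitz=1.0, lam=1.0,
                        w0=np.zeros(3))
```

For this stand-in problem the minimizer is known in closed form, `max(b - lam, 0)`, which makes the sketch easy to check. In the actual model, the gradient of the triplet loss with respect to `w` would take the place of `grad`.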

(6) After the distance metric is learned, the training data are mapped into the distance metric space, and an existing multi-label learning algorithm is then trained on them to obtain a multi-label classifier based on distance metric learning.

(7) The sample to be predicted is input into the classifier to obtain a labeled sample: the sample is first mapped into the learned distance metric space and then fed to the multi-label classifier for label prediction.
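Steps (6) and (7) can be illustrated as follows, assuming the learned metric is a weighted sum of outer products of the basis vectors. Under that assumption, scaling each projection by the square root of its weight gives a space in which ordinary Euclidean distance equals the learned distance. The classifier below is a plain per-label nearest-neighbor vote used as a simplified stand-in, not the full ML-kNN algorithm. The names `to_metric_space` and `knn_predict` and the toy data are hypothetical.

```python
import numpy as np

def to_metric_space(X, B, w):
    """Map rows of X so that Euclidean distance equals the learned distance.

    With M = sum_l w_l b_l b_l^T, the map x -> sqrt(w) * (B x) satisfies
    ||map(x) - map(x')||^2 = (x - x')^T M (x - x').
    """
    return (np.asarray(X, dtype=float) @ B.T) * np.sqrt(w)

def knn_predict(Z_train, Y_train, z, k):
    """Plain k-nearest-neighbor majority vote per label (not full ML-kNN)."""
    dist = np.linalg.norm(Z_train - z, axis=1)
    nearest = np.argsort(dist)[:k]
    return (Y_train[nearest].mean(axis=0) > 0.5).astype(int)

B = np.eye(2)                       # two basis vectors (rows)
w = np.array([1.0, 0.0])            # learned weights: only feature 0 matters
X_train = np.array([[0.0, 5.0], [0.1, -5.0], [1.0, 0.0], [1.1, 9.0]])
Y_train = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
x_new = np.array([0.05, 9.0])

Z_train = to_metric_space(X_train, B, w)
z_new = to_metric_space(x_new, B, w)
pred = knn_predict(Z_train, Y_train, z_new, k=2)
```

In the original feature space the new sample lies closest to the last training point, but in the mapped space the uninformative second feature is ignored and the two nearest neighbors are the first two training points.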

In summary, the invention provides a multi-label distance metric learning method based on interactive modeling of the feature space and the label space. It makes full use of the correlations between labels and the interaction between the feature space and the label space, and can handle any multi-label learning scenario. By combining the distance metric with a k-nearest-neighbor strategy, it reduces the complexity of the multi-label learning system and promotes the practical application of the multi-label learning framework.

The above description covers only the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and these are intended to fall within the scope of the invention.

Claims

1. A multi-label distance metric learning method based on interactive modeling, characterized by comprising the following steps:

(6) after the distance metric is learned, mapping the training data into the distance metric space, and then training an existing multi-label learning algorithm on them to obtain a multi-label classifier based on distance metric learning;

(7) inputting a sample to be predicted into the multi-label classifier to obtain a labeled sample;

2. The interaction modeling based multi-label distance metric learning method as claimed in claim 1, wherein the application scene in step (1) comprises image, text and video scenes.

3. The interaction modeling based multi-label distance metric learning method according to claim 1, wherein the distance metric to be learned is represented in step (3) in the following combined metric form:

$$M = \sum_{l=1}^{K} w_l \, b_l b_l^T$$

where $b_l \in R^d$ is a combined basis vector and $w_l \ge 0$ is the non-negative combination coefficient (weight) to be learned; $l$ is the summation index, running from 1 to $K$; $K$ is the upper bound of the summation and is a user-defined value; the method constructs component basis vectors for each label; and $T$ denotes the transpose.

4. The interaction modeling based multi-label distance metric learning method according to claim 1, wherein the step (4) comprises the steps of:

$$s_{ij} = y_i^T G \, y_j$$

where $G = \alpha A + (1 - \alpha) C$

where $i$ and $j$ are the indices of samples $x_i$ and $x_j$; $A$ is a $q \times q$ matrix, $q$ being the number of labels, whose entries are given by the formula for $a_{lh}$, a summation whose upper bound $m$ is the number of samples; $a_{lh}$ corresponds to the correlation between labels $y_l$ and $y_h$, and since the correlation between labels tends to be asymmetric, $a_{lh} \ne a_{hl}$ in general; $C$ is a $q \times q$ matrix whose entries are given by the formula for $c_{lh}$, the cosine similarity of the corresponding combined basis vectors; $T$ denotes the transpose; and $\alpha$ is a balance coefficient that balances the relative contributions of the feature space and the label space to the semantic similarity;

$$Z_i = \{ x_j \mid s_{ij} \ge \theta,\ j \ne i,\ 1 \le j \le m \}$$

5. The interaction modeling based multi-label distance metric learning method according to claim 1, wherein the learning model of step (5) is as follows:

where $w$ is the non-negative combination coefficient (weight) to be learned from step (3), $C$ is a balance coefficient, $m$ is the number of samples, $\zeta$ is a slack variable representing the loss, and $R$ is the triplet constraint set; the distance term of the model represents the distance between two samples in the embedding space, and the dissimilarity term indicates the dissimilarity of the two label vectors.