CN111027636B

CN111027636B - Unsupervised feature selection method and system based on multi-label learning

Info

Publication number: CN111027636B
Application number: CN201911312573.7A
Authority: CN
Inventors: 朱磊; 石丹
Original assignee: Shandong Normal University
Current assignee: Shandong Normal University
Priority date: 2019-12-18
Filing date: 2019-12-18
Publication date: 2020-09-29
Anticipated expiration: 2039-12-18
Also published as: CN111027636A

Abstract

The present disclosure provides an unsupervised feature selection method and system based on multi-label learning, including: extracting features of each acquired data sample to obtain a feature data set, learning a binary multi-label matrix and a feature selection matrix for the feature data set, and constructing an unsupervised feature selection objective function based on multi-label learning; solving an unsupervised feature selection target function based on multi-label learning by adopting a discrete optimization method based on an augmented Lagrange multiplier method to obtain a feature selection matrix; and sequencing the feature selection matrix to determine the selected target features. Simultaneously learning multi-label and executing feature selection for semantic guidance, and applying binary constraint in spectrum embedding to obtain multi-label to guide the final feature selection process; in addition, a dynamic sample similarity graph capture data structure is constructed in an adaptive mode, and therefore the discrimination capability of multiple labels is enhanced.

Description

Unsupervised feature selection method and system based on multi-label learning

Technical Field

The disclosure relates to the technical field of feature selection, in particular to an unsupervised feature selection method and system based on multi-label learning.

Background

With the rapid development of information technology, high-dimensional data is emerging in different research fields, such as multimedia computing, data mining, pattern recognition, machine learning and the like. On the one hand, high dimensional data can provide richer information. On the other hand, it also presents a challenging dimensional disaster problem. The high-dimensional data usually contains noise or abnormal values, so that the direct use of such high-dimensional data often has a bad influence on the subsequent learning task and even reduces the performance of the method. To solve this problem, a dimension reduction technique is proposed, which includes two different processing methods: (1) selecting characteristics; (2) and (5) feature extraction.

Feature selection reduces the dimensionality of features by selecting important, discriminative features. Feature selection techniques are mainly divided into two categories depending on whether they rely on data tags or not: (1) supervised feature selection; (2) unsupervised feature selection. Among them, unsupervised feature selection is a more practical, but also more difficult task. For unsupervised feature selection, the most critical issue is how to accurately obtain feature information and use it to guide the feature selection process. In recent years, existing methods employ spectral analysis to explore the intrinsic information structure of the data. These methods include two steps: firstly, creating a sample similarity graph through spectrum analysis; the feature selection matrix is then learned based on spectral embedding.

Although good performance has been achieved, there are still some problems that need to be further solved: (1) the existing unsupervised feature selection method has no label guidance or uses a single label to guide the process of selecting features; the former causes the selected feature semantics to be missing, and the latter causes information loss. (2) The graph created by the existing graph-based feature selection method is not high in quality, a graph is usually constructed by a Gaussian kernel directly on original data, and the graph is kept constant in the whole model learning process. In addition, the process of graph creation and feature selection is split into two separate processes, which also results in a method that is sub-optimal.

Therefore, the key to improve the technical performance is: (1) learning more accurate labels that fit more into the data itself guides feature selection. Real-world data sets, including images, videos, biological data, etc., are often multi-labeled rather than unitary. (2) The quality of the graph is improved, the spectrum analysis is better combined with the feature selection, and the model is more accurately guided to select valuable features.

Disclosure of Invention

In order to overcome the defects of the prior art, the present disclosure provides an unsupervised feature selection method and system based on multi-label learning, which simultaneously learns the multi-labels for semantic guidance and performs feature selection, and applies binary constraint in spectrum embedding to obtain the multi-labels to guide the final feature selection process; in addition, a dynamic sample similarity graph capture data structure is constructed in an adaptive mode, and therefore the discrimination capability of multiple labels is enhanced.

In order to achieve the purpose, the following technical scheme is adopted in the disclosure:

in a first aspect, the present disclosure provides an unsupervised feature selection method based on multi-label learning, including:

extracting features of each acquired data sample to obtain a feature data set, learning a binary multi-label matrix and a feature selection matrix for the feature data set, and constructing an unsupervised feature selection objective function based on multi-label learning;

solving an unsupervised feature selection target function based on multi-label learning by adopting a discrete optimization method based on an augmented Lagrange multiplier method to obtain a feature selection matrix;

the feature selection matrix is ordered to determine the target features to be selected.

As some possible implementations, the unsupervised feature selection objective function based on multi-label learning is:

wherein x is_i,x_j∈R^1×dRespectively represent the i, j (1. ltoreq. i, j. ltoreq. n) th sample, G ∈ R^n×nIs a dynamic similarity graph of learning, g_ijIs an element of row i and column j of G; | | non-woven hair_FIs Frobenius norm (F norm for short) of the matrix;

is the Laplace matrix of the dynamic graph G, and the degree matrix D is a diagonal matrix with the ith diagonal element of

X^TIs the transpose of the data matrix X, B is a binary multi-label matrix, B ∈ {0,1}^n×lIs the discrete constraint of the binary multi-label matrix, l is the length of the binary multi-label, P ∈ R^d×lIs a feature selection matrix, mu, α is a balance parameter, sigma, β is a regularization parameter, and n is the number of samples.

As some possible implementations, the constructing an unsupervised feature selection objective function based on multi-label learning includes:

learning the feature selection matrix by using a regression model to obtain a low-dimensional feature subspace by using L_2,1The norm minimizing item restricts the feature selection matrix to be row sparse;

creating a dynamic sample similarity graph from original characteristic data by a self-adaptive learning method, and obtaining a binary multi-label matrix by adopting spectral analysis;

and constructing the learned binary multi-label matrix and the feature selection matrix into an unsupervised feature selection objective function based on multi-label learning through spectrum embedding.

As some possible implementations, the learned feature selection matrix is represented as:

wherein, P ∈ R^d×lIs a feature selection matrix, X^TIs the transpose of matrix X, B is a binary multi-label matrix, β is a regularization parameter;

is L of the matrix P_2,1And (4) norm.

As some possible implementations, the learning process of the dynamic sample similarity graph is represented as:

wherein G is a dynamic similarity map, G_ijIs the element in the ith (1 ≦ i ≦ n) row and jth (1 ≦ j ≦ n) column of G, σ is the regularization parameter, and n is the number of samples.

As some possible implementation manners, the solving process for solving the unsupervised feature selection objective function based on the multi-label learning by using the discrete optimization method based on the augmented lagrange multiplier specifically includes:

fixing any two variables of a feature selection matrix, a binary multi-label matrix variable and a dynamic similarity graph matrix in the objective function, and solving a third variable;

and setting iteration times, and carrying out iterative solution on the solving process to obtain the local optimal solution of the feature selection matrix.

As some possible implementations, the solving process further includes,

spectral embedding Tr (B) for an objective function^TL_GB) Item, using an auxiliary discrete variable Z ∈ {0,1}^n×lAnd replacing a second binary multi-label matrix B variable, and solving the auxiliary discrete variable by fixing a dynamic similarity graph matrix, a feature selection matrix and a binary multi-label matrix variable in the unsupervised feature selection objective function based on multi-label learning.

In a second aspect, the present disclosure provides an unsupervised feature selection system based on multi-label learning, comprising,

the target function construction module is used for extracting the characteristics of each acquired data sample to obtain a characteristic data set, learning a binary multi-label matrix and a characteristic selection matrix for the characteristic data set, and constructing an unsupervised characteristic selection target function based on multi-label learning;

the solving module is used for solving an unsupervised feature selection target function based on multi-label learning by adopting a discrete optimization method based on an augmented Lagrange multiplier method to obtain a feature selection matrix;

and the selection module is used for sequencing the characteristic selection matrix to determine the target characteristic to be selected.

In a third aspect, the present disclosure provides a computer-readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the steps of the method for unsupervised feature selection based on multi-tag learning.

In a fourth aspect, the present disclosure provides a terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is for storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the unsupervised feature selection method based on multi-tag learning.

Compared with the prior art, the beneficial effect of this disclosure is:

the method simultaneously learns multiple labels for semantic guidance and performs feature selection, and obtains the multiple labels by applying binary constraint in spectrum embedding to guide the final feature selection process in order to select features with discriminant under the unsupervised condition;

creating a dynamic sample similarity graph and applying a binary constraint in spectrum embedding to generate a binary multi-label, so that the label is more instructive and discriminable, and important features can be accurately selected under the guidance of the multi-label;

the unsupervised feature selection method based on multi-label learning can be extended to multi-view learning and can be used for processing unsupervised feature selection problems on multi-view data.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this disclosure, are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the application and not to limit the disclosure.

FIG. 1 is a flow chart of the disclosed method;

FIG. 2 is a flowchart of the method of example 1;

FIG. 3 is a flowchart of the method of example 2.

Detailed Description

The present disclosure is further described with reference to the following drawings and examples.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.

Example 1

The present disclosure provides an unsupervised feature selection method based on multi-label learning, comprising:

s1: acquiring a real data set, and performing feature extraction on each data sample to acquire a data set in a feature form;

s2: constructing an unsupervised feature selection objective function based on multi-label learning for the feature data set;

s3: solving an unsupervised feature selection objective function based on multi-label learning to obtain a feature selection matrix;

s4: and determining the features to be selected in a sorting mode based on the learned feature selection matrix.

As one or more embodiments, the step S1 includes obtaining a feature data set X ∈ R^d×nWhere d is the feature dimension and n is the number of samples.

As one or more embodiments, the step S2 includes:

the present disclosure proposes learning binary multi-labels and performing feature selection simultaneously. Specifically, binary multitags are learned based on spectral analysis to guide the feature selection process and thus to select features with discriminability. Meanwhile, the method adaptively learns a dynamic sample similarity graph (which can well express the characteristic internal structure) from the original data, and the dynamic graph can further improve the discrimination capability of multiple labels. The regression model is used to learn the feature selection matrix and apply L to the feature selection matrix_2,1Norm minimization constrains them to become sparse.

Further, the unsupervised feature selection objective function based on multi-label learning is as follows:

wherein x is_i,x_j∈R^1×dRespectively represent the i, j (1. ltoreq. i, j. ltoreq. n) th sample, G ∈ R^n×nIs a dynamic similarity graph of learning, g_ijIs an element of row i (1. ltoreq. i.ltoreq.n) of G and column j (1. ltoreq. j.ltoreq.n); | | non-woven hair_FIs the F-norm of the matrix (e.g.: C | | non-woven cells)_FIs a matrix C ∈ R^d×nIs defined as the F norm of

X^TIs the transpose of matrix X, B is a binary multi-label matrix, B ∈ {0,1}^n×lIs the discrete constraint of the binary multi-label matrix, l is the length of the binary multi-label, P ∈ R^d×lIs a feature selection matrix, mu, α is a balance parameter, sigma, β is a regularization parameter, and n is the number of samples.

S21: the present disclosure learns a multi-label matrix and a feature selection matrix simultaneously. Learning a feature selection matrix P using a regression model, using L_2,1The norm minimization term constrains the feature selection matrix to be row sparse, and the learning of data multi-labels and feature selection are combined into a unified model through spectrum embedding.

The spectral embedding is represented as:

Tr(B^TL_GB),s.t.B∈{0,1}^n×l(2)

wherein Tr () represents the operation of finding the trace of the matrix, B is a binary multi-label matrix, B ∈ {0,1}^n×lIs a discrete constraint of a binary multi-label matrix,

is the Laplace matrix of the similarity graph G, the degree matrix D is a diagonal matrix, the ith diagonal element of which is

The feature selection matrix learning process is represented as:

wherein, P ∈ R^d×lIs a matrix of feature choices that is,

is L of the matrix P_2,1And (4) norm.

S22, in order to accurately acquire the internal structure information of the data and further improve the quality of the binary multi-label, the dynamic similarity graph which can better support the feature selection task is learned by the method, the dynamic sample similarity graph G ∈ R is adaptively learned, and the method and the device are provided with the function of adaptively learning^n×nAs a graph model, consider that there are k nearest data points as sample x_iIs assumed to be a sample point x_iCan pass the similarity probability g with all the sample points_ijTo be associated. In general, for the distance between sample points

The smaller the likelihood, the greater the likelihood probability g that should be assigned_ij。

The process of learning the dynamic sample similarity graph is represented as:

as one or more embodiments, the step S3 includes:

because the objective function has a discrete constraint of a multi-label matrix, directly solving the objective function is an NP difficult problem. To solve this problem, the present disclosure proposes a discrete optimization method based on the Augmented Lagrange Multiplier (ALM) to directly solve the multi-label matrix.

The specific iterative steps of the optimization (solving for one variable by fixing other variables) include:

s31: fixing variables P and B in an unsupervised feature selection target function based on multi-label learning, solving a dynamic similarity graph matrix G, and changing the target function into:

equation (5) can be written as follows:

wherein, b_i,b_j∈R^1×lRespectively are ith and j (i is more than or equal to 1 and j is more than or equal to n) row vectors of the multi-label matrix B.

The row vectors of matrix G are independent of each other, so matrix G is optimized row by row. For the row vector g_i：

Where 1 denotes a column vector with all elements 1.

Definition of

Vector d_iThe jth element of (a) is

Equation (7) can be written in vector form as follows:

the lagrange function of equation (8) is:

wherein, theta, η_iA value of > 0 is the Lagrangian multiplier. According to KKT (Karush-Kuhn-Tucker) conditions, g is obtained_iThe following solutions:

s32: fixing variables G and B in an unsupervised feature selection target function based on multi-label learning, solving a feature selection matrix P, and changing the target function into:

equation (11) derives P and makes it equal to 0, resulting in a solution for P:

P＝(XX^T+βΛ)^-1XB (12)

wherein Λ is a diagonal matrix whose elements

Is a very small constant.

S33: fixing variables G and P in an unsupervised feature selection target function based on multi-label learning, solving B, wherein the target function is changed into:

equation (13) can be converted to the following form:

the discrete constraint of the multi-label matrix in the objective function makes it difficult to directly solve the multi-label matrix B (NP-hard problem). The present disclosure proposes to solve B using a discrete optimization method based on the augmented lagrange multiplier method. In particular for B^TL_GItem B, using an auxiliary discrete variable Z ∈ {0,1}^n×lTo replace the second B variable and maintain their equivalence during the optimization process. The following optimization formula is thus obtained:

where variable M is used to measure the difference between B and Z. The last term of equation (15) can be simplified as:

through the above conversion, the objective function of optimization B becomes:

a closed solution for B was obtained:

B＝(sgn(2αX^TP-μL_GZ+λZ-M)+1)/2 (18)

s34: fixing variables G, P and B in an unsupervised feature selection objective function based on multi-label learning, solving a variable Z, and optimizing the objective function as follows:

the closed solution for variable Z is as follows:

Z＝(sgn(-μL_GB+λB+M)+1)/2 (20)

s35, according to the ALM theory, the optimization formula of M and lambda is as follows:

the method aims to obtain a dynamic graph G, a feature selection matrix P and a binary multi-label B by solving an objective function, and the objective function contains a plurality of unknown variables, so that the objective function cannot be directly solved; and the discrete constraint of the multi-label matrix in the objective function makes it difficult to directly solve the multi-label matrix B. Therefore, a discrete optimization method is provided to solve the algorithm, and a local optimal solution is finally obtained through a finite number of iterations by an iteration solution mode (fixing other variables and solving one variable).

As one or more embodiments, the step S4 specifically includes: selecting matrix P by using learned features_i||₂The importance of the features is measured, and then the features with r maximum values (most important) are selected by sorting.

Finally, a typical unsupervised learning task (clustering) is used to evaluate the performance of the feature selection method. Two general clustering evaluation indexes are adopted: and displaying the clustering result by using clustering Accuracy (ACC) and Normalized Mutual Information (NMI). Both indices range between 0 and 1, with larger numbers indicating better performance. The specific definitions of the two evaluation indexes are respectively:

①ACC

Ω＝[w₁,...,w_c]is a cluster, O ═ o₁,...,o_c]Is the category true value. b_iA predicted cluster label, t, representing sample i_iIs a genuine label. And if s_i＝t_iThen(s)_i,t_i) 1, otherwise(s)_i,t_i)＝0。

②NMI

Where C is a set of clusters obtained from the real tags and C' is a set of clusters obtained from the clustering algorithm. H (C) and H (C ') are the entropy of C and C ', respectively, and MI (C, C ') is a mutual information metric, and measures the degree of coincidence of the two data distributions.

The present disclosure proposes to jointly perform multi-label learning and feature selection, learning a binary multi-label for a sample as a supervised information (pseudo label) to support a final feature selection task in order to select important features in an unsupervised environment; in order to improve the quality of multi-label, a sample similarity map is dynamically created by fully exploring the data structure, and then a pseudo label is obtained by a spectrum analysis method. In this invention, multi-label learning, spectral analysis, and feature selection are facilitated in an efficient discrete optimization.

Example 2

As shown in fig. 3, this example also extends the unsupervised feature selection method based on multi-label learning into a multi-view setting, providing an unsupervised multi-view feature selection method based on multi-label learning.

The unsupervised multi-view feature selection method based on multi-label learning comprises the following steps:

s1: acquiring a real data set; the data set includes: a number of samples; extracting a plurality of view features from each sample to obtain a multi-view data set in a feature form;

s2: constructing an unsupervised multi-view feature selection objective function based on multi-label learning for the multi-view feature data set;

s3: solving an unsupervised multi-view feature selection objective function based on multi-label learning to obtain a feature selection matrix;

As one or more embodiments, the step S1 includes:

acquiring a real data set, wherein V view features are extracted from each sample to obtain a multi-view feature data set X;

multi-view feature dataset X ═ X¹,X²...,X^V]∈R^n×dEach sample contains V views, where n is the number of samples, d is the total feature dimension, and V is the number of views;

for the data matrix of the v-th view, d^vIs the characteristic dimension of the v-th view.

As one or more embodiments, the step S2 includes:

for each view feature, a sample similarity graph is created in advance by using a Gaussian kernel function;

further, the similarity graph A of the v view feature^v：

Wherein, the neighbor relation of the sample is measured by a K neighbor algorithm KNN; v ═ 1,2,. V]Is the view index, V is the view number;

is a similarity graph of the vth view feature,

is the ith (i ═ 1, 2.., n, in the v view]N is the total number of samples) samples;

is the jth (j ═ 1, 2.., n in the vth view]) A sample is obtained; exp () represents an exponential function with e as the base,

representing a sample

The Euclidean distance between; σ is a bandwidth parameter of the function that controls the number of neighbors.

Further, the unsupervised multi-view feature selection objective function based on multi-label learning is as follows:

wherein, G ∈ R^n×nIs a dynamic fusion map, g_ijIs an element of the ith (i is more than or equal to 1 and less than or equal to n) row and the jth (j is more than or equal to 1 and less than or equal to n) column of G, | | | | | Y phosphor_FIs the F-norm of the matrix (e.g.: C | | non-woven cells)_FIs a matrix C ∈ R^d×nIs defined as the F norm of

Is a Laplace matrix of the dynamic fusion graph G, and the degree matrix D is a diagonal matrix with the ith diagonal element of

X^TIs the transpose of the matrix X, B is the learned binary multi-label matrix, B ∈ {0,1}^n×lIs the discrete constraint of the binary multi-label matrix, l is the length of the binary multi-label, P ∈ R^d×^lIs the feature selection matrix, μ, α are balance parameters, β are regularization parameters.

Further, the S2 includes:

s21: for a multi-view feature dataset X ═ X¹,X²...,X^V]∈R^n×d，

d is the total feature dimension in the dataset, A^v∈R^n×nThe similarity graph corresponding to the V (V is more than or equal to 1 and less than or equal to V) th view feature, a corresponding similarity graph is created in advance for each view feature, a dynamic fusion graph G is learned based on a plurality of pre-created similarity graphs, the aim is to learn the dynamic fusion graph G, and the dynamic graph is consistent with the sample similarity graphs of the plurality of pre-created views; therefore, the creation of a dynamic fusion map is achieved by minimizing the linear combination of reconstruction errors for each view;

weighting a plurality of pre-created similarity graphs in a self-adaptive manner and carrying out linear fusion to realize the learning of a dynamic fusion graph G; the similarity relationship representation of the multi-view data is defined as:

equation (24) can also be written in the form of an explicit view weight representation:

wherein, γ^v＝1/2||G-A^v||_FIs the weight of the v-th view feature.

S22: the multi-label learning and feature selection of data are combined into a unified model through spectrum embedding.

The spectral embedding is represented as:

Tr(B^TL_GB),s.t.B∈{0,1}^n×l(26)

s23: we learn the feature selection matrix with a regression model, and use L_2,1The norm minimization term constrains the feature selection matrix to be row sparse. The objective function of the process is:

wherein, P ∈ R^d×lIs the feature selection matrix and l is the length of the binary multi-label.

The S3 specifically includes: since a discrete constraint on a multi-label matrix exists in an objective function, directly solving the objective function is an NP difficult problem. In order to solve the problem, a discrete optimization method based on an augmented Lagrange multiplier method (ALM) is provided to directly solve the multi-label matrix.

s31: fixing variables P and B in an unsupervised multi-view feature selection target function based on multi-label learning, solving variables G of a dynamic fusion graph, and changing the target function into:

equation (28) converts to the form:

since equation (29) is independent for different i, we can solve equation (29) separately for each i:

definition of

d_iIs the jth element as d_ijEquation (30) can be written as a vector as follows:

s32: and fixing variables G and B in the unsupervised multi-view feature selection objective function based on multi-label learning, and solving a feature selection array P. The solution is the same as equation (12) in example 1.

S33: and fixing the unsupervised multi-view feature selection target function variable G and P based on multi-label learning, and solving B. The solution is the same as equation (18) in example 1.

As one or more embodiments, the step S4 includes: selecting matrix P by using learned features_i||₂The importance of the features is measured, and then the features with r maximum values (most important) are selected by sorting.

The method carries out multi-view learning extension on an unsupervised feature selection method based on multi-label learning, carries out feature selection on multi-view data, and combines the multi-label learning and the feature selection into a unified model through spectral analysis; adaptively assigning weights to each view feature to create a fused similarity graph, wherein the graph can accurately capture the internal structural relationship among the multi-view features and improve the quality of the multi-label; the learned multi-labels can well guide the feature selection process to select important and discriminative features.

Example 3

The present disclosure provides an unsupervised feature selection system based on multi-label learning, comprising:

and the selection module is used for sequencing the characteristic selection matrix to determine the selected target characteristics.

Example 4

The present disclosure provides a computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the steps of the method for unsupervised feature selection based on multi-tag learning.

Example 5

The present disclosure provides a terminal device comprising a processor and a computer-readable storage medium, the processor configured to implement instructions; the computer readable storage medium is for storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the unsupervised feature selection method based on multi-tag learning.

Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims

1. The unsupervised feature selection method based on multi-label learning is characterized by comprising the following steps of:

extracting features of each acquired data sample to obtain a feature data set, constructing an unsupervised feature selection target function based on multi-label learning according to the feature data set, and learning a binary multi-label matrix and a feature selection matrix;

sequencing the feature selection matrix to determine target features to be selected;

the unsupervised feature selection objective function based on multi-label learning is as follows:

wherein x is_i,x_j∈R^1×dRespectively represent the i, j samples, where 1 ≦ i, j ≦ n, G ∈ R^n×nIs a dynamic similarity graph of learning, g_ijIs an element in row i and column j of G, where i is greater than or equal to 1 and less than or equal to n, and j is greater than or equal to 1 and less than or equal to n; | | non-woven hair_FIs the F norm of the matrix;

X^TIs the transpose of matrix X, B ∈ {0,1}^n×lIs a binary multi-label matrix, l is the length of the binary multi-label, P ∈ R^d×lIs a feature selection matrix, mu, α is a balance parameter, sigma, β is a regularization parameter, n is the number of samples, | | P | | Y_2,1Is L of the matrix P_2,1A norm;

the constructing of the unsupervised feature selection objective function based on multi-label learning comprises:

establishing a similarity graph corresponding to each feature by adopting a Gaussian kernel function for the feature data set, and obtaining a binary multi-label matrix by learning the dynamic similarity graph and adopting spectral analysis;

constructing a learned binary multi-label matrix and a feature selection matrix into an unsupervised feature selection target function based on multi-label learning through spectrum embedding;

learning the feature selection matrix as:

the solving process for solving the unsupervised feature selection target function based on the multi-label learning by adopting the discrete optimization method based on the augmented Lagrange multiplier method specifically comprises the following steps of:

2. The unsupervised feature selection method based on multi-label learning of claim 1, wherein the learning process of the dynamic similarity graph is expressed as:

wherein G is a dynamic similarity map, G_ijIs the element of the ith row and the jth column of G, i is more than or equal to 1 and less than or equal to n, j is more than or equal to 1 and less than or equal to n, sigma is a regularization parameter, and n is the number of samples.

3. The unsupervised feature selection method based on multi-label learning of claim 1, wherein in the iterative solution:

4. An unsupervised feature selection system based on multi-label learning, comprising:

the target function construction module is used for extracting the characteristics of each acquired data sample to obtain a characteristic data set, constructing an unsupervised characteristic selection target function based on multi-label learning according to the characteristic data set, and learning a binary multi-label matrix and a characteristic selection matrix;

the selection module is used for sequencing the feature selection matrix to determine target features to be selected;

X^TIs the transpose of matrix X, B ∈ {0,1}^n×lIs a binary multi-label matrix, l is the length of the binary multi-label;P∈R^d×lis a feature selection matrix, mu, α is a balance parameter, sigma, β is a regularization parameter, n is the number of samples, | | P | | Y_2,1Is L of the matrix P_2,1A norm;

learning the feature selection matrix as:

5. A computer-readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the method according to any one of claims 1-3.

6. A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; a computer readable storage medium for storing a plurality of instructions adapted to be loaded by a processor and for performing the method according to any of claims 1-3.