CN117409456A - Non-aligned multi-view multi-mark learning method based on graph matching mechanism - Google Patents

Non-aligned multi-view multi-mark learning method based on graph matching mechanism

Publication number
CN117409456A
Authority
CN
China
Prior art keywords
view
matrix
data
mark
aligned
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311195295.8A
Other languages
Chinese (zh)
Inventor
杨震
钟淇宇
吕庚育
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202311195295.8A priority Critical patent/CN117409456A/en
Publication of CN117409456A publication Critical patent/CN117409456A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a non-aligned multi-view multi-label learning method based on a graph matching mechanism. A feature matrix and an observable label matrix are constructed from the training data in the sample data set. The unaligned data are explicitly aligned through permutation matrices, which gives a point-to-point, first-order alignment between samples. The distance matrices of the samples in different views are then used to perform a second-order alignment of the views' graph structures, which further improves the alignment accuracy of the model. The 'commonality-personality' representation of the aligned views is mined, and a non-aligned multi-view multi-label learning model based on the graph matching mechanism is constructed by exploiting the consistency and complementarity across views. The model is trained by alternating optimization until it converges, yielding the classification predictor.

Description

Non-aligned multi-view multi-mark learning method based on graph matching mechanism
Technical Field
The invention relates to non-negative matrix factorization, graph matching technology, non-aligned multi-view learning and multi-label classification, in particular to a non-aligned multi-view multi-label learning method based on a graph matching mechanism.
Background
The large-scale development of the Internet and the spread of big data and artificial intelligence applications have produced massive amounts of multi-view multi-label data. Multi-view multi-label learning has received extensive attention as a primary framework for handling such data. In a multi-view multi-label learning task, each sample object is represented by several heterogeneous views and is annotated with several associated labels at the same time. Existing multi-view multi-label learning methods generally explore the correlated and complementary information across views, and this exploration typically assumes view alignment, that is, the instances at the same position in different views describe the same object. However, spatial, temporal or spatio-temporal asynchrony can turn this relationship into partial view alignment or complete view non-alignment. For example, in video recommendation the label data come from different video applications, but because of user privacy protection the data cannot be matched and aligned to the same user. In face recognition, a failure of facial feature detection can leave multi-view faces unaligned, which can make facial expression recognition impossible. Existing multi-view multi-label learning models cannot learn a robust multi-label classifier directly from such non-aligned data.
Therefore, the invention provides a non-aligned multi-view multi-label learning method based on a graph matching mechanism (MCGM for short) to solve both the view non-alignment problem and the problem of comprehensive semantic expression in multi-view multi-label learning. For the non-alignment of multi-view data, the method mines the cross-view 'instance-instance' and 'instance relation-instance relation' graph matching relationships to accurately align the feature nodes of the same instance under different views, and the aligned feature nodes are then used for the subsequent classification task. Existing multi-view multi-label learning algorithms based on a shared subspace representation have difficulty describing all the semantic information of multi-view data. To address this, a multi-view multi-label classification model based on a 'common-single' semantic representation is designed; it emphasizes the contribution of each single view to specific semantics and improves the semantic expression of minority-class samples.
Disclosure of Invention
The technical problem addressed by the invention is as follows: a non-aligned multi-view multi-label learning method based on a graph matching mechanism is provided that classifies non-aligned multi-view multi-label data, with comprehensive semantic expression ensuring the efficiency and accuracy of the method.
The technical scheme of the invention is as follows. In the non-aligned multi-view multi-label learning method based on a graph matching mechanism, non-aligned multi-view multi-label data are first acquired, stored, preprocessed and divided to form a sample data set. A feature matrix and an observable label matrix are constructed from the training data in the sample data set. Using the feature matrix and the observable label matrix, the unaligned data are aligned in two ways: 1) the unaligned data are explicitly aligned through permutation matrices, which gives a point-to-point, first-order alignment between samples; 2) the distance matrices of the samples in different views are used to perform a second-order alignment of the views' graph structures, which further improves the alignment accuracy of the model. The 'commonality-personality' representation of the aligned views is mined, and a non-aligned multi-view multi-label learning model based on the graph matching mechanism is constructed by exploiting the consistency and complementarity across views. The model is trained by alternating optimization until it converges, yielding the classification predictor. The test set is then predicted with the converged classifier, and the label classification result is obtained from the output probabilities. The specific steps are as follows:
In the present invention, matrices are denoted by bold uppercase letters, such as $X$, and vectors by bold lowercase letters, such as $x$. $(XR)$ denotes the matrix obtained by $X \cdot R$, where $\cdot$ is matrix multiplication. The inverse and transpose of a matrix $X$ are denoted $X^{-1}$ and $X^{T}$, respectively. $X_v$ denotes the feature matrix of the $v$-th view; its $i$-th column and $j$-th row are denoted $(X_v)_{:,i}$ and $(X_v)_{j,:}$, respectively. $(X_v)_{i,j}$ is the $(i,j)$ element of $X_v$, and $x_i$ is the $i$-th element of the vector $x$. In addition, $\mathbb{R}$ denotes the real number field, and the Frobenius norm is written $\|X\|_F = \sqrt{\sum_{i,j} X_{i,j}^{2}}$.
Step S1: non-aligned multi-view multi-label data are acquired, stored, preprocessed and divided into data sets. Because this is a new problem, no non-aligned dataset has been published, so a synthetic dataset is used. Specifically, starting from six public multi-view multi-label datasets, the instances are randomly shuffled so that the instances at the same position in different views describe different objects. This yields a non-aligned multi-view multi-label dataset.
A dataset with $V$ views, $\{X_v\}_{v=1}^{V}$, is constructed from the training data, where $X_v \in \mathbb{R}^{n \times d_v}$ is the complete feature space of the $v$-th view, $n$ is the number of training samples $x_i$, and $d_v$ is the dimension of each sample in view $v$. $Y \in \{0,1\}^{n \times q}$ is the label space corresponding to the feature space, where $y_i$ is the label vector of $x_i$ and $q$ is the number of labels.
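The synthetic construction in step S1 can be sketched as follows. This is a minimal illustration, assuming each view is stored as an `n x d_v` NumPy array and that every view is shuffled independently; the function name `make_non_aligned` and the toy data are hypothetical and not taken from the patent.

```python
import numpy as np

def make_non_aligned(views, seed=0):
    """Independently shuffle the rows (instances) of every view.

    Returns the shuffled views and the permutation used for each view,
    so that the ground-truth alignment can still be checked later.
    """
    rng = np.random.default_rng(seed)
    shuffled, perms = [], []
    for X in views:
        perm = rng.permutation(X.shape[0])  # random order of the n instances
        shuffled.append(X[perm])            # row i now holds instance perm[i]
        perms.append(perm)
    return shuffled, perms

# Toy data (assumed): 5 instances, two views with 3 and 2 features.
views = [np.arange(15.0).reshape(5, 3), np.arange(10.0).reshape(5, 2)]
shuffled, perms = make_non_aligned(views)
```

Keeping the permutations is a design choice for evaluation only: a learning method would see `shuffled` but not `perms`.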
Step S2: from the training data in the sample data set obtained in step S1, the feature matrices $X_v$ and the observable label matrix $Y$ are constructed; first-order and second-order relation matching across views is built to align the features of the multiple views; and a non-aligned multi-view multi-label classification model based on the 'common-single' semantic representation is constructed. The specific steps are as follows:
(a) The unaligned data are explicitly aligned using permutation matrices, which gives a point-to-point, first-order alignment between instances. After first-order alignment, a relatively correct mapping between the instances is obtained, and non-negative matrix factorization is used to extract a common low-dimensional representation of the data in the different views. Based on the obtained common representation and the observable label matrix, a feature mapping matrix $W_0$ is introduced to build a linear mapping from the shared subspace to the label space, giving the initial learning model:
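$$\text{s.t.}\ (M_v)_{i,j} \in \{0,1\},\quad M_v \mathbf{1} = \mathbf{1},\quad \mathbf{1}^{T} M_v = \mathbf{1}^{T},\quad P \ge 0,\quad H_v \ge 0,\quad W_0 \ge 0 \tag{1}$$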
where $M_v \in \{0,1\}^{n \times n}$ is the permutation matrix of the $v$-th view; multiplying the feature matrix by the permutation matrix gives the aligned multi-view data. $P$ and $H_v$ are obtained from the aligned data by non-negative matrix factorization. $H_v \in \mathbb{R}^{k \times d_v}$ is the individual mapping matrix of the $v$-th view; $P \in \mathbb{R}^{n \times k}$ is the shared subspace, where $k$ is the desired reduced dimension; $P \ge 0$ and $H_v \ge 0$ are the non-negativity constraints on these matrices. $W_0 \in \mathbb{R}^{k \times q}$ is the coefficient matrix corresponding to $P$, so it is also constrained by $W_0 \ge 0$. $W_0$ learns the commonality information of the heterogeneous views through the mapping from the shared subspace $P$ to the label space $Y$. $\alpha$ and $\gamma$ are two hyperparameters. The last term is a regularization constraint on $W_0$ that avoids overfitting and reduces the influence of noisy features.
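The constraints on $M_v$ in formula (1) say that $M_v$ is binary and that each of its rows and columns sums to one, which is the definition of a permutation matrix. A minimal sketch of building such a matrix and applying it to a feature matrix follows; the helper names `permutation_matrix` and `is_permutation_matrix` are hypothetical, and the row-reordering convention shown is an assumption consistent with $M_v X_v$ acting on the samples (rows) of $X_v$.

```python
import numpy as np

def permutation_matrix(perm):
    """Build M with M[i, perm[i]] = 1, so that (M @ X)[i] == X[perm[i]]."""
    n = len(perm)
    M = np.zeros((n, n))
    M[np.arange(n), perm] = 1.0
    return M

def is_permutation_matrix(M):
    """Check the constraints on M_v: binary entries, unit row and column sums."""
    binary = np.all((M == 0) | (M == 1))
    rows = np.all(M.sum(axis=1) == 1)   # M 1 = 1
    cols = np.all(M.sum(axis=0) == 1)   # 1^T M = 1^T
    return bool(binary and rows and cols)

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 samples, 2 features
perm = np.array([2, 0, 1])
M = permutation_matrix(perm)
aligned = M @ X   # rows of X reordered: sample 2, then sample 0, then sample 1
```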
(b) Because the individual mapping matrix $H_v$ in fact encodes the characteristics of a single view, another set of coefficient matrices $W_v \in \mathbb{R}^{d_v \times q}$ is defined to capture the unique characteristics of each view, where $W_v$ maps the reconstructed feature matrix $PH_v$, which carries the individual information of the $v$-th view, to the label space. Using both the individual and the commonality information of the heterogeneous views, formula (1) is further expanded as follows:
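$$\text{s.t.}\ (M_v)_{i,j} \in \{0,1\},\quad M_v \mathbf{1} = \mathbf{1},\quad \mathbf{1}^{T} M_v = \mathbf{1}^{T},\quad P \ge 0,\quad H_v \ge 0,\quad W_0 \ge 0,\quad W_v \ge 0 \tag{2}$$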
where $P$ can be regarded as a dictionary matrix containing the information of all views, and $H_v$ holds the coding coefficients specific to each view. Thus $PH_v$ captures the individual information of a particular view, while $P$ captures the information shared by all views. $PH_v$ and $P$ can therefore be regarded as the 'commonality-personality' representation of the multi-view data.
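The matrix shapes implied by this description can be checked with a short sketch. It assumes $P$ is `n x k`, $H_v$ is `k x d_v`, $W_0$ is `k x q` and $W_v$ is `d_v x q`, which follows from the mappings described above; the random values and toy sizes are placeholders, and how the common and individual scores are combined into a final prediction is not shown because the source equation is not available here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, q = 6, 3, 4   # samples, shared dimension, labels (toy sizes, assumed)
d = [5, 7]          # feature dimension of each view (assumed)

P = rng.random((n, k))                  # shared (common) representation
W0 = rng.random((k, q))                 # maps the shared subspace to labels
H = [rng.random((k, dv)) for dv in d]   # view-specific coding coefficients
W = [rng.random((dv, q)) for dv in d]   # maps P @ H_v to labels

common_scores = P @ W0                                       # n x q
individual_scores = [P @ Hv @ Wv for Hv, Wv in zip(H, W)]    # each n x q
```

Both kinds of score live in the same `n x q` label space, which is what allows a model to use them jointly.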
(c) The alignment of the second-order graph structure between views and the label correlation are further considered. $M_v X_v$ and $M_j X_j$ are the correctly aligned versions of view $X_v$ and view $X_j$, respectively. A cross-view mapping matrix is used to represent the degree of matching between the samples of view $X_v$ and view $X_j$. Based on the common knowledge that the graph structures formed after aligning the multi-view data should be as consistent as possible, a structure matching loss term is constructed to explore the correct cross-view mapping. For each view, a distance matrix $S_v$ between its samples is built to represent the graph structure of view $v$. The cross-view mapping matrix permutes the distance matrices $S_v$ and $S_j$ so that the graph connection structures of the two views become as similar as possible. $A$ denotes the label correlation matrix; through $A$, the correlations in the multi-label data are used to extract the information of known related labels. The hyperparameter $\beta$ is introduced to weight the second-order alignment. The final non-aligned multi-view multi-label learning model based on the graph matching mechanism is expressed as follows:
$$\text{s.t.}\ (M_v)_{i,j} \in \{0,1\},\quad M_v \mathbf{1} = \mathbf{1},\quad \mathbf{1}^{T} M_v = \mathbf{1}^{T},\quad P \ge 0,\quad H_v \ge 0,\quad W_0 \ge 0,\quad W_v \ge 0 \tag{3}$$
where
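The idea behind second-order alignment can be illustrated with a small sketch. Shuffling the samples of a view by a permutation matrix $M$ turns its distance matrix $S$ into $M S M^{T}$, so comparing permuted distance matrices reveals whether two views are aligned. The Euclidean distance and the squared Frobenius form in `structure_mismatch` are assumptions chosen for illustration; the patent's exact structure matching term is not reproduced here.

```python
import numpy as np

def distance_matrix(X):
    """Pairwise Euclidean distances between the rows of X (n x n)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def structure_mismatch(S_a, S_b, M_a, M_b):
    """Hypothetical loss: squared Frobenius gap between two permuted graphs."""
    return np.linalg.norm(M_a @ S_a @ M_a.T - M_b @ S_b @ M_b.T, "fro") ** 2

rng = np.random.default_rng(0)
X = rng.random((5, 3))
perm = np.array([3, 0, 4, 1, 2])
M = np.eye(5)[perm]                     # (M @ X)[i] == X[perm[i]]

S = distance_matrix(X)                  # graph of the original order
S_shuffled = distance_matrix(X[perm])   # graph of the shuffled copy

I = np.eye(5)
loss_unaligned = structure_mismatch(S, S_shuffled, I, I)
loss_aligned = structure_mismatch(S, S_shuffled, I, M.T)  # M.T undoes the shuffle
```

In this toy case the second view is a shuffled copy of the first, so the mismatch vanishes once the shuffle is undone; with genuinely different views the graphs would only be approximately consistent.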
Step S3: the expression of the non-aligned multi-view multi-label learning model based on the graph matching mechanism from step S2 is simplified, and the model is minimized by alternating optimization until it converges, yielding the classification predictor. The specific steps are as follows:
(a) Because the constraint on $M_v$ is non-convex, an optimal solution is difficult to obtain. Since an orthogonal transformation does not change the relationships between vectors, a less strict constraint can still preserve the structure of the data. The constraint on $M_v$ is therefore relaxed to an orthogonality constraint.
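The claim that an orthogonal transformation preserves the relationships between vectors can be verified in one line. The derivation below assumes the relaxed constraint has the form $M^{T} M = I$, which every permutation matrix satisfies; the exact relaxed constraint used in the patent is not reproduced here.

```latex
\|M x - M y\|_2^2
  = (x - y)^{T} M^{T} M \,(x - y)
  = (x - y)^{T} (x - y)
  = \|x - y\|_2^2
  \qquad \text{whenever } M^{T} M = I .
```

Pairwise distances are therefore unchanged, so the relaxed feasible set still contains every permutation matrix while being easier to optimize over.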
(b) The objective function is transformed into an unconstrained problem by introducing the Lagrange multipliers $\lambda$, $\Phi$, $\Theta$, $\Omega$ and $\Psi$. Taking view $v$ as an example, the objective function (3) takes the following form for view $v$:
(c) Iteratively optimize $M_v$. Fix $P$, $H_v$, $A$, $W_0$ and $W_v$. The computation of $M_v$ is independent of $M_{v'}$ for $v' \neq v$, so $M_v$ is optimized separately for each view $v$.
The standard way to solve the coupled system formed by equation (3) and the constraint is a nonlinear method such as Newton's method. Such a system of nonlinear equations is often difficult to solve, so an approximate solution is sought instead. This gives the following iterative update rule for $M_v$:
wherein:
(d) Iteratively optimize $P$. Fix $M_v$, $H_v$, $A$, $W_0$ and $W_v$, and take the derivative of the objective in formula (3) with respect to $P$, which gives:
Using the KKT condition $\Phi_{i,j} P_{i,j} = 0$, the following iterative update rule for $P$ can be derived:
(e) Iteratively optimize $A$. Fix $M_v$, $P$, $H_v$, $W_0$ and $W_v$, and take the derivative of the objective in formula (3) with respect to $A$, which gives:
Setting the derivative to 0 gives the following update rule for $A$:
(f) Iteratively optimize $H_v$. Fix $P$, $M_v$, $A$, $W_0$ and $W_v$. The computation of $H_v$ is independent of $H_{v'}$ for $v' \neq v$, so $H_v$ is optimized separately for each view $v$. Taking the derivative of the objective in formula (3) with respect to $H_v$ gives:
Using the KKT condition $\Theta_{i,j} (H_v)_{i,j} = 0$, the following iterative update rule for $H_v$ can be obtained:
(g) Iteratively optimize $W_0$. Fix $P$, $M_v$, $H_v$, $A$ and $W_v$, and take the derivative of the objective in formula (3) with respect to $W_0$, which gives:
Using the KKT condition $\Omega_{i,j} (W_0)_{i,j} = 0$, the following iterative update rule for $W_0$ can be obtained:
(h) Iteratively optimize $W_v$. Fix $P$, $M_v$, $H_v$, $A$ and $W_0$, and take the derivative of the objective in formula (3) with respect to $W_v$, which gives:
Using the KKT condition $\Psi_{i,j} (W_v)_{i,j} = 0$, the following update rule for $W_v$ can be obtained:
(i) Repeat (c) to (h), alternately updating the parameters $M_v$, $H_v$, $A$, $W_0$, $W_v$ and $P$ until the iteration stop condition is met. Once the objective function has converged, the optimal parameters of the non-aligned multi-view multi-label learning model based on the graph matching mechanism are output, giving the classification predictor.
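The KKT-based updates in steps (d), (f), (g) and (h) follow the pattern of multiplicative update rules for non-negative factors. As a stand-in illustration only, the sketch below applies the classical multiplicative updates for plain non-negative matrix factorization, minimizing $\|X - PH\|_F^2$ with $P \ge 0$ and $H \ge 0$. It is not the patent's update rule, which also involves the permutation, label and structure terms; the function name, the iteration count and the small constant `eps` are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10, seed=0):
    """Plain NMF via multiplicative updates (illustrative stand-in).

    Setting the gradient 2 (P H H^T - X H^T) against the KKT condition
    Phi_ij * P_ij = 0 gives P <- P * (X H^T) / (P H H^T); H is analogous.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    P = rng.random((n, k))
    H = rng.random((k, d))
    losses = []
    for _ in range(n_iter):
        P *= (X @ H.T) / (P @ H @ H.T + eps)   # element-wise, keeps P >= 0
        H *= (P.T @ X) / (P.T @ P @ H + eps)   # element-wise, keeps H >= 0
        losses.append(np.linalg.norm(X - P @ H, "fro") ** 2)
    return P, H, losses

X = np.random.default_rng(1).random((8, 6))   # non-negative toy data
P, H, losses = nmf_multiplicative(X, k=3)
```

Because every update multiplies by a non-negative ratio, the non-negativity constraints hold automatically without any projection step.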
Step S4: the test set is predicted with the converged non-aligned multi-view multi-label learning model based on the graph matching mechanism, and the label prediction result is obtained from the output probabilities. Specifically, the label prediction matrix is given by the model's prediction equation.
Compared with the prior art, the invention has the advantages that:
1. For the view non-alignment problem in multi-view multi-label learning tasks, first-order and second-order alignment are proposed. The permutation matrix adaptively reorders the instance features in each view, performing a first-order alignment that recovers the correct mapping between instances, so that the view non-alignment problem reduces to a view-aligned problem. In addition, the distance matrices of the samples in different views are used to align the views structurally (second-order alignment), which improves alignment efficiency and accuracy.
2. For the problem of comprehensive semantic expression in multi-view multi-label learning, the method jointly exploits the consistency and diversity information of multi-view multi-label data. The model learns a shared subspace from the different views, the label correlations, and an integrated classifier based on the individual and shared feature spaces. The aligned data are fed into the multi-view multi-label classification model based on the 'common-single' semantic representation, which emphasizes the contribution of each single view to specific semantics and improves the semantic expression of minority-class samples.
3. The hidden correlations among labels are learned by introducing a dynamic label correlation matrix $A$. Although a fixed label correlation matrix can be estimated from the known label matrix, using only the samples with known labels may not be sufficient. The proposed dynamic label correlation matrix adaptively measures the correlation between labels, which helps improve the learning performance of the multi-label classification model.
4. The model is reduced to a general form, and an iterative optimization method for solving the objective function is provided, which seeks an approximate solution of the model within a given time complexity. The validity of the model is then verified on six real-world datasets.
Drawings
FIG. 1 is a process flow diagram of the method of the present invention.
Fig. 2 is a training workflow diagram of the method of the present invention.
Detailed Description
In order to make the solution of the embodiment of the present invention better understood by those skilled in the art, the embodiment of the present invention is further described in detail below with reference to the accompanying drawings and embodiments.
As shown in fig. 1, the present invention includes the steps of:
1. Non-aligned multi-view multi-label data are acquired, stored, preprocessed and divided into data sets, and the feature matrices $X_v$ and the observable label matrix $Y$ are constructed. Because this is a new problem, no non-aligned dataset has been published, so a synthetic dataset is used. Specifically, starting from six public multi-view multi-label datasets, the instances are randomly shuffled so that the instances at the same position in different views describe different objects. This yields a non-aligned multi-view multi-label dataset. A dataset with $V$ views is constructed from the training data, and $Y$ is the label space corresponding to the feature set.
2. From the feature matrices $X_v$ and the observable label matrix $Y$ built from the training data in the sample data set obtained in step 1, first-order and second-order relation matching across views is constructed to align the features of the multiple views, and a non-aligned multi-view multi-label classification model based on the 'common-single' semantic representation is constructed. The model is minimized by alternating optimization until it converges, yielding the classification predictor. The objective function is as follows:
the method comprises the following specific steps:
(a) Inputs: the feature matrices $X_v$; the observable label matrix $Y$; the dimension of the shared subspace; the hyperparameters in equation (6); the convergence threshold; and the number of iterations. The last four inputs may be tuned per dataset to achieve better results.
(b) Randomly initialize $M_v$, $H_v$, $W_v$, $P$, $W_0$ and $A$. Construct the adjacency matrix $S_v$ of each view from its feature matrix.
(c) Alternately and iteratively optimize $M_v$, $P$, $A$, $H_v$, $W_0$ and $W_v$ according to formulas (5), (7), (9), (11), (13) and (15) until the iteration stop condition is met. The stop condition may be that the difference between the objective function values of two successive iterations is smaller than the convergence threshold, or that the maximum number of iterations is reached. Finally, the optimal solution of the objective function is output, giving the classifier of the non-aligned multi-view multi-label learning model based on the graph matching mechanism.
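The control flow of step (c), with its two stop conditions, can be sketched generically. The helper `alternate_until_converged` and the toy two-variable objective below are hypothetical; the patent's actual update formulas (5) to (15) would take the place of `update_a` and `update_b`.

```python
def alternate_until_converged(update_steps, objective, state, tol=1e-6, max_iter=100):
    """Apply each block update in turn until the objective stops changing.

    Stops when two successive objective values differ by less than `tol`,
    or when `max_iter` sweeps have been performed.
    """
    previous = objective(state)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        for step in update_steps:
            state = step(state)          # update one block, others held fixed
        current = objective(state)
        if abs(previous - current) < tol:
            break
        previous = current
    return state, iterations

# Toy objective (assumed): (a - 1)^2 + (b - 2)^2 + (a - b)^2.
def objective(s):
    return (s["a"] - 1) ** 2 + (s["b"] - 2) ** 2 + (s["a"] - s["b"]) ** 2

def update_a(s):
    return {**s, "a": (1 + s["b"]) / 2}   # exact minimizer over a with b fixed

def update_b(s):
    return {**s, "b": (2 + s["a"]) / 2}   # exact minimizer over b with a fixed

state, iterations = alternate_until_converged(
    [update_a, update_b], objective, {"a": 0.0, "b": 0.0}, tol=1e-10
)
```

For this toy problem the alternating updates approach the joint minimizer `a = 4/3`, `b = 5/3`, and the loop ends on the convergence threshold well before the iteration cap.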
3. The converged non-aligned multi-view multi-label learning model based on the graph matching mechanism predicts the test set, and the label prediction result is obtained from the output probabilities. Specifically, the label prediction matrix is given by the model's prediction equation. To obtain definite label information, a threshold is set: elements of the prediction matrix above the threshold are set to 1, meaning that the label belongs to the sample, and elements below the threshold are set to 0, meaning that it does not. The threshold can typically be set to 0.5, but the best threshold often differs between datasets.
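The thresholding step can be sketched as follows. The function name `binarize_scores` and the example scores are hypothetical; treating a score exactly equal to the threshold as 0 is an assumption, since the text only specifies the behaviour above and below the threshold.

```python
import numpy as np

def binarize_scores(scores, threshold=0.5):
    """Turn predicted label scores into 0/1 labels by thresholding."""
    return (scores > threshold).astype(int)

# Assumed scores for 2 samples and 3 labels.
scores = np.array([[0.91, 0.20, 0.55],
                   [0.10, 0.49, 0.73]])
labels = binarize_scores(scores)
# labels is [[1, 0, 1], [0, 0, 1]]
```

A per-dataset threshold can be passed through the `threshold` argument instead of the default 0.5.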
Extensive experiments were conducted on six real-world datasets. All six multi-view datasets used in the experiments are public. Their statistics are summarized in Table 1. For each dataset, Table 1 gives the number of samples ($n$), the number of views ($m$), the number of distinct labels ($c$), the average number of labels per sample (#avg) and the minimum view dimension ($d_{min}$).
Table 1 statistics for six multiview datasets
Emotions is a music dataset in which the two views of each instance correspond to the rhythmic and the timbral features of a piece of music. Yeast is a biological dataset in which the two views of each instance correspond to the genetic expression and the phylogenetic profile of a gene. Corel5k, Pascal07, ESPGame and Mirflickr are four widely used multi-view image datasets. Several features are collected from these images, and each image is represented by six representative feature views, each suited to a different application background: HUE, SIFT, GIST, HSV, RGB and LAB. Six non-aligned multi-view multi-label datasets are obtained by randomly shuffling the instances so that the instances at the same position in different views describe different objects. To verify the effectiveness of the proposed method MCGM, it was compared with the following six multi-label methods. The two single-view multi-label methods use a concatenation strategy: the multi-view dataset is converted into a single-view dataset before the experiments are run. The other methods are multi-view multi-label learning methods. The single-view multi-label learning methods are MLkNN and LLSF, published in Pattern Recognition (PR, 2007) and IEEE TKDE (2016), respectively. The multi-view multi-label learning methods are FIMAN, ICM2L, iMvWL and BEMVL, published in SIGKDD (2020), TCYB (2019), IJCAI (2018) and TKDD (2022), respectively, all leading venues in data mining and artificial intelligence. Five evaluation metrics widely used in multi-label learning measure the performance of each algorithm: Average Precision, Coverage, Hamming Loss, One Error and Ranking Loss.
The mean and standard deviation of each metric on each dataset are shown in Tables 2 to 7. Note that the tables report 1 - Ranking Loss rather than Ranking Loss.
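Two of the five metrics can be sketched directly from their standard definitions. Hamming Loss is the fraction of label entries predicted incorrectly, and One Error is the fraction of samples whose top-ranked label is not a true label. The function names and the toy arrays below are hypothetical, and this is not the evaluation code used in the experiments.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of (sample, label) entries where prediction and truth differ."""
    return float(np.mean(Y_true != Y_pred))

def one_error(Y_true, scores):
    """Fraction of samples whose highest-scored label is not a true label."""
    top = np.argmax(scores, axis=1)
    hits = Y_true[np.arange(Y_true.shape[0]), top]
    return float(np.mean(hits == 0))

# Assumed toy data: 2 samples, 3 labels.
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
scores = np.array([[0.8, 0.3, 0.6],
                   [0.7, 0.4, 0.1]])
Y_pred = (scores > 0.5).astype(int)   # [[1, 0, 1], [1, 0, 0]]

hl = hamming_loss(Y_true, Y_pred)     # 2 wrong entries out of 6
oe = one_error(Y_true, scores)        # second sample's top label is wrong
```

For both metrics lower is better, which is the opposite of Average Precision and of the 1 - Ranking Loss values reported in the tables.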
Table 2 Emotions experimental results (mean ± standard deviation)
Table 3 Yeast experimental results (mean ± standard deviation)
Table 4 Corel5k experimental results (mean ± standard deviation)
Table 5 Pascal07 experimental results (mean ± standard deviation)
Table 6 ESPGame experimental results (mean ± standard deviation)
Table 7 Mirflickr experimental results (mean ± standard deviation)
From the results reported in Tables 2 to 7, it can be observed that MCGM outperforms the comparison methods in most cases, on both large and small datasets. Across the 30 experimental settings (6 datasets and 5 evaluation metrics), the method of the invention ranked first in 57% and second in 40% of the results. No comparison method is significantly better than the method of the invention on any metric.
The comparison of MCGM with LLSF and MLkNN shows that traditional multi-label methods adapted to multi-view multi-label learning by concatenating the views perform poorly, mainly because they ignore the mining of multi-view consistency and complementary information. In other words, they ignore the physical meaning of the individual views in the dataset.
The comparison of MCGM with FIMAN, ICM2L, BEMVL and iMvWL shows that the method of the invention performs well on the non-aligned view problem. The other algorithms degrade when views are misaligned because they do not take view misalignment into account. In addition, iMvWL ignores view diversity, which limits its ability to extract view information.
It should be noted that the method according to the embodiment of the present invention is applicable to the problem of non-aligned multi-view multi-label classification.
The embodiments of the invention have been described in detail above with reference to the drawings; the description is intended only to aid understanding of the invention. Those skilled in the art may vary the specific embodiments and the scope of application in accordance with the ideas of the invention, so this description should not be construed as limiting the invention.

Claims (2)

1. The non-aligned multi-view multi-mark learning method based on the graph matching mechanism is characterized in that firstly, non-aligned multi-view multi-mark data are acquired, and the non-aligned multi-view multi-mark data are stored, preprocessed and data set divided to form a sample data set; constructing a feature matrix and an observable mark matrix based on training data in the sample data set; according to the feature matrix and the observable mark matrix, unaligned data are aligned: 1) Explicit alignment is carried out on unaligned data through a permutation matrix, namely point-to-point first-order alignment among samples; 2) Performing second-order alignment on the view on the graph structure by using distance matrixes of samples in different views, so that the alignment accuracy of the model is improved; mining the expression of commonality-personality among aligned data views, and constructing a non-aligned multi-view multi-mark learning model based on a graph matching mechanism by utilizing consistency and complementarity of cross views; training the model by an alternative optimization method until the model converges to obtain a classification predictor; and predicting the test set based on the converged classifier, and obtaining a mark classification result by using the output probability.
2. The non-aligned multi-view multi-label learning method based on a graph matching mechanism according to claim 1, wherein the method comprises the steps of,
step S1, acquiring non-aligned multi-view multi-mark data, and storing, preprocessing and dividing a data set for the non-aligned multi-view multi-mark data; adopting artificial synthetic data set; obtaining a non-aligned multi-view multi-marker dataset;
constructing a data set with $V$ views, $\{X_v\}_{v=1}^{V}$, from the training data, wherein $X_v \in \mathbb{R}^{n \times d_v}$ is the complete feature space of the $v$-th view, $n$ is the number of training samples $x_i$, and $d_v$ is the dimension of each sample in view $v$; $Y \in \{0,1\}^{n \times q}$ is the label space corresponding to the feature space, wherein $y_i$ is the label vector of $x_i$ and $q$ is the number of labels;
step S2, for the training data in the sample data set acquired in step S1, constructing the feature matrices $X_v$ and the observable label matrix $Y$, constructing first-order and second-order relation matching across views to align the features of the multiple views, and constructing a non-aligned multi-view multi-label classification model based on "commonality-personality" semantic representation; the specific steps are as follows:
(a) the unaligned data are explicitly aligned using permutation matrices, i.e., point-to-point first-order alignment between instances; once first-order alignment yields a relatively correct mapping between instances, non-negative matrix factorization is used to extract a common low-dimensional representation of the data from the different views; based on this common representation and the observable label matrix, a feature mapping matrix $W_0$ is introduced to construct a linear mapping from the shared subspace to the label space, giving the initial learning model:
s.t. $(M_v)_{i,j} \in \{0,1\}$, $M_v \mathbf{1} = \mathbf{1}$, $\mathbf{1}^T M_v = \mathbf{1}^T$, $P \ge 0$, $H_v \ge 0$, $W_0 \ge 0$ (1)
wherein $M_v \in \{0,1\}^{n \times n}$ is the permutation matrix of the $v$-th view, and multiplying the feature matrix by the permutation matrix yields the aligned multi-view data; $P$ and $H_v$ are obtained by non-negative matrix factorization of the aligned data; $H_v \in \mathbb{R}^{k \times d_v}$ is the individual mapping matrix of the $v$-th view; $P \in \mathbb{R}^{n \times k}$ is the shared subspace, where $k$ is the target dimension after dimensionality reduction; $P \ge 0$ and $H_v \ge 0$ are the non-negativity constraints on these matrices; $W_0 \in \mathbb{R}^{k \times q}$ is the coefficient matrix corresponding to $P$ and is therefore also constrained by $W_0 \ge 0$; $W_0$ learns the commonality information of the heterogeneous views through the mapping from the shared subspace $P$ to the label space $Y$; $\alpha$ and $\gamma$ are two hyper-parameters;
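The first-order alignment in step (a) can be made concrete with a toy permutation matrix. The sketch below assumes that the correct correspondence is already known, which is exactly what the method has to learn in practice; the helper `permutation_matrix` and the variable names are hypothetical.

```python
import numpy as np

def permutation_matrix(perm):
    """Return M with M[i, perm[i]] = 1, so that (M @ X)[i] == X[perm[i]]."""
    n = len(perm)
    M = np.zeros((n, n), dtype=int)
    M[np.arange(n), perm] = 1
    return M

rng = np.random.default_rng(0)
X_ref = rng.random((5, 3))             # view in the reference sample order
shuffle = rng.permutation(5)
X_v = X_ref[shuffle]                   # unaligned copy: row i holds sample shuffle[i]

# The aligning permutation is the inverse of the shuffle.
M_v = permutation_matrix(np.argsort(shuffle))

# Constraints of formula (1): binary entries, unit row sums and column sums.
assert set(np.unique(M_v).tolist()) <= {0, 1}
assert (M_v.sum(axis=1) == 1).all() and (M_v.sum(axis=0) == 1).all()

X_aligned = M_v @ X_v                  # point-to-point (first-order) alignment
```

Here `M_v @ X_v` restores the reference order row by row, which is the sense in which multiplying the feature matrix by the permutation matrix yields aligned data.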
(b) considering that the individual mapping matrix $H_v$ in fact encodes the characteristics of a single view, another set of coefficient matrices $W_v \in \mathbb{R}^{d_v \times q}$ is defined to capture the properties unique to each view; $W_v$ maps the reconstructed feature matrix $PH_v$, which carries the personalized information of the $v$-th view, to the label space; using both the personality and the commonality information of the heterogeneous views, formula (1) is extended as follows:
s.t. $(M_v)_{i,j} \in \{0,1\}$, $M_v \mathbf{1} = \mathbf{1}$, $\mathbf{1}^T M_v = \mathbf{1}^T$, $P \ge 0$, $H_v \ge 0$, $W_0 \ge 0$, $W_v \ge 0$ (2)
wherein $P$ can be regarded as a dictionary matrix containing the information of all views, and $H_v$ represents the view-specific coding coefficients; $PH_v$ aims to capture the individual information of a particular view, while $P$ captures the information shared by all views; $P$ and $PH_v$ are thus regarded as the "commonality-personality" representation of the multi-view data;
(c) the second-order graph-structure alignment between views and the label correlation are considered; since $M_v X_v$ and $M_j X_j$ denote the correctly aligned versions of view $X_v$ and view $X_j$, respectively, a cross-view mapping matrix is used to represent the degree of matching between the samples of view $X_v$ and view $X_j$; a structure matching loss term is constructed to explore the correct cross-view mapping; for each view, a distance matrix $S_v$ between its samples is built to represent the graph structure of view $v$; the distance matrices $S_v$ and $S_j$ are permuted by the cross-view mapping matrix so that the graph connection structures of the two views become as similar as possible; $A \in \mathbb{R}^{q \times q}$ denotes the label correlation matrix, through which the correlations in the multi-label data are exploited to extract the known correlated label information; a hyper-parameter $\beta$ is introduced to balance the weight of the second-order alignment; the non-aligned multi-view multi-label learning model based on the graph matching mechanism is expressed as follows:
s.t. $(M_v)_{i,j} \in \{0,1\}$, $M_v \mathbf{1} = \mathbf{1}$, $\mathbf{1}^T M_v = \mathbf{1}^T$, $P \ge 0$, $H_v \ge 0$, $W_0 \ge 0$, $M_v \ge 0$ (3)
wherein:
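The idea behind the second-order alignment in step (c) can be sketched numerically. The example assumes Euclidean pairwise distances for $S_v$ and assumes a structure matching loss of the form $\|S_v - M S_j M^T\|_F^2$ for a cross-view mapping $M$; both choices, and the names `distance_matrix` and `structure_loss`, are illustrative assumptions rather than the expressions used in the claims.

```python
import numpy as np

def distance_matrix(X):
    """Pairwise Euclidean distances between the rows of X (n x n)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def structure_loss(S_v, S_j, M):
    """Assumed loss: squared Frobenius gap after permuting S_j with M."""
    return np.linalg.norm(S_v - M @ S_j @ M.T, "fro") ** 2

rng = np.random.default_rng(1)
n = 6
X_v = rng.random((n, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthogonal map keeps distances
shuffle = np.roll(np.arange(n), 1)                 # cyclic shift, never the identity
X_j = (X_v @ Q)[shuffle]                           # second view: rotated and shuffled

S_v, S_j = distance_matrix(X_v), distance_matrix(X_j)

# Correct cross-view mapping: row shuffle[a] of view v matches row a of view j.
M_true = np.zeros((n, n))
M_true[shuffle, np.arange(n)] = 1

loss_true = structure_loss(S_v, S_j, M_true)       # graphs coincide
loss_identity = structure_loss(S_v, S_j, np.eye(n))  # graphs mismatch
```

With the correct mapping the two graph structures coincide up to rounding error, while a wrong mapping leaves a positive gap; a loss of this kind can therefore guide the search for the mapping even when the feature spaces of the two views differ.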
step S3, simplifying the expression of the non-aligned multi-view multi-label learning model based on the graph matching mechanism in S2, and minimizing the model by alternating optimization until it converges, yielding a classification predictor; the specific steps are as follows:
(a) because the constraint on $M_v$ is non-convex, an optimal solution is difficult to obtain; the constraint on $M_v$ is therefore relaxed as:
(b) the objective function is converted into an unconstrained problem by introducing the Lagrange multipliers $\lambda$, $\Phi$, $\Theta$, $\Omega$, and $\Psi$; for view $v$, the objective function (3) takes the following form:
(c) iteratively optimizing $M_v$: with $P$ and the remaining variables fixed, the computation of $M_v$ is independent of $M_{v'}$, $v' \neq v$, so $M_v$ is optimized independently for each view $v$;
formula (3) coupled with its constraints is solved by a standard method, and a nonlinear method yields the following iterative update rule for $M_v$:
wherein:
(d) iteratively optimizing $P$: with the other variables fixed, taking the derivative of the objective in formula (3) with respect to $P$ gives:
using the KKT condition, i.e. $\Phi_{i,j} P_{i,j} = 0$, the following iterative update rule for $P$ is obtained:
(e) iteratively optimizing $A$: with the other variables fixed, taking the derivative of the objective in formula (3) with respect to $A$ gives:
setting the derivative equal to 0 gives the following update rule for $A$:
(f) iteratively optimizing $H_v$: with the other variables fixed, the computation of $H_v$ is independent of $H_{v'}$, $v' \neq v$; $H_v$ is therefore optimized separately for each view $v$, and taking the derivative of the objective in formula (3) with respect to $H_v$ gives:
using the KKT condition, i.e. $\Theta_{i,j} (H_v)_{i,j} = 0$, the following iterative update rule for $H_v$ is obtained:
(g) iteratively optimizing $W_0$: with the other variables fixed, taking the derivative of the objective in formula (3) with respect to $W_0$ gives:
using the KKT condition, i.e. $\Omega_{i,j} (W_0)_{i,j} = 0$, the following iterative update rule for $W_0$ is obtained:
(h) iteratively optimizing $W_v$: with the other variables fixed, taking the derivative of the objective in formula (3) with respect to $W_v$ gives:
using the KKT condition, i.e. $\Psi_{i,j} (W_v)_{i,j} = 0$, the following update rule for $W_v$ is obtained:
(i) steps (c) to (h) are repeated, alternately updating the parameters, until the iteration stop condition is met; once the objective function has converged, the optimal parameters of the non-aligned multi-view multi-label learning model based on the graph matching mechanism are output, and the classification predictor is obtained;
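The alternating scheme of step S3 follows a familiar pattern: fix all blocks but one, then apply a non-negativity-preserving update derived from the KKT complementary-slackness condition. As a minimal sketch of that pattern only, the code below applies the standard Lee-Seung multiplicative updates to the plain factorization $\min \|X - PH\|_F^2$ with $P \ge 0$, $H \ge 0$. It is not the update rule of this method, whose objective contains further terms; the function name `nmf_multiplicative` and its parameters are hypothetical.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10, seed=0):
    """Alternating multiplicative updates for min ||X - P H||_F^2, P, H >= 0."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    P = rng.random((n, k)) + 0.1        # strictly positive initialization
    H = rng.random((k, d)) + 0.1
    losses = []
    for _ in range(n_iter):
        # Fix H, update P: ratio of the negative and positive gradient parts.
        P *= (X @ H.T) / (P @ H @ H.T + eps)
        # Fix P, update H in the same way.
        H *= (P.T @ X) / (P.T @ P @ H + eps)
        losses.append(np.linalg.norm(X - P @ H, "fro") ** 2)
    return P, H, losses

X = np.random.default_rng(1).random((20, 8))   # non-negative toy data
P, H, losses = nmf_multiplicative(X, k=3)
```

Because each factor is only ever multiplied by a non-negative ratio, the constraints $P \ge 0$ and $H \ge 0$ hold throughout without an explicit projection step, which is the practical appeal of KKT-derived multiplicative rules.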
step S4, predicting the test set with the converged non-aligned multi-view multi-label learning model based on the graph matching mechanism, and obtaining the label prediction result from the output probabilities, wherein the label prediction matrix $\hat{Y}$ is given by the equation:
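How output scores become a label prediction in step S4 can be sketched as follows. The scoring form used here, a shared term $P W_0$ plus one view-specific term $P H_v W_v$ per view, is an assumption chosen only because it is dimensionally consistent with the matrices defined in step S2; the threshold of 0.5 and the function name `predict_labels` are likewise hypothetical.

```python
import numpy as np

def predict_labels(P, W0, H_list, W_list, threshold=0.5):
    """Hypothetical scorer: P @ W0 plus sum over views of P @ H_v @ W_v,
    thresholded into a binary label matrix (n x q)."""
    scores = P @ W0                             # commonality part
    for H_v, W_v in zip(H_list, W_list):
        scores = scores + P @ H_v @ W_v         # personality part of view v
    return (scores >= threshold).astype(int), scores

# Toy shapes: n = 2 samples, k = 2, one view with d_v = 3, q = 2 labels.
P = np.array([[1.0, 0.0], [0.0, 1.0]])
W0 = np.array([[0.6, 0.1], [0.2, 0.3]])
H_1 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
W_1 = np.array([[0.1, 0.0], [0.0, 0.3], [0.5, 0.5]])

Y_hat, scores = predict_labels(P, W0, [H_1], [W_1])
```

In this toy case the scores are approximately `[[0.7, 0.1], [0.2, 0.6]]`, so each sample receives exactly one label after thresholding.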
CN202311195295.8A 2023-09-16 2023-09-16 Non-aligned multi-view multi-mark learning method based on graph matching mechanism Pending CN117409456A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311195295.8A CN117409456A (en) 2023-09-16 2023-09-16 Non-aligned multi-view multi-mark learning method based on graph matching mechanism

Publications (1)

Publication Number Publication Date
CN117409456A true CN117409456A (en) 2024-01-16

Family

ID=89487860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311195295.8A Pending CN117409456A (en) 2023-09-16 2023-09-16 Non-aligned multi-view multi-mark learning method based on graph matching mechanism

Country Status (1)

Country Link
CN (1) CN117409456A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117690192A (en) * 2024-02-02 2024-03-12 天度(厦门)科技股份有限公司 Abnormal behavior identification method and equipment for multi-view instance-semantic consensus mining
CN117690192B (en) * 2024-02-02 2024-04-26 天度(厦门)科技股份有限公司 Abnormal behavior identification method and equipment for multi-view instance-semantic consensus mining

Similar Documents

Publication Publication Date Title
Xie et al. Hyper-Laplacian regularized multilinear multiview self-representations for clustering and semisupervised learning
Wen et al. Unified tensor framework for incomplete multi-view clustering and missing-view inferring
Zhao et al. Multi-view clustering via deep matrix factorization
Zhang et al. A survey on concept factorization: From shallow to deep representation learning
CN113239131B (en) Low-sample knowledge graph completion method based on meta-learning
Fu et al. FERLrTc: 2D+ 3D facial expression recognition via low-rank tensor completion
CN110222213A (en) A kind of image classification method based on isomery tensor resolution
Zamiri et al. MVDF-RSC: Multi-view data fusion via robust spectral clustering for geo-tagged image tagging
Lin et al. Unsupervised feature selection via orthogonal basis clustering and local structure preserving
Wang et al. Minimum error entropy based sparse representation for robust subspace clustering
CN109657611A (en) A kind of adaptive figure regularization non-negative matrix factorization method for recognition of face
Yi et al. Dual pursuit for subspace learning
CN117409456A (en) Non-aligned multi-view multi-mark learning method based on graph matching mechanism
Wei et al. Latent space robust subspace segmentation based on low-rank and locality constraints
CN110598740B (en) Spectrum embedding multi-view clustering method based on diversity and consistency learning
CN109063725B (en) Multi-view clustering-oriented multi-graph regularization depth matrix decomposition method
Sun et al. Deep alternating non-negative matrix factorisation
CN106845462A (en) The face identification method of feature and cluster is selected while induction based on triple
CN117173702A (en) Multi-view multi-mark learning method based on depth feature map fusion
Zhao et al. Ensemble subspace segmentation under blockwise constraints
CN115392474B (en) Local perception graph representation learning method based on iterative optimization
CN111241326A (en) Image visual relation referring and positioning method based on attention pyramid network
CN107993311B (en) Cost-sensitive latent semantic regression method for semi-supervised face recognition access control system
CN109614581A (en) The Non-negative Matrix Factorization clustering method locally learnt based on antithesis
Mei et al. Joint feature selection and optimal bipartite graph learning for subspace clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination