CN115601571A

CN115601571A - Multi-graph constrained canonical correlation analysis method and system for multi-modal data

Info

Publication number: CN115601571A
Application number: CN202211337115.0A
Authority: CN
Inventors: 张玉龙; 段晓宇; 杜海浪; 胡雨馨; 王一诺; 郭宇; 王飞
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2022-10-28
Filing date: 2022-10-28
Publication date: 2023-01-13

Abstract

The invention discloses a multi-graph constrained canonical correlation analysis method and system for multi-modal data. The method comprises the following steps: constructing a multi-view data set; constructing a graph learning term based on the self-expression property of the multi-view data set, and obtaining the graph structure corresponding to each view; fusing the graph structures corresponding to the different views through an adaptive weight mechanism to obtain a consistency graph; and constructing a multi-graph constrained canonical correlation analysis function, with the consistency graph as its constraint term, to obtain an improved multi-view feature learning objective function. When the resulting function is used for multi-view dimensionality reduction, it exploits both the consistency information across the different views and the similarity relations among the data samples, so that the low-dimensional features better reflect the relations in the original data and the clustering and classification performance is improved.

Description

Multi-graph constrained canonical correlation analysis method and system for multi-modal data

Technical Field

The invention belongs to the technical field of machine learning, and relates to a multi-graph constrained canonical correlation analysis method and system for multi-modal data.

Background

Multi-view learning is a machine learning strategy that fuses data from two or more views. Canonical Correlation Analysis (CCA) is an important multi-view learning method: by maximizing the correlation coefficient between two views, it finds for each view a mapping vector (canonical vector) into a common space in which the views are maximally correlated. Multi-View Canonical Correlation Analysis (MCCA) generalizes CCA to more than two views, which widens its range of application.
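As an illustration of the two-view case described above, the following is a minimal NumPy sketch of classical CCA, not part of the disclosed method. The function names `inv_sqrt` and `cca_first_pair`, the small ridge term `reg` and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def cca_first_pair(X, Y, reg=1e-8):
    """First pair of canonical vectors and their correlation.

    X is D1 x N and Y is D2 x N, one sample per column.
    reg is a small ridge term (an assumption, for numerical stability).
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    n = X.shape[1]
    Cxx = Xc @ Xc.T / n + reg * np.eye(X.shape[0])
    Cyy = Yc @ Yc.T / n + reg * np.eye(Y.shape[0])
    Cxy = Xc @ Yc.T / n
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    return Kx @ U[:, 0], Ky @ Vt[0], s[0]

# Synthetic example: the second view is a noisy linear function of the first.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 500))
Y = rng.standard_normal((2, 3)) @ X + 0.01 * rng.standard_normal((2, 500))
w_x, w_y, rho = cca_first_pair(X, Y)
```

Projecting each centered view onto its canonical vector gives two one-dimensional signals whose sample correlation equals `rho` up to the ridge term.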

MCCA considers only the consistency among the multi-view data; it ignores the intrinsic characteristics of the data and does not mine or use information about the structural relations among samples. Graph-constrained Multi-View Canonical Correlation Analysis (GMCCA) was therefore proposed. It builds structural constraints among the data, namely graph constraints, from prior knowledge and exploits the similarity among data samples, so that its classification and clustering performance is markedly better than that of MCCA.

The existing GMCCA algorithm has several problems. First, the graph of each view is built from a predefined graph structure, which depends heavily on prior knowledge. In practice, repeated trials and comparisons are often needed, which reduces the usability of the algorithm and makes it difficult to construct the most reasonable similarity matrix. Second, when the graph constraint term is constructed, the graphs are either simply added together or built from the data of one selected view, so the selection and fusion of graph structures are not good enough. This way of building the graph relies on repeated attempts, may lose the structure information of the unused views, and may leave the final result suboptimal because it considers neither the distribution differences between the views nor the irrelevance of part of the structure information.

Disclosure of Invention

The invention aims to solve the problems of the prior art that the graph of each view is built from a predefined graph structure, with a heavy dependence on prior knowledge, and that the selection and fusion of graph structures are not optimal, so that the final result cannot reach the optimum.

To achieve this purpose, the invention adopts the following technical solution:

A multi-graph constrained canonical correlation analysis method for multi-modal data comprises the following steps:

S1: constructing a multi-view data set;

S2: constructing a graph learning term based on the self-expression property of the multi-view data set, and obtaining the graph structure corresponding to each view;

S3: fusing the graph structures corresponding to the different views through an adaptive weight mechanism to obtain a consistency graph;

S4: constructing a multi-graph constrained canonical correlation analysis function, and taking the consistency graph as the constraint term of the function to obtain an improved multi-view feature learning objective function.

The invention is further improved in that:

the graph learning term of step S2 is constructed by equation (1):

wherein A_m represents the graph structure corresponding to the data X_m of each view.

The graph structures corresponding to the different views are fused by equation (2):

wherein G represents the fused graph; d_m represents the importance of each view; and A_m represents the graph structure corresponding to the data X_m of each view.

The multi-graph constrained canonical correlation analysis function is constructed as follows:

wherein X_m represents the multi-view data from which features are to be extracted; U_m represents the corresponding projection matrix; A_m represents the graph structure corresponding to the data X_m of each view; S represents the consistency feature of the multi-view data; L_G represents the Laplacian matrix corresponding to G; G represents the fused graph; and d_m represents the importance of each view.

Step S3 further comprises optimizing the newly obtained multi-view feature learning objective function through alternating iterations.

The alternating iterative optimization of the objective function comprises the following steps:

fixing A_m, d and G, and solving for U_m and S, the original objective function is converted into:

taking the partial derivative with respect to U_m and setting it to 0 gives:

introducing an intermediate variable:

C is eigendecomposed, and S is constructed from the eigenvectors corresponding to the ρ largest eigenvalues of C; substituting the obtained S into the expression for U_m then gives the corresponding U_m;
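The eigenvector selection in this step can be sketched as follows. This is only an illustration: the definition of the intermediate variable C is not reproduced in this text, so the sketch assumes C is a symmetric N x N matrix and that S stacks the selected eigenvectors as rows. The function name `top_eigvecs` and the random test matrix are hypothetical.

```python
import numpy as np

def top_eigvecs(C, rho):
    """Return a rho x N matrix whose rows are the eigenvectors of the
    rho largest eigenvalues of the symmetric matrix C (assumed layout)."""
    C = (C + C.T) / 2.0                   # symmetrize against round-off
    vals, vecs = np.linalg.eigh(C)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:rho]  # indices of the rho largest
    return vecs[:, order].T

# Hypothetical stand-in for C: a random positive semi-definite matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
C = B @ B.T
S = top_eigvecs(C, 2)
```

With this layout the rows of `S` are orthonormal, and the trace of `S @ C @ S.T` equals the sum of the two largest eigenvalues of `C`.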

fixing S, U_m, d and G, and solving for A_m:

taking the partial derivative of the objective function with respect to A_m and setting it to 0 gives:

The alternating iterative optimization of the objective function further comprises the following steps:

fixing S, U_m, d and A_m, and solving for G:

at this time, the original objective function is converted into:

each i is optimized separately; for a given i, the expanded formula becomes:

define a vector c_i whose j-th element is

the objective function is then further transformed into:

wherein

denotes the i-th row vector of A_m;

fixing S, U_m, G and A_m, and solving for d, the objective function is converted into:

let

The above equation is then equivalent to the following function:

the alternating iterative computation is repeated until the objective function converges, giving the projection matrices U_m and the consistency feature S.

A multi-graph constraint canonical correlation analysis system for multi-modal data comprises a data set construction module, a graph structure acquisition module, a graph structure fusion module and an objective function construction module;

the data set construction module is used for constructing a multi-view data set;

the graph structure acquisition module is used for constructing graph learning items based on the self-expression properties of the multi-view data set and acquiring graph structures corresponding to different views;

the graph structure fusion module is used for fusing graph structures corresponding to different views through a self-adaptive weight mechanism to obtain a consistency graph;

and the objective function construction module is used for constructing a multi-graph constrained canonical correlation analysis function and taking the consistency graph as the constraint term of the function to obtain an improved multi-view feature learning objective function.

A terminal device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, said processor implementing the steps of any of the methods described above when executing said computer program.

A computer-readable storage medium in which a computer program is stored which, when executed by a processor, carries out the steps of any of the methods described above.

Compared with the prior art, the invention has the following beneficial effects:

The invention discloses a multi-graph constrained canonical correlation analysis method for multi-modal data. For the constructed multi-view data set, a graph structure reflecting the similarity relations of each view's data is learned automatically from the self-expression property of the data set, and the graph structures corresponding to the different views are fused through an adaptive weight mechanism. This avoids the repeated attempts otherwise needed when building the graph and the loss of the structure information of unused views, takes the distribution differences between the views into account, and yields a fused graph structure that captures what the views have in common.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.

FIG. 1 is a flow diagram of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

In the description of the embodiments of the present invention, it should be noted that, if the terms "upper", "lower", "horizontal", "inner", etc. are used to indicate the orientation or positional relationship based on the orientation or positional relationship shown in the drawings or the orientation or positional relationship which the product of the present invention is used to usually place, it is only for convenience of describing the present invention and simplifying the description, but it is not necessary to indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.

Furthermore, the term "horizontal", if present, does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.

In the description of the embodiments of the present invention, it should be further noted that unless otherwise explicitly stated or limited, the terms "disposed," "mounted," "connected," and "connected" should be interpreted broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.

The invention is described in further detail below with reference to the accompanying drawings:

Referring to FIG. 1, an embodiment of the invention discloses a multi-graph constrained canonical correlation analysis method for multi-modal data (MCCA with Graph Learning and Fusion, GFMCCA), comprising the following steps: constructing a multi-view data set, and automatically learning from each view's data, based on the self-expression property of the matrix, a graph structure reflecting the similarity relations of that view's data; assigning the graph structures of the different views different weights according to the importance of each view to the task, and fusing them adaptively to obtain an optimal consistency graph; and using the fused graph structure information as the constraint term of the multi-view canonical correlation analysis algorithm to obtain a new multi-view feature learning objective function, the new model being solved by alternating optimization.

The method specifically comprises the following steps:

For an undirected graph G with N nodes, assuming that the consistency feature S is smooth on the graph G, the MCCA objective function with the added graph constraint term is as follows:

wherein X_m is the multi-view data from which features are to be extracted, U_m is the corresponding projection matrix, S ∈ R^(d×N) is the consistency feature of the multi-view data, and L_G is the Laplacian matrix corresponding to G.
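The role of the Laplacian matrix can be illustrated with a short sketch. It assumes the unnormalized Laplacian L = D − W and a smoothness term of the form tr(S L S^T), which is the usual form of a graph constraint; the text above does not state which Laplacian or which exact term is used, and the names `graph_laplacian` and `smoothness` are hypothetical.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a weighted undirected graph."""
    W = (W + W.T) / 2.0                    # enforce symmetry
    return np.diag(W.sum(axis=1)) - W

def smoothness(S, W):
    """tr(S L S^T): small when strongly connected nodes have similar columns of S."""
    return np.trace(S @ graph_laplacian(W) @ S.T)

# Small random example: 5 nodes, 2-dimensional features (one column per node).
rng = np.random.default_rng(0)
W = rng.random((5, 5))
W = (W + W.T) / 2.0
np.fill_diagonal(W, 0.0)
S = rng.standard_normal((2, 5))
value = smoothness(S, W)
```

For a symmetric weight matrix the value equals half the sum over all node pairs of W[i, j] times the squared distance between columns i and j of S, which is why minimizing it pulls the features of similar samples together.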

Using the self-expression property of the data set, a graph learning term is constructed, and the corresponding objective function is established as follows:

wherein A_m represents the graph structure corresponding to the data X_m of each view;

the graph structures of the different views are fused using an adaptive weight mechanism, and the corresponding objective function is established as follows:

wherein G is the fused graph and d_m indicates the importance of each view.
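The fusion formula itself is not reproduced in this text. As a rough illustration only, the sketch below assumes the simplest reading of a weighted fusion, a convex combination of the per-view graphs with non-negative weights d_m that sum to one. In the disclosed method G is obtained by optimization together with the other terms of the objective, so this is not the disclosed update; the name `fuse_graphs` is hypothetical.

```python
import numpy as np

def fuse_graphs(graphs, d):
    """Weighted combination sum_m d[m] * graphs[m] (assumed form of the fusion)."""
    d = np.asarray(d, dtype=float)
    if d.min() < 0 or not np.isclose(d.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    # Contract the weight vector with the stacked (M, N, N) array of graphs.
    return np.tensordot(d, np.stack(graphs), axes=1)

A1 = np.ones((3, 3))
A2 = np.zeros((3, 3))
G = fuse_graphs([A1, A2], [0.25, 0.75])
```

A view with a larger weight contributes more of its structure to the fused graph, which is the intuition behind treating d_m as the importance of view m.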

In summary, the objective function of the complete improved algorithm is as follows:

Further, as a preferred scheme of the embodiment of the invention, the problem is solved by alternating optimization:

(1) Fixing A_m, d and G, and solving for U_m and S, the original objective function is converted into:

introducing an intermediate variable:

(2) Fixing S, U_m, d and G, and solving for A_m:

(3) Fixing S, U_m, d and A_m, and solving for G:

at this time, the original objective function is converted into:

define a vector c_i whose j-th element is

the objective function is then further transformed into:

wherein

denotes the i-th row vector of A_m;

(4) Fixing S, U_m, G and A_m, and solving for d, the objective function is converted into:

let

The above equation is then equivalent to the following function:

the optimization problem formula (10) and formula (12) can be solved by the algorithm proposed by j.

Consider the following optimization problem in the same form as the two problems described above:

the solution for optimal w is as follows:

w_i = max{v_i − θ, 0}    (14)
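The thresholded form of equation (14) is the form taken by the Euclidean projection of a vector v onto the probability simplex. Since the problem statement (13) is not reproduced in this text, the following sketch assumes that (13) is this projection, that is, minimizing ||w − v||² subject to w ≥ 0 and the entries of w summing to 1, and computes θ with the standard sort-based procedure. The function name `project_to_simplex` is hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (assumed problem)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                  # entries in descending order
    css = np.cumsum(u) - 1.0              # cumulative sums minus the target total
    k = np.arange(1, v.size + 1)
    # Largest index whose entry stays positive after shifting.
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)          # the threshold theta of equation (14)
    return np.maximum(v - theta, 0.0)

w = project_to_simplex([0.5, 1.2, -0.3])
# theta = 0.35 here, so w = [0.15, 0.85, 0.0]
```

Under this assumption the same routine would serve both the row-wise update of G and the update of the weight vector d, each with its own vector v.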

The alternating iterative computation is repeated until the objective function converges, giving the projection matrices U_m and the consistency feature S.

A specific embodiment of the invention is as follows:

Given the multi-view data from which features are to be extracted

where M ≥ 2, D_m is the dimension of the data samples of the m-th view, and N is the number of samples in the data set; also given are the dimension d of the data after dimensionality reduction and the regularization coefficients α, β, γ and η.

Building on the original GMCCA algorithm, an optimal consistency-graph adjacency matrix is computed through self-expression learning and adaptive fusion, and the fused graph structure information is used as the constraint term for the correlation analysis of the multi-view data. The objective function of the complete improved algorithm is:

the specific operation comprises the following steps:

Step one: input the multi-view data

the dimension d of the data after dimensionality reduction, and the regularization coefficients α, β, γ and η.

Step two: initialize an M-dimensional vector d with each component d_m = 1/M.

Step three: initialize the graph structure of each view using a Gaussian kernel function.

Step four: in the objective function, treat A_m, d and G as known and solve for U_m and S.

Step five: in the objective function, treat S, U_m, d and G as known and solve for A_m.

Step six: in the objective function, treat S, U_m, d and A_m as known and solve for G.

Step seven: in the objective function, treat S, U_m, G and A_m as known and solve for d.

Step eight: repeat steps four to seven until the objective function converges.
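The steps above can be sketched in outline as follows. The initialization of steps two and three is shown concretely, assuming a Gaussian kernel of the form exp(−||x_i − x_j||² / (2σ²)) with a hypothetical bandwidth `sigma` and no self-loops; the exact kernel used is not reproduced in this text. The closed-form updates of steps four to seven are also not reproduced, so the loop takes them as user-supplied callables and is exercised here on a toy two-variable problem instead. All names are hypothetical.

```python
import numpy as np

def gaussian_graph(X, sigma=1.0):
    """Gaussian-kernel affinity of one view; X is D x N, one sample per column."""
    sq = np.sum(X ** 2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X.T @ X, 0.0)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)               # assumption: no self-loops
    return A

def initialise(views, sigma=1.0):
    """Steps two and three: uniform view weights and one kernel graph per view."""
    M = len(views)
    d = np.full(M, 1.0 / M)
    A = [gaussian_graph(X, sigma) for X in views]
    return d, A

def alternate(state, updates, objective, tol=1e-6, max_iter=100):
    """Steps four to eight: apply each block update in turn until the
    objective changes by no more than tol (relative to its magnitude)."""
    prev = objective(state)
    for it in range(max_iter):
        for update in updates:
            state = update(state)
        cur = objective(state)
        if abs(prev - cur) <= tol * max(1.0, abs(prev)):
            break
        prev = cur
    return state, it + 1

rng = np.random.default_rng(0)
views = [rng.standard_normal((4, 5)), rng.standard_normal((3, 5))]
d, A = initialise(views)

# Toy check of the loop on f(x, y) = (x - 1)^2 + (y - 2)^2 + x*y,
# whose exact block updates are x = 1 - y/2 and y = 2 - x/2.
f = lambda s: (s["x"] - 1) ** 2 + (s["y"] - 2) ** 2 + s["x"] * s["y"]
upd_x = lambda s: {**s, "x": 1 - s["y"] / 2}
upd_y = lambda s: {**s, "y": 2 - s["x"] / 2}
state, n_iter = alternate({"x": 5.0, "y": -3.0}, [upd_x, upd_y], f, tol=1e-12)
```

In an actual implementation the four callables would be the updates for (U_m, S), A_m, G and d, and `objective` would evaluate the complete objective function.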

The effects of the present invention can be further illustrated by the following experiments on a real database.

The classification and clustering performance of the multi-graph constrained canonical correlation analysis method for multi-modal data (GFMCCA) was tested on the UCI handwritten digits data set, the Caltech7 data set, the Caltech20 data set and the NUS data set, and compared with multi-view canonical correlation analysis (MCCA), graph-constrained multi-view canonical correlation analysis (GMCCA), principal component analysis (PCA) and graph-constrained principal component analysis (GPCA), in order to verify the advantage of the disclosed algorithm.

The classification accuracy results are shown in Table 1. The algorithm performs well on several classification tasks, and its classification accuracy is greatly improved compared with the GMCCA algorithm based on a predefined graph structure.

TABLE 1 Classification accuracy of the multi-graph constrained canonical correlation analysis method (GFMCCA) for multi-modal data and of the comparison algorithms

The clustering accuracy, normalized mutual information (NMI) and purity results are shown in Tables 2, 3 and 4; the algorithm achieves a good clustering effect on several data sets. The multi-graph constrained canonical correlation analysis method for multi-modal data obtains the consistency features of the multi-view data more effectively: through adaptive graph learning it constrains the multi-view dimensionality reduction, so that the reduced features both retain the consistency information across the views and make better use of the similarity relations among the data.

TABLE 2 Clustering accuracy of the multi-graph constrained canonical correlation analysis method (GFMCCA) for multi-modal data and of the comparison algorithms

TABLE 3 Clustering NMI of the multi-graph constrained canonical correlation analysis method (GFMCCA) for multi-modal data and of the comparison algorithms

TABLE 4 Clustering purity of the multi-graph constrained canonical correlation analysis method (GFMCCA) for multi-modal data and of the comparison algorithms
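For reference, one of the clustering measures named above can be computed as in the sketch below. It uses the usual definition of purity (each cluster is credited with its most frequent true class); the text does not spell out the definition used in the experiments, and the function name `purity` and the toy labels are hypothetical.

```python
import numpy as np

def purity(labels_true, labels_pred):
    """Fraction of samples that belong to the majority true class of their cluster."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    total = 0
    for c in np.unique(labels_pred):
        members = labels_true[labels_pred == c]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()              # size of the majority class in cluster c
    return total / labels_true.size

# Two clusters of three samples, each with two samples from its majority class.
p = purity([0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 1, 1])
# p = 4/6
```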

When performing graph-constrained multi-view canonical correlation analysis, the invention learns a graph structure reflecting the similarity relations of the multi-view data set from the self-expression property of the data set instead of adopting a predefined graph structure. The dependence on prior knowledge is small, a reasonable similarity matrix can be constructed without repeated attempts, and the usability of the algorithm is improved.

When constructing the graph constraint term, the invention fuses the graph structures corresponding to the different views through the adaptive weight mechanism to obtain the consistency graph, rather than simply adding the graphs or building the graph from the data of one selected view. The structure information implied by each view is therefore not lost, the distribution differences between the views are fully considered, and the fused graph structure captures the most commonality.

When the invention is used for multi-view dimensionality reduction, it exploits both the consistency information across the different views and the similarity relations among the data, so that the low-dimensional features better reflect the relations in the original data and the clustering and classification performance is improved.

An embodiment of the invention also discloses a multi-graph constrained canonical correlation analysis system for multi-modal data, comprising a data set construction module, a graph structure acquisition module, a graph structure fusion module and an objective function construction module.

An embodiment of the present invention provides a schematic diagram of a terminal device. The terminal device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor realizes the steps of the above-mentioned method embodiments when executing the computer program. Alternatively, the processor implements the functions of the modules/units in the above device embodiments when executing the computer program.

The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.

The terminal device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The terminal device may include, but is not limited to, a processor, a memory.

The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.

The memory may be used for storing the computer programs and/or modules, and the processor may implement the various functions of the terminal device by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory.

The modules/units integrated in the terminal device may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM), random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, etc. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.

The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A multi-graph constraint canonical correlation analysis method oriented to multi-modal data is characterized by comprising the following steps:

S1: constructing a multi-view data set;

2. The multi-graph constraint canonical correlation analysis method for multi-modal data according to claim 1, wherein the graph learning term of step S2 is constructed by equation (1):

3. The multi-graph constraint canonical correlation analysis method for multi-modal data according to claim 1, wherein graph structures corresponding to different views are fused by equation (2):

wherein G represents the fused graph; d_m represents the importance of each view; and A_m represents the graph structure corresponding to the data X_m of each view.

4. The method for multi-graph constraint canonical correlation analysis based on multi-modal data according to claim 1, wherein the step of constructing the canonical correlation analysis function of multi-graph constraint is:

wherein X_m represents the multi-view data from which features are to be extracted; U_m represents the corresponding projection matrix; A_m represents the graph structure corresponding to the data X_m of each view; S represents the consistency feature of the multi-view data; L_G represents the Laplacian matrix corresponding to G; G represents the fused graph; and d_m represents the importance of each view.

5. The multi-graph constraint canonical correlation analysis method for multi-modal data according to claim 4, wherein step S3 further comprises optimizing the newly obtained multi-view feature learning objective function through alternating iterations.

6. The multi-graph constraint canonical correlation analysis method according to claim 1, wherein the alternating iterative optimization of the objective function comprises:

taking the partial derivative with respect to U_m and setting it to 0 gives:

introducing an intermediate variable:

fixing S, U_m, d and G, and solving for A_m:

7. The method of claim 6, wherein the alternating iterative optimization of the objective function further comprises the following steps:

fixing S, U_m, d and A_m, and solving for G:

at this time, the original objective function is converted into:

define a vector c_i whose j-th element is

the objective function is then further transformed into:

wherein

denotes the i-th row vector of A_m;

let

The above equation is then equivalent to the following function:

8. The multi-graph constraint canonical correlation analysis system oriented to multi-modal data according to claim 1, comprising a data set construction module, a graph structure acquisition module, a graph structure fusion module and an objective function construction module;

a dataset construction module for constructing a multi-view dataset;

9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor realizes the steps of the method according to any of claims 1-7 when executing the computer program.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.