CN111160426A - Feature fusion method and system based on tensor fusion and LSTM network - Google Patents

Feature fusion method and system based on tensor fusion and LSTM network Download PDF

Info

Publication number
CN111160426A
CN111160426A (application CN201911299573.8A)
Authority
CN
China
Prior art keywords
heterogeneous data, sub-modal heterogeneous data, modal, missing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911299573.8A
Other languages
Chinese (zh)
Other versions
CN111160426B (en)
Inventor
董爱美
李志刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN201911299573.8A priority Critical patent/CN111160426B/en
Publication of CN111160426A publication Critical patent/CN111160426A/en
Application granted granted Critical
Publication of CN111160426B publication Critical patent/CN111160426B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a feature fusion method and system based on tensor fusion and an LSTM network, relating to the technical field of heterogeneous data. The method is realized as follows: complete modal heterogeneous data are acquired and split into A1 complete sub-modal heterogeneous data containing no missing values and A2 sub-modal heterogeneous data containing missing values, the latter being preprocessed into A2 missing sub-modal heterogeneous data; features of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data are extracted by low-rank representation, and the relationship between them is modeled by tensor fusion to obtain their common matrices, A1×A2 in total; the A1×A2 common matrices are spliced and input into an LSTM network, which outputs a fusion matrix. The method avoids the data errors that would be introduced by imputing the missing sub-modal heterogeneous data and effectively addresses the problem of missing data in heterogeneous data.

Description

Feature fusion method and system based on tensor fusion and LSTM network
Technical Field
The invention relates to the technical field of heterogeneous data, in particular to a feature fusion method based on tensor fusion and an LSTM network.
Background
Heterogeneous data share related high-level semantics while their low-level representations express different characteristics. Classifying such data by exploiting these characteristics and imputing its missing values has long been an important approach to processing heterogeneous data. Heterogeneous data often suffer from missing values for a variety of reasons. Although the missing sub-modal heterogeneous data can be completed by various imputation methods, the low-level representations of the different modalities differ, so an error remains between the imputed values and the true values.
Disclosure of Invention
In view of the needs and shortcomings of the prior art, the invention provides a feature fusion method and system based on tensor fusion and an LSTM network that do not require imputation of the missing sub-modal heterogeneous data; instead, they exploit the high-level semantic association between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data to effectively handle missing data in heterogeneous data.
Firstly, the invention provides a feature fusion method based on tensor fusion and an LSTM network; the technical solution adopted to solve the above technical problem is as follows:
A feature fusion method based on tensor fusion and an LSTM network, realized as follows:
step S1, obtaining heterogeneous data, wherein the obtained heterogeneous data is called complete modal heterogeneous data;
step S2, splitting the complete modal heterogeneous data into A1+A2 sub-modal heterogeneous data, wherein each sub-modal heterogeneous data is a description of the same thing, A1 and A2 are natural numbers greater than 0, the A1 sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 sub-modal heterogeneous data each contain missing data;
step S3, preprocessing each of the A2 sub-modal heterogeneous data: deleting every row that contains a missing value; the sub-modal heterogeneous data after deletion are called missing sub-modal heterogeneous data;
step S4, extracting the features of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data respectively by low-rank representation, and then mathematically modeling the relationship between the feature-extracted complete sub-modal heterogeneous data and missing sub-modal heterogeneous data by tensor fusion to obtain their common matrices, A1×A2 in total;
and step S5, splicing the A1×A2 common matrices obtained in step S4, inputting the spliced result into the LSTM network, and having the LSTM network fuse it and output a fusion matrix.
In step S1, heterogeneous data that has been processed into feature vectors is acquired.
When step S3 is executed, matlab's isnan function is used to locate the rows containing missing values in each of the A2 sub-modal heterogeneous data, and those rows are deleted, so that the A2 missing sub-modal heterogeneous data are obtained.
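A minimal sketch of this preprocessing step is given below; the patent itself uses matlab's isnan, so the NumPy-based version and the example data are only illustrative assumptions:

```python
import numpy as np

def drop_missing_rows(X):
    """Delete every row that contains at least one missing value (NaN).

    X: one sub-modal heterogeneous data matrix, e.g. of shape (m, p);
    the result keeps only the complete rows, giving shape (s, p), s <= m.
    """
    keep = ~np.isnan(X).any(axis=1)   # True for rows without any NaN
    return X[keep]

# Example: a (5, 3) sub-modal matrix with two incomplete rows.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0],
              [7.0, 8.0, 9.0],
              [np.nan, 11.0, 12.0],
              [13.0, 14.0, 15.0]])
print(drop_missing_rows(X).shape)     # (3, 3)
```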
When step S4 is executed, a common matrix of complete sub-modal heterogeneous data and missing sub-modal heterogeneous data is obtained, and the specific operation steps of this process include:
step S4.1, introducing a linear function of the formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
s4.2, respectively changing the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalar quantities into vectors through the linear function of the formula (1) to obtain a matrix A representing the complete sub-modal heterogeneous data1,A2,…,AnAnd a matrix B representing missing sub-modal heterogeneous data1,B2,…,Bm
S4.3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step S4.4, cyclically executing steps S4.2 to S4.3 until A1×A2 common matrices are obtained.
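A rough sketch of steps S4.1 to S4.3 for one pair of sub-modal matrices follows; the patent shows the outer-product formula only as an image, so reshaping the result into an (m·p) × (q·s) common matrix is an assumption based on the dimensions used in the embodiment (e.g. common matrix mp × qs), and the linear-map weights are placeholders rather than trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_map(x, w=1.0, b=0.0):
    """Formula (1): y = ω·x + b, applied element-wise to lift the raw
    features before fusion (w and b are placeholder values, not trained)."""
    return w * x + b

def common_matrix(A, B):
    """Tensor outer product of one complete sub-modal matrix A (m x q)
    and one missing sub-modal matrix B (s x p), reshaped to the
    (m*p) x (q*s) common matrix used in the embodiment (mp x qs)."""
    T = np.einsum('mq,sp->mpqs', A, B)     # 4-way outer-product tensor
    m, q = A.shape
    s, p = B.shape
    return T.reshape(m * p, q * s)

A = linear_map(rng.random((6, 4)))         # complete sub-modal data, m=6, q=4
B = linear_map(rng.random((5, 3)))         # missing sub-modal data,  s=5, p=3
Z = common_matrix(A, B)
print(Z.shape)                             # (18, 20) == (m*p, q*s)
```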
When step S5 is executed, the specific operation content of outputting the fusion matrix includes:
step S5.1, arranging the A1×A2 common matrices obtained in step S4 in modeling order;
S5.2, adjusting the A1×A2 common matrices with matlab's reshape function, and splicing the adjusted A1×A2 common matrices in order, row by row, using matlab;
and S5.3, inputting the spliced result into the LSTM network; the LSTM network screens the important information in the spliced A1×A2 common matrices and outputs a two-dimensional fusion matrix, which contains all the feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.
Specifically, the number of rows of the two-dimensional fusion matrix equals the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and its number of columns is less than the largest product of the column counts of any two of the A1×A2 common matrices and greater than the column/row count of any one of the A1×A2 common matrices.
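The splicing and LSTM fusion of step S5 can be sketched as follows; the use of PyTorch's nn.LSTM, the row-aligned reshape, and the hidden width n are illustrative assumptions rather than the patent's implementation:

```python
import numpy as np
import torch
import torch.nn as nn

def splice_common_matrices(common, m):
    """Steps S5.1-S5.2: reshape every (m*p_i) x (q_i*s_i) common matrix
    so that it has m rows again, then concatenate them column-wise in
    order (interpreting the patent's row-wise matlab splicing this way
    is an assumption)."""
    adjusted = [Z.reshape(m, -1) for Z in common]   # each becomes m x (p*q*s)
    return np.concatenate(adjusted, axis=1)         # m x (sum of widths)

# Hypothetical setup: two common matrices built from data with m = 6 rows.
m = 6
common = [np.random.rand(6 * 3, 4 * 5), np.random.rand(6 * 2, 4 * 7)]
spliced = splice_common_matrices(common, m)         # shape (6, 116)

# Step S5.3: feed the spliced matrix to an LSTM and keep its hidden states
# as the two-dimensional fusion matrix (the width n = 32 is a free choice).
n = 32
lstm = nn.LSTM(input_size=spliced.shape[1], hidden_size=n, batch_first=True)
x = torch.tensor(spliced, dtype=torch.float32).unsqueeze(0)  # (1, m, features)
fused, _ = lstm(x)                                           # (1, m, n)
print(fused.squeeze(0).shape)                                # torch.Size([6, 32])
```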
Secondly, the invention provides a feature fusion system based on tensor fusion and an LSTM network, and the technical scheme adopted for solving the technical problems is as follows:
a feature fusion system based on tensor fusion and LSTM network, the structure includes:
the acquisition module is used for acquiring heterogeneous data processed into the feature vectors, and the acquired heterogeneous data is called as complete modal heterogeneous data;
the splitting module is used for splitting the complete modal heterogeneous data into A1+A2 sub-modal heterogeneous data; after splitting, each sub-modal heterogeneous data is a description of the same thing, wherein A1 and A2 are natural numbers greater than 0, the A1 sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 sub-modal heterogeneous data each contain missing data;
the preprocessing module is used for preprocessing A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data;
the characteristic extraction module is used for respectively extracting the characteristics of A1 complete sub-modal heterogeneous data and A2 missing sub-modal heterogeneous data by using low-rank representation;
the tensor fusion module is used for mathematically modeling the relationship between the feature-extracted complete sub-modal heterogeneous data and missing sub-modal heterogeneous data by tensor fusion to obtain their common matrices, A1×A2 in total;
the judgment loop module is used for judging whether the tensor fusion module has mathematically modeled the relationship between every complete sub-modal heterogeneous data and every missing sub-modal heterogeneous data; if so, the output of the tensor fusion module is input into the splicing module, and if not, control returns to the tensor fusion module to continue the mathematical modeling;
the splicing module is used for splicing the A1×A2 common matrices obtained by the tensor fusion module in modeling order;
and the LSTM network module is used for receiving the spliced matrix output by the splicing module and outputting a fusion matrix; the two-dimensional fusion matrix contains all the feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.
Optionally, the preprocessing module preprocesses the a2 sub-modal heterogeneous data, and the specific operations include:
the preprocessing module firstly finds the row where the missing value is located in the A2 sub-modal heterogeneous data by using the isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data.
Optionally, the mathematical modeling performed by the related tensor fusion module on the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain the commonality matrix specifically includes:
step 1, introducing a linear function of a formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
step 2, respectively changing the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors through the linear function of formula (1), thereby obtaining matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;
Step 3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step 4, cyclically executing steps 2 to 3 until A1×A2 common matrices are obtained.
Optionally, the splicing module splices the A1×A2 common matrices as follows:
firstly, the A1×A2 common matrices obtained in step S4 are arranged in modeling order;
subsequently, the A1×A2 common matrices are adjusted using matlab's reshape function, and the adjusted A1×A2 common matrices are spliced row-wise in order using matlab.
Optionally, the LSTM network module receives the matrix obtained by the ordered splicing, screens its important information, and outputs a two-dimensional fusion matrix. The number of rows of the two-dimensional fusion matrix equals the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and its number of columns is less than the largest product of the column counts of any two of the A1×A2 common matrices and greater than the column/row count of any one of the A1×A2 common matrices.
Compared with the prior art, the feature fusion method and system based on tensor fusion and LSTM network have the advantages that:
1) The method makes full use of the relationships among the sub-modalities of the heterogeneous data, simulates the relationships among the sub-modal heterogeneous data by building a mathematical model, and obtains the inter-modality fusion matrix through the LSTM network, thereby avoiding the data errors that would be introduced by imputing the missing sub-modal heterogeneous data;
2) The method does not need to impute the missing sub-modal heterogeneous data; instead, it exploits the high-level semantic association between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, effectively handling missing data in heterogeneous data.
Drawings
FIG. 1 is a flow chart of a method according to a first embodiment of the present invention;
fig. 2 is a structural connection block diagram of a second embodiment of the present invention.
The reference information in the drawings indicates:
1. acquisition module; 2. preprocessing module; 3. feature extraction module; 4. tensor fusion module;
5. splicing module; 6. LSTM network module; 7. splitting module; 8. judgment loop module.
Detailed Description
In order to make the technical solution, the technical problems to be solved and the technical effects of the invention clearer, the technical solution of the invention is described clearly and completely below with reference to specific embodiments.
For the following two embodiments, it should be noted that:
in the examples, "m × y", "m × q", "m × u", "m × p", "m × e", "m × h", "s × p", "d × e" and "k × h" denote the data feature matrices by their dimensions (rows × columns);
if the acquired heterogeneous data is a video segment which contains three types of information, namely voice, text and image, the voice is usually processed into a feature vector by using an LSTM, the text is processed into the feature vector by using a word2vec mode, and the image is processed into the feature vector by using a CNN.
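A minimal sketch of how each modality might be turned into a feature vector is given below; the specific networks and sizes, and the use of an embedding average as a stand-in for word2vec, are assumptions for illustration, not the patent's implementation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Voice: run an LSTM over acoustic frames and keep the last hidden state.
audio_frames = torch.randn(1, 50, 40)        # (batch, frames, per-frame features)
audio_lstm = nn.LSTM(input_size=40, hidden_size=64, batch_first=True)
_, (h_audio, _) = audio_lstm(audio_frames)
audio_vec = h_audio.squeeze(0)               # (1, 64)

# Text: average word embeddings (a simple stand-in for word2vec here).
token_ids = torch.randint(0, 1000, (1, 12))  # 12 hypothetical word indices
embedding = nn.Embedding(num_embeddings=1000, embedding_dim=64)
text_vec = embedding(token_ids).mean(dim=1)  # (1, 64)

# Image: a tiny CNN followed by global pooling and a projection.
image = torch.randn(1, 3, 32, 32)
cnn = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))
image_vec = cnn(image)                       # (1, 64)

print(audio_vec.shape, text_vec.shape, image_vec.shape)
```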
The first embodiment is as follows:
with reference to fig. 1, this embodiment proposes a feature fusion method based on tensor fusion and LSTM network, and the implementation content of the method includes:
and step S1, acquiring heterogeneous data processed into the feature vectors, wherein the acquired heterogeneous data is called complete modal heterogeneous data. In this embodiment, it is assumed that complete modal heterogeneous data m × y is acquired.
Step S2, the complete modal heterogeneous data is split into five sub-modal heterogeneous data, each sub-modal heterogeneous data includes all descriptions of the same thing, wherein two sub-modal heterogeneous data do not include missing data and are called complete sub-modal heterogeneous data, and three sub-modal heterogeneous data include missing data respectively. In this embodiment, it is assumed that the complete modal heterogeneous data m × y is split to obtain complete sub-modal heterogeneous data m × q, complete sub-modal heterogeneous data m × u, and sub-modal heterogeneous data m × p, sub-modal heterogeneous data m × e, and sub-modal heterogeneous data m × h containing missing data.
Step S3, preprocessing three sub-modal heterogeneous data containing missing data respectively: and (3) using matlab, respectively finding out the rows where the missing values in the three sub-modal heterogeneous data are located by using an isnan function, and deleting the rows containing the missing values in the three sub-modal heterogeneous data to obtain the three missing sub-modal heterogeneous data. In this embodiment, the missing sub-modal heterogeneous data s × p is obtained after deleting the row from the missing data-containing sub-modal heterogeneous data m × p, the missing sub-modal heterogeneous data d × e is obtained after deleting the row from the missing data-containing sub-modal heterogeneous data m × e, and the missing sub-modal heterogeneous data k × h is obtained after deleting the row from the missing data-containing sub-modal heterogeneous data m × h.
And step S4, respectively extracting the characteristics of two complete sub-modal heterogeneous data and three missing sub-modal heterogeneous data by using low-rank expression, and then performing mathematical modeling on the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after characteristics are extracted by using tensor fusion to obtain a total of six common matrixes of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data. The specific operation steps of the process comprise:
step S4.1, introducing a linear function of the formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
s4.2, respectively changing the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalar quantities into vectors through the linear function of the formula (1) to obtain a matrix A representing the complete sub-modal heterogeneous data1,A2,…,AnAnd a matrix B representing missing sub-modal heterogeneous data1,B2,…,Bm
S4.3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step S4.4, cyclically executing steps S4.2 to S4.3 until six common matrices are obtained.
In this embodiment, the complete sub-modal heterogeneous data m × q and the missing sub-modal heterogeneous data s × p are processed through steps S4.2 to S4.3 to obtain the common matrix mp × qs; the complete sub-modal heterogeneous data m × q and the missing sub-modal heterogeneous data d × e yield the common matrix me × qd; the complete sub-modal heterogeneous data m × q and the missing sub-modal heterogeneous data k × h yield the common matrix mh × qk; the complete sub-modal heterogeneous data m × u and the missing sub-modal heterogeneous data s × p yield the common matrix mp × us; the complete sub-modal heterogeneous data m × u and the missing sub-modal heterogeneous data d × e yield the common matrix me × ud; and the complete sub-modal heterogeneous data m × u and the missing sub-modal heterogeneous data k × h yield the common matrix mh × uk.
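Under the same assumptions as the earlier sketch, looping over every complete/missing pair reproduces the six common matrices of this embodiment; the dimension values are hypothetical, and np.kron(A, B.T) is used as an equivalent arrangement of the outer product:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for the embodiment's sub-modal matrices.
m, q, u = 8, 4, 5          # complete sub-modal data: m x q and m x u
s, p = 6, 3                # missing sub-modal data:  s x p
d, e = 7, 2                #                          d x e
k, h = 5, 4                #                          k x h

complete = {'m x q': rng.random((m, q)), 'm x u': rng.random((m, u))}
missing  = {'s x p': rng.random((s, p)), 'd x e': rng.random((d, e)),
            'k x h': rng.random((k, h))}

# Step S4.4: one common matrix per complete/missing pair -> 2 * 3 = 6 matrices.
common = {(c, mi): np.kron(A, B.T)         # shape (m*p_i) x (q_i*s_i)
          for c, A in complete.items() for mi, B in missing.items()}
for pair, Z in common.items():
    print(pair, Z.shape)                   # e.g. ('m x q', 's x p') -> (24, 24)
```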
Step S5, splicing the six common matrices obtained in step S4, inputting the spliced result into the LSTM network, and having the LSTM network fuse it and output the fusion matrix; the specific operations of this process are as follows:
S5.1, arranging the six common matrices obtained in step S4 in modeling order;
S5.2, adjusting the six common matrices with matlab's reshape function, and splicing the adjusted six common matrices in order, row by row, using matlab;
and S5.3, inputting the spliced result into the LSTM network; the LSTM network screens the important information in the six spliced common matrices and outputs a two-dimensional fusion matrix, which contains all the feature information of the two complete sub-modal heterogeneous data and the three missing sub-modal heterogeneous data. The number of rows of the two-dimensional fusion matrix equals the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and its number of columns is less than the largest product of the column counts of any two of the six common matrices and greater than the column/row count of any one of the six common matrices.
In this embodiment, after arranging the common matrix mp × qs, the common matrix me × qd, the common matrix mh × qk, the common matrix mp × us, the common matrix me × ud and the common matrix mh × uk in order, matlab's reshape function is used to adjust the common matrix mp × qs into the common matrix m × pqs, the common matrix me × qd into the common matrix m × eqd, the common matrix mh × qk into the common matrix m × hqk, the common matrix mp × us into the common matrix m × pus, the common matrix me × ud into the common matrix m × eud and the common matrix mh × uk into the common matrix m × huk; the adjusted common matrices m × pqs, m × eqd, m × hqk, m × pus, m × eud and m × huk are then spliced row by row and input into the LSTM network, which outputs an m × n two-dimensional fusion matrix, where n is less than the product of the two largest values among q, u, p, e, h, s, d and k, and n is greater than the largest value among q, u, p, e, h, s, d and k.
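As a quick check of this constraint on the fusion-matrix width n (the dimension values below are hypothetical):

```python
# Hypothetical column/row sizes q, u, p, e, h, s, d, k from the embodiment.
dims = {'q': 4, 'u': 5, 'p': 3, 'e': 2, 'h': 4, 's': 6, 'd': 7, 'k': 5}
vals = sorted(dims.values(), reverse=True)
upper = vals[0] * vals[1]        # product of the two largest values
lower = vals[0]                  # largest single value
print(f"admissible fusion width: {lower} < n < {upper}")   # 7 < n < 42
```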
Example two:
with reference to fig. 2, the present embodiment provides a feature fusion system based on tensor fusion and LSTM network, and its structure includes:
the acquisition module 1 is used for acquiring heterogeneous data processed as feature vectors, and the acquired heterogeneous data is called complete modal heterogeneous data;
the splitting module 7 is configured to split the complete modal heterogeneous data into A1+A2 sub-modal heterogeneous data; after splitting, each sub-modal heterogeneous data is a description of the same thing, wherein A1 and A2 are natural numbers greater than 0, the A1 sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 sub-modal heterogeneous data each contain missing data;
the preprocessing module 2 is used for preprocessing A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data;
the feature extraction module 3 is used for respectively extracting the features of A1 complete sub-modal heterogeneous data and A2 missing sub-modal heterogeneous data by using low-rank representation;
the tensor fusion module 4 is used for mathematically modeling the relationship between the feature-extracted complete sub-modal heterogeneous data and missing sub-modal heterogeneous data by tensor fusion to obtain their common matrices, A1×A2 in total;
the judgment loop module 8 is used for judging whether the tensor fusion module 4 has mathematically modeled the relationship between every complete sub-modal heterogeneous data and every missing sub-modal heterogeneous data; if so, the output of the tensor fusion module 4 is input into the splicing module 5, and if not, control returns to the tensor fusion module 4 to continue the mathematical modeling;
the splicing module 5 is used for splicing the A1×A2 common matrices obtained by the tensor fusion module 4 in modeling order;
and the LSTM network module 6 is used for receiving the spliced matrix output by the splicing module 5 and outputting a fusion matrix; the two-dimensional fusion matrix contains all the feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.
In this embodiment, the preprocessing module 2 preprocesses the a2 sub-modal heterogeneous data, and specifically performs the following operations:
the preprocessing module 2 firstly finds the row where the missing value is located in the A2 sub-modal heterogeneous data by using the isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data.
In this embodiment, the tensor fusion module 4 performs mathematical modeling on the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain the common matrix, and the specific operation is as follows:
step 1, introducing a linear function of a formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
step 2, converting the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors respectively through the linear function of formula (1), obtaining matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;
Step 3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step 4, cyclically executing steps 2 to 3 until A1×A2 common matrices are obtained.
In this embodiment, the specific operation of the splicing module 5 to splice a1 × a2 common matrixes is as follows:
firstly, the A1×A2 common matrices obtained in step S4 are arranged in modeling order;
subsequently, the A1×A2 common matrices are adjusted using matlab's reshape function, and the adjusted A1×A2 common matrices are spliced row-wise using matlab.
In this embodiment, the LSTM network module 6 receives the matrix obtained by the ordered splicing, screens its important information, and outputs a two-dimensional fusion matrix. The number of rows of the two-dimensional fusion matrix equals the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and its number of columns is less than the largest product of the column counts of any two of the A1×A2 common matrices and greater than the column/row count of any one of the A1×A2 common matrices.
In the specific implementation of this embodiment:
the assumed acquisition module 1 acquires complete modal heterogeneous data m y, the assumed splitting module 7 splits the complete modal heterogeneous data m y into five sub-modal heterogeneous data, wherein two sub-modal heterogeneous data do not contain missing data and are respectively called complete sub-modal heterogeneous data m q and complete sub-modal heterogeneous data m u, and the remaining three sub-modal heterogeneous data respectively contain missing data and are respectively called sub-modal heterogeneous data mp containing missing data, sub-modal heterogeneous data m e and sub-modal heterogeneous data m h;
then, the preprocessing module 2 deletes rows from the missing-data-containing sub-modal heterogeneous data m × p to obtain the missing sub-modal heterogeneous data s × p, deletes rows from the missing-data-containing sub-modal heterogeneous data m × e to obtain the missing sub-modal heterogeneous data d × e, and deletes rows from the missing-data-containing sub-modal heterogeneous data m × h to obtain the missing sub-modal heterogeneous data k × h, such that s < m, d < m, k < m;
then, the feature extraction module 3 respectively extracts the features of the complete sub-modal heterogeneous data m × q, the complete sub-modal heterogeneous data m × u, the missing sub-modal heterogeneous data s × p, the missing sub-modal heterogeneous data d × e and the missing sub-modal heterogeneous data k × h by low-rank representation; the tensor fusion module 4 then mathematically models, in turn, m × q with s × p, m × q with d × e, m × q with k × h, m × u with s × p, m × u with d × e and m × u with k × h, obtaining the common matrix mp × qs (from m × q and s × p), the common matrix me × qd (from m × q and d × e), the common matrix mh × qk (from m × q and k × h), the common matrix mp × us (from m × u and s × p), the common matrix me × ud (from m × u and d × e) and the common matrix mh × uk (from m × u and k × h);
then, the splicing module 5 first uses matlab's reshape function to adjust the common matrix mp × qs into the common matrix m × pqs, the common matrix me × qd into the common matrix m × eqd, the common matrix mh × qk into the common matrix m × hqk, the common matrix mp × us into the common matrix m × pus, the common matrix me × ud into the common matrix m × eud and the common matrix mh × uk into the common matrix m × huk; the splicing module 5 then splices the common matrices m × pqs, m × eqd, m × hqk, m × pus, m × eud and m × huk row by row and inputs the result into the LSTM network module 6. The LSTM network module 6 outputs an m × n two-dimensional fusion matrix, where n is smaller than the product of the two largest values among q, u, p, e, h, s, d and k, and n is larger than the largest value among q, u, p, e, h, s, d and k.
In the specific implementation of this embodiment, the judgment loop module 8 is further used to judge whether the tensor fusion module 4 has mathematically modeled the relationship between every complete sub-modal heterogeneous data and every missing sub-modal heterogeneous data; if so, the six common matrices output by the tensor fusion module 4 are input into the splicing module 5, and if not, control returns to the tensor fusion module 4 to continue the mathematical modeling until the six common matrices are obtained.
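An end-to-end sketch tying these modules together might look as follows; all class and method names, sizes and the LSTM hidden width are hypothetical, and reshaping each common matrix back to m rows before splicing follows the embodiment's m × pqs adjustment:

```python
import numpy as np
import torch
import torch.nn as nn

class TensorFusionLSTMPipeline:
    """Illustrative pipeline mirroring the modules of embodiment two;
    all names, sizes and the LSTM hidden width are hypothetical."""

    def __init__(self, hidden_size=32):
        self.hidden_size = hidden_size

    def preprocess(self, X):                       # preprocessing module 2
        return X[~np.isnan(X).any(axis=1)]         # drop rows with missing values

    def common_matrix(self, A, B):                 # tensor fusion module 4
        return np.kron(A, B.T)                     # (m*p) x (q*s) common matrix

    def splice(self, common, m):                   # splicing module 5
        return np.concatenate([Z.reshape(m, -1) for Z in common], axis=1)

    def fuse(self, complete, missing):             # judgment loop + LSTM module 6
        cleaned = [self.preprocess(B) for B in missing]
        common = [self.common_matrix(A, B)         # every complete/missing pair
                  for A in complete for B in cleaned]
        m = complete[0].shape[0]
        spliced = self.splice(common, m)
        lstm = nn.LSTM(input_size=spliced.shape[1],
                       hidden_size=self.hidden_size, batch_first=True)
        x = torch.tensor(spliced, dtype=torch.float32).unsqueeze(0)
        fused, _ = lstm(x)
        return fused.squeeze(0)                    # m x n fusion matrix

# Hypothetical complete (m x q, m x u) and missing (NaN-containing) data.
m = 6
complete = [np.random.rand(m, 4), np.random.rand(m, 5)]
missing = []
for cols in (3, 2, 4):
    X = np.random.rand(m, cols)
    X[0, 0] = np.nan                               # plant a missing value
    missing.append(X)
print(TensorFusionLSTMPipeline().fuse(complete, missing).shape)   # torch.Size([6, 32])
```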
In summary, the feature fusion method and system based on tensor fusion and the LSTM network make full use of the relationships among the sub-modalities of heterogeneous data, simulate the relationships among the sub-modal heterogeneous data by building a mathematical model, and obtain the inter-modality fusion matrix through the LSTM network, thereby effectively handling missing data in heterogeneous data.
Based on the above embodiments of the invention, any improvements and modifications made by those skilled in the art without departing from the principle of the invention shall fall within the protection scope of the invention.

Claims (10)

1. A feature fusion method based on tensor fusion and LSTM network is characterized in that the implementation content of the method comprises the following steps:
step S1, obtaining heterogeneous data, wherein the obtained heterogeneous data is called complete modal heterogeneous data;
step S2, splitting the complete modal heterogeneous data into A1+A2 sub-modal heterogeneous data, wherein each sub-modal heterogeneous data is a description of the same thing, A1 and A2 are natural numbers greater than 0, the A1 sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 sub-modal heterogeneous data each contain missing data;
step S3, preprocessing the A2 sub-modal heterogeneous data respectively: deleting all rows containing missing values in each sub-modal heterogeneous data, wherein the sub-modal heterogeneous data after data deletion is called missing sub-modal heterogeneous data;
step S4, extracting the features of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data respectively by low-rank representation, and then mathematically modeling the relationship between the feature-extracted complete sub-modal heterogeneous data and missing sub-modal heterogeneous data by tensor fusion to obtain their common matrices, A1×A2 in total;
and step S5, splicing the A1×A2 common matrices obtained in step S4, inputting the spliced result into the LSTM network, and having the LSTM network fuse it and output a fusion matrix.
2. The method for feature fusion based on tensor fusion and LSTM network as claimed in claim 1, wherein step S1 obtains heterogeneous data that has been processed into feature vectors.
3. The method for feature fusion based on tensor fusion and LSTM network as claimed in claim 2, wherein in step S3, matlab is used to find the row where the missing value is located in A2 sub-modal heterogeneous data by using the isnan function, and the rows containing the missing value in A2 sub-modal heterogeneous data are deleted to obtain A2 missing sub-modal heterogeneous data.
4. The method for feature fusion based on tensor fusion and LSTM network as claimed in claim 2, wherein when step S4 is executed, the relationship between complete sub-modal heterogeneous data and missing sub-modal heterogeneous data is mathematically modeled to obtain a common matrix of the two, and the specific operation steps of this process include:
step S4.1, introducing a linear function of the formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
s4.2, respectively changing the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalar quantities into vectors through the linear function of the formula (1) to obtain a matrix A representing the complete sub-modal heterogeneous data1,A2,…,AnAnd a matrix B representing missing sub-modal heterogeneous data1,B2,…,Bm
S4.3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step S4.4, cyclically executing steps S4.2 to S4.3 until A1×A2 common matrices are obtained.
5. The method for feature fusion based on tensor fusion and LSTM network as claimed in any of claims 2-4, wherein the specific operation content of outputting the fusion matrix when performing step S5 includes:
step S5.1, arranging the A1×A2 common matrices obtained in step S4 in modeling order;
S5.2, adjusting the A1×A2 common matrices with matlab's reshape function, and splicing the adjusted A1×A2 common matrices in order, row by row, using matlab;
S5.3, inputting the spliced result into the LSTM network; the LSTM network screens the important information in the spliced A1×A2 common matrices and outputs a two-dimensional fusion matrix, which contains all the feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data;
the number of rows of the two-dimensional fusion matrix equals the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and its number of columns is less than the largest product of the column counts of any two of the A1×A2 common matrices and greater than the column/row count of any one of the A1×A2 common matrices.
6. A feature fusion system based on tensor fusion and LSTM network is characterized by comprising the following structures:
the acquisition module is used for acquiring heterogeneous data processed into the feature vectors, and the acquired heterogeneous data is called as complete modal heterogeneous data;
the splitting module is used for splitting the complete modal heterogeneous data into A1+A2 sub-modal heterogeneous data; after splitting, each sub-modal heterogeneous data is a description of the same thing, wherein A1 and A2 are natural numbers greater than 0, the A1 sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 sub-modal heterogeneous data each contain missing data;
the preprocessing module is used for preprocessing A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data;
the characteristic extraction module is used for respectively extracting the characteristics of A1 complete sub-modal heterogeneous data and A2 missing sub-modal heterogeneous data by using low-rank representation;
the tensor fusion module is used for mathematically modeling the relationship between the feature-extracted complete sub-modal heterogeneous data and missing sub-modal heterogeneous data by tensor fusion to obtain their common matrices, A1×A2 in total;
the judgment loop module is used for judging whether the tensor fusion module has mathematically modeled the relationship between every complete sub-modal heterogeneous data and every missing sub-modal heterogeneous data; if so, the output of the tensor fusion module is input into the splicing module, and if not, control returns to the tensor fusion module to continue the mathematical modeling;
the splicing module is used for splicing the A1×A2 common matrices obtained by the tensor fusion module in modeling order;
and the LSTM network module is used for receiving the spliced matrix output by the splicing module and outputting a fusion matrix; the two-dimensional fusion matrix contains all the feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.
7. The system of claim 6, wherein the preprocessing module preprocesses the A2 sub-modal heterogeneous data by:
the preprocessing module firstly finds the row where the missing value is located in the A2 sub-modal heterogeneous data by using the isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data.
8. The system for feature fusion based on tensor fusion and LSTM network as claimed in claim 6, wherein the specific operation of the tensor fusion module to mathematically model the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain the commonality matrix is:
step 1, introducing a linear function of a formula (1),
y = ωx + b    formula (1)
Wherein, ω represents weight, x represents input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents bias, and y represents output vector value;
step 2, converting the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors respectively through the linear function of formula (1), obtaining matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;
Step 3, performing mathematical modeling on the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain their common matrix Zₗ:
Zₗ = Aᵢ ⊗ Bⱼ
wherein ⊗ denotes the tensor outer product; the common matrix Zₗ contains all the feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;
and step 4, cyclically executing steps 2 to 3 until A1×A2 common matrices are obtained.
9. The feature fusion system based on tensor fusion and an LSTM network as claimed in claim 6, 7 or 8, wherein the splicing module splices the A1×A2 common matrices as follows:
firstly, the A1×A2 common matrices obtained in step S4 are arranged in modeling order;
subsequently, the A1×A2 common matrices are adjusted using matlab's reshape function, and the adjusted A1×A2 common matrices are spliced row-wise using matlab.
10. The system for feature fusion based on tensor fusion and LSTM network as claimed in claim 9, wherein the LSTM network module receives the matrix obtained by ordered splicing, and filters important information of the matrix to output a two-dimensional fusion matrix;
the number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data;
the number of columns of the two-dimensional fusion matrix is less than the largest product of the column counts of any two of the A1×A2 common matrices, and greater than the column/row count of any one of the A1×A2 common matrices.
CN201911299573.8A 2019-12-17 2019-12-17 Feature fusion method and system based on tensor fusion and LSTM network Active CN111160426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911299573.8A CN111160426B (en) 2019-12-17 2019-12-17 Feature fusion method and system based on tensor fusion and LSTM network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911299573.8A CN111160426B (en) 2019-12-17 2019-12-17 Feature fusion method and system based on tensor fusion and LSTM network

Publications (2)

Publication Number Publication Date
CN111160426A true CN111160426A (en) 2020-05-15
CN111160426B CN111160426B (en) 2023-04-28

Family

ID=70557272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911299573.8A Active CN111160426B (en) 2019-12-17 2019-12-17 Feature fusion method and system based on tensor fusion and LSTM network

Country Status (1)

Country Link
CN (1) CN111160426B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107391619A (en) * 2017-07-05 2017-11-24 清华大学 For the adaptive hash method and device of imperfect isomeric data
JP2018128708A (en) * 2017-02-06 2018-08-16 日本電信電話株式会社 Tensor factor decomposition processing apparatus, tensor factor decomposition processing method and tensor factor decomposition processing program
CN109919366A (en) * 2019-02-22 2019-06-21 西南财经大学 Forecasting of Stock Prices method based on tensor and event-driven LSTM model
CN110414788A (en) * 2019-06-25 2019-11-05 国网上海市电力公司 A kind of power quality prediction technique based on similar day and improvement LSTM

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018128708A (en) * 2017-02-06 2018-08-16 日本電信電話株式会社 Tensor factor decomposition processing apparatus, tensor factor decomposition processing method and tensor factor decomposition processing program
CN107391619A (en) * 2017-07-05 2017-11-24 清华大学 For the adaptive hash method and device of imperfect isomeric data
CN109919366A (en) * 2019-02-22 2019-06-21 西南财经大学 Forecasting of Stock Prices method based on tensor and event-driven LSTM model
CN110414788A (en) * 2019-06-25 2019-11-05 国网上海市电力公司 A kind of power quality prediction technique based on similar day and improvement LSTM

Also Published As

Publication number Publication date
CN111160426B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110298361B (en) Semantic segmentation method and system for RGB-D image
CN108764292B (en) Deep learning image target mapping and positioning method based on weak supervision information
AU2018236433B2 (en) Room layout estimation methods and techniques
CN108664981B (en) Salient image extraction method and device
CN108985317B (en) Image classification method based on separable convolution and attention mechanism
CN108154194B (en) Method for extracting high-dimensional features by using tensor-based convolutional network
CN107122796B (en) A kind of remote sensing image classification method based on multiple-limb network integration model
KR20190062305A (en) Method and apparatus for performing operation of convolutional layer in convolutional neural network
EP3523751A1 (en) Efficient data layouts for convolutional neural networks
KR102129895B1 (en) Method and apparatus for performing convolution operation on folded feature date
CN107103285A (en) Face depth prediction approach based on convolutional neural networks
JP2022502762A (en) Neural network search methods, devices, processors, electronic devices, storage media and computer programs
CN112784765B (en) Method, apparatus, device and storage medium for recognizing motion
CN110503149B (en) Method and system for classifying local features in image
CN114066718A (en) Image style migration method and device, storage medium and terminal
CN114332889A (en) Text box ordering method and text box ordering device for text image
CN111160426A (en) Feature fusion method and system based on tensor fusion and LSTM network
Guo et al. An improved YOLO v4 used for grape detection in unstructured environment
US11900577B2 (en) Processing apparatus for performing processing using a convolutional neural network
CN111985617A (en) Processing method and device of 3D convolutional neural network on neural network processor
CN111967478A (en) Feature map reconstruction method and system based on weight inversion, storage medium and terminal
CN109344371B (en) Header generation method and device
KR101106448B1 (en) Real-Time Moving Object Detection For Intelligent Visual Surveillance
CN111739062B (en) Target detection method and system based on feedback mechanism
CN110930290B (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501

Patentee after: Qilu University of Technology (Shandong Academy of Sciences)

Country or region after: China

Address before: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501

Patentee before: Qilu University of Technology

Country or region before: China