CN111160426B

CN111160426B - Feature fusion method and system based on tensor fusion and LSTM network

Info

Publication number: CN111160426B
Application number: CN201911299573.8A
Authority: CN
Inventors: 董爱美; 李志刚
Original assignee: Qilu University of Technology
Current assignee: Qilu University of Technology
Priority date: 2019-12-17
Filing date: 2019-12-17
Publication date: 2023-04-28
Anticipated expiration: 2039-12-17
Also published as: CN111160426A

Abstract

The invention discloses a feature fusion method and system based on tensor fusion and an LSTM network, relating to the technical field of heterogeneous data. The feature fusion method comprises the following steps: acquiring complete modal heterogeneous data, splitting it into A1 complete sub-modal heterogeneous data without missing data and A2 sub-modal heterogeneous data with missing data, and preprocessing the A2 sub-modal heterogeneous data into A2 missing sub-modal heterogeneous data; extracting features of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data by using low-rank representation, and mathematically modeling the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data by using tensor fusion to obtain commonality matrices of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the total number of commonality matrices being A1×A2; and splicing the A1×A2 commonality matrices, inputting the spliced matrix into an LSTM network, and outputting the fusion matrix from the LSTM network. The invention avoids the influence of the data errors introduced by completing the missing sub-modal heterogeneous data, and effectively addresses the problem of missing data in heterogeneous data.

Description

Feature fusion method and system based on tensor fusion and LSTM network

Technical Field

The invention relates to the technical field of heterogeneous data, in particular to a feature fusion method based on tensor fusion and an LSTM network.

Background

Classifying heterogeneous data by exploiting its different low-level feature representations that share high-level semantics, and completing the missing values of heterogeneous data, have long been important research approaches in heterogeneous data processing. For various reasons, heterogeneous data often suffers from missing data. Although the missing sub-modal heterogeneous data can be completed by various methods, the low-level representations of heterogeneous data have different characteristics, so an error remains between the completed values and the true values.

Disclosure of Invention

In view of the needs and shortcomings of the prior art, the invention provides a feature fusion method and system based on tensor fusion and an LSTM network, which does not need to complete the missing sub-modal heterogeneous data, uses the high-level semantic association features between the complete modal heterogeneous data and the missing sub-modal heterogeneous data, and effectively addresses the problem of missing data in heterogeneous data.

Firstly, the invention provides a feature fusion method based on tensor fusion and LSTM network, which solves the technical problems as follows:

a feature fusion method based on tensor fusion and LSTM network, the implementation content of the method includes:

Step S1, acquiring heterogeneous data, wherein the acquired heterogeneous data is called complete modal heterogeneous data;

Step S2, splitting the complete modal heterogeneous data into A1+A2 pieces of sub-modal heterogeneous data, wherein each piece of sub-modal heterogeneous data comprises all descriptions of the same thing, A1 and A2 are natural numbers larger than 0, the A1 pieces of sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 pieces of sub-modal heterogeneous data each contain missing data;

Step S3, preprocessing the A2 pieces of sub-modal heterogeneous data respectively: deleting all rows containing missing values in each piece of sub-modal heterogeneous data, wherein the sub-modal heterogeneous data after deletion is called missing sub-modal heterogeneous data;

Step S4, extracting the features of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data respectively by using low-rank representation, and then mathematically modeling the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction by using tensor fusion, to obtain commonality matrices of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the total number of commonality matrices being A1×A2;

Step S5, splicing the A1×A2 commonality matrices obtained in step S4, inputting the spliced matrix into an LSTM network, fusing the spliced commonality matrices by the LSTM network, and outputting a fusion matrix.

Heterogeneous data that has been processed into feature vectors is acquired in step S1.

When step S3 is executed, matlab is used: the isnan function finds the rows where the missing values are located in the A2 pieces of sub-modal heterogeneous data, and the rows containing missing values are deleted, yielding the A2 missing sub-modal heterogeneous data.
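The row deletion of step S3 can be sketched as follows. This is a minimal illustration in Python rather than matlab; `math.isnan` stands in for the isnan function named above, and the function name `drop_rows_with_missing` and the sample values are hypothetical.

```python
import math


def drop_rows_with_missing(rows):
    """Keep only the rows that contain no NaN entry (assumed missing-value marker)."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]


nan = float("nan")

# Hypothetical sub-modal heterogeneous data with m = 4 rows, two of them incomplete.
sub_modal = [
    [0.2, 1.5, 3.1],
    [nan, 0.7, 2.2],
    [1.1, 0.4, nan],
    [0.9, 0.3, 1.8],
]

# The result has fewer rows than the input, matching s < m in the embodiments.
missing_sub_modal = drop_rows_with_missing(sub_modal)
```

Only rows are removed; no value is imputed, which is the point of the preprocessing step.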

When the step S4 is executed, a common matrix of the complete sub-mode heterogeneous data and the missing sub-mode heterogeneous data is obtained, and the specific operation steps of the process comprise:

Step S4.1, introducing the linear function of formula (1),

y = ωx + b        (1)

wherein ω represents the weight, x represents the input complete sub-modal heterogeneous data or missing sub-modal heterogeneous data, b represents the bias, and y represents the output vector value;

Step S4.2, converting the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors through the linear function of formula (1), to obtain matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;

Step S4.3, mathematically modeling the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain a commonality matrix Z^l,

the commonality matrix Z^l containing all feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;

Step S4.4, repeating steps S4.2 to S4.3 until A1×A2 commonality matrices are obtained.
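Steps S4.1 to S4.4 can be illustrated with a small sketch. It is a simplification under stated assumptions: each sub-modality is reduced to a single feature vector, ω and b are scalars with arbitrary values, and the names `linear`, `outer`, `complete` and `missing` are hypothetical. It shows only the pairing pattern (every complete sub-modality with every missing one) and the outer product, not the method's actual parameters.

```python
from itertools import product


def linear(x, w, b):
    """Formula (1), y = w*x + b, applied element-wise to a feature vector."""
    return [w * v + b for v in x]


def outer(u, v):
    """Outer product of two vectors: result[i][j] = u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]


# Hypothetical data: A1 = 2 complete and A2 = 3 missing sub-modalities.
complete = [[1.0, 2.0], [0.5, 1.5]]
missing = [[1.0, 0.0, 2.0], [3.0, 1.0, 1.0], [2.0, 2.0, 0.0]]
w, b = 2.0, 1.0  # assumed weight and bias

# Step S4.4: repeat S4.2 and S4.3 for every (complete, missing) pair.
commonality = {}
for (i, a), (j, m) in product(enumerate(complete), enumerate(missing)):
    commonality[(i, j)] = outer(linear(a, w, b), linear(m, w, b))
```

With A1 = 2 and A2 = 3 the loop produces 2 × 3 = 6 matrices, one per pair.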

When executing step S5, the specific operation content of outputting the fusion matrix includes:

Step S5.1, arranging the A1×A2 commonality matrices obtained in step S4 in modeling order;

Step S5.2, adjusting the A1×A2 commonality matrices by using the reshape function of matlab, and splicing the adjusted A1×A2 commonality matrices by row using matlab;

Step S5.3, inputting the spliced matrix into an LSTM network, the LSTM network screening the important information in the spliced A1×A2 commonality matrices and outputting a two-dimensional fusion matrix, wherein the two-dimensional fusion matrix contains all feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.

Specifically, the number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two of the A1×A2 commonality matrices and larger than the number of columns/rows of any one of the A1×A2 commonality matrices.
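The adjust-and-splice part of step S5 can be sketched as follows. This is an illustration only: `reshape_rows` and `splice` are hypothetical helpers, the regrouping is row-major (matlab's reshape fills column by column, so element order would differ there), and "splicing by row" is assumed to mean placing matrices with the same number of rows side by side.

```python
def reshape_rows(matrix, n_rows):
    """Flatten row by row, then regroup into n_rows rows of equal length."""
    flat = [v for row in matrix for v in row]
    if len(flat) % n_rows:
        raise ValueError("element count is not divisible by n_rows")
    width = len(flat) // n_rows
    return [flat[r * width:(r + 1) * width] for r in range(n_rows)]


def splice(matrices):
    """Concatenate matrices that share a row count, extending each row in order."""
    return [sum((m[r] for m in matrices), []) for r in range(len(matrices[0]))]


# Two hypothetical commonality matrices, adjusted so that both have m = 2 rows.
z1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]  # 4 x 3
z2 = [[1, 2], [3, 4], [5, 6], [7, 8]]                 # 4 x 2
adjusted = [reshape_rows(z1, 2), reshape_rows(z2, 2)]  # 2 x 6 and 2 x 4
spliced = splice(adjusted)                             # 2 x 10
```

The spliced matrix keeps the row count m, so it can be fed to the LSTM network one row per sample.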

Secondly, the invention provides a feature fusion system based on tensor fusion and LSTM network, which solves the technical problems as follows:

a feature fusion system based on tensor fusion and LSTM network comprises the following structures:

the acquisition module is used for acquiring heterogeneous data which is processed into feature vectors, and the acquired heterogeneous data is called complete modal heterogeneous data;

the splitting module is used for splitting the complete modal heterogeneous data into A1+A2 pieces of sub-modal heterogeneous data, wherein after splitting each piece of sub-modal heterogeneous data comprises all descriptions of the same thing, A1 and A2 are natural numbers larger than 0, the A1 pieces of sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 pieces of sub-modal heterogeneous data each contain missing data;

the preprocessing module is used for preprocessing the A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data;

the feature extraction module is used for respectively extracting features of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data by using low-rank representation;

the tensor fusion module is used for mathematically modeling the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction by using tensor fusion, to obtain commonality matrices of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the total number of commonality matrices being A1×A2;

the judging and circulating module is used for judging whether the tensor fusion module carries out mathematical modeling on the relation between any complete sub-modal heterogeneous data and any missing sub-modal heterogeneous data, if so, inputting the output result of the tensor fusion module into the splicing module, and if not, returning to the tensor fusion module to continue mathematical modeling;

the splicing module is used for splicing the A1×A2 commonality matrices obtained by the tensor fusion module in modeling order;

the LSTM network module is used for receiving the spliced matrix output by the splicing module and outputting a two-dimensional fusion matrix, wherein the two-dimensional fusion matrix contains all feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.
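The check performed by the judging and circulating module can be pictured with a short sketch. It assumes the module tracks which (complete, missing) index pairs have already been modeled; the function name `pending_pairs` and the placeholder used in place of the tensor fusion output are hypothetical.

```python
from itertools import product


def pending_pairs(n_complete, n_missing, done):
    """Return the (complete, missing) index pairs that have not been modeled yet."""
    return [pair for pair in product(range(n_complete), range(n_missing))
            if pair not in done]


done = set()
commonality = []
while True:
    todo = pending_pairs(2, 3, done)  # hypothetical A1 = 2, A2 = 3
    if not todo:
        break  # every pair is modeled: hand the results to the splicing module
    pair = todo[0]
    commonality.append(pair)  # placeholder for the tensor fusion module's output
    done.add(pair)
```

The loop ends only when all A1×A2 pairs are covered, and the results stay in modeling order.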

Optionally, the related preprocessing module preprocesses the A2 sub-modal heterogeneous data, and specifically comprises the following operations:

the preprocessing module firstly finds the row where the missing value in the A2 sub-modal heterogeneous data is located by using the isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain the A2 missing sub-modal heterogeneous data.

Optionally, the related tensor fusion module performs mathematical modeling on the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain a common matrix, which comprises the following specific operations:

Step 1, introducing the linear function of formula (1),

y = ωx + b        (1)

Step 2, converting the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors through the linear function of formula (1), to obtain matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;

Step 3, mathematically modeling the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using the tensor outer product to obtain a commonality matrix Z^l;

Step 4, repeating steps 2 to 3 until A1×A2 commonality matrices are obtained.

Optionally, the splicing module splices the A1×A2 commonality matrices as follows:

first, arranging the A1×A2 commonality matrices obtained in step S4 in modeling order;

then adjusting the A1×A2 commonality matrices by using the reshape function of matlab, and splicing the adjusted A1×A2 commonality matrices by row using matlab.

Optionally, the LSTM network module receives the matrix obtained by ordered splicing, screens the important information of the matrix, and outputs a two-dimensional fusion matrix. The number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two of the A1×A2 commonality matrices and larger than the number of columns/rows of any one of the A1×A2 commonality matrices.
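How an LSTM maps a spliced matrix with m rows to a fusion matrix with m rows and n columns can be sketched as follows. This is a generic, untrained LSTM forward pass, not the network of the invention: the weights are random, each row of the spliced matrix is assumed to be one time step, the hidden size plays the role of n, and the names `affine` and `lstm_forward` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for one gate."""
    return [
        sum(wr[k] * x[k] for k in range(len(x)))
        + sum(ur[k] * h[k] for k in range(len(h)))
        + bi
        for wr, ur, bi in zip(W, U, b)
    ]


def lstm_forward(rows, n, seed=0):
    """Run an untrained LSTM over the rows; return one n-dimensional state per row."""
    rng = random.Random(seed)
    d = len(rows[0])

    def mat(r, c):
        return [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]

    gates = "fiog"  # forget, input, output, candidate
    W = {g: mat(n, d) for g in gates}
    U = {g: mat(n, n) for g in gates}
    b = {g: [0.0] * n for g in gates}
    h, c, out = [0.0] * n, [0.0] * n, []
    for x in rows:
        f = [sigmoid(z) for z in affine(W["f"], x, U["f"], h, b["f"])]
        i = [sigmoid(z) for z in affine(W["i"], x, U["i"], h, b["i"])]
        o = [sigmoid(z) for z in affine(W["o"], x, U["o"], h, b["o"])]
        g = [math.tanh(z) for z in affine(W["g"], x, U["g"], h, b["g"])]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        out.append(h)
    return out


# Hypothetical spliced matrix with m = 3 rows and 5 columns, fused to n = 4 columns.
spliced = [[0.1, 0.2, 0.3, 0.4, 0.5], [0.5, 0.4, 0.3, 0.2, 0.1], [1.0, 0.0, 1.0, 0.0, 1.0]]
fused = lstm_forward(spliced, 4)
```

The output keeps the row count of the input, which matches the stated row property of the two-dimensional fusion matrix; choosing n within the stated column bounds is left to the caller.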

The feature fusion method and system based on tensor fusion and LSTM network has the beneficial effects compared with the prior art that:

1) The invention makes full use of the relations among all sub-modalities of the heterogeneous data, models these relations by establishing a mathematical model, and obtains the fusion matrix among the sub-modalities through an LSTM (long short-term memory) network, thereby avoiding the influence of the data errors introduced by completing the missing sub-modal heterogeneous data;

2) The invention does not need to complete the missing sub-modal heterogeneous data; it uses the high-level semantic association features between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, and thus effectively addresses the problem of missing data in heterogeneous data.

Drawings

FIG. 1 is a flow chart of a method according to a first embodiment of the invention;

FIG. 2 is a block diagram of the structural connections of the second embodiment of the invention.

The reference numerals in the drawings represent:

1. acquisition module, 2. preprocessing module, 3. feature extraction module, 4. tensor fusion module,

5. splicing module, 6. LSTM network module, 7. splitting module, 8. judging and circulating module.

Detailed Description

In order to make the technical scheme, the technical problems to be solved and the technical effects of the invention more clear, the technical scheme of the invention is clearly and completely described below by combining specific embodiments.

For the following two embodiments, it should be noted that:

in the embodiments, the "m×y", "m×q", "m×u", "m×p", "m×e", "m×h", "s×p", "d×e", "k×h" respectively represent feature vectors of data;

if the obtained heterogeneous data is a video segment, which contains three types of information of voice, text and image, the voice is usually processed into feature vectors by adopting an LSTM, the text is processed into feature vectors by adopting a word2vec mode, and the image is processed into feature vectors by adopting a CNN.

Embodiment one:

with reference to fig. 1, this embodiment proposes a feature fusion method based on tensor fusion and LSTM network, where implementation content of the method includes:

Step S1, acquiring heterogeneous data that has been processed into feature vectors, wherein the acquired heterogeneous data is called complete modal heterogeneous data. In this embodiment, it is assumed that complete modal heterogeneous data m×y is acquired.

Step S2, splitting the complete modal heterogeneous data into five pieces of sub-modal heterogeneous data, wherein each piece of sub-modal heterogeneous data comprises all descriptions of the same thing, two pieces contain no missing data and are called complete sub-modal heterogeneous data, and three pieces each contain missing data. In this embodiment, it is assumed that the complete modal heterogeneous data m×y is split into the complete sub-modal heterogeneous data m×q and m×u, and the sub-modal heterogeneous data m×p, m×e and m×h containing missing data.

Step S3, preprocessing the three pieces of sub-modal heterogeneous data containing missing data respectively: using matlab, the isnan function finds the rows where the missing values are located in each of the three pieces, and the rows containing missing values are deleted to obtain three missing sub-modal heterogeneous data. In this embodiment, rows are deleted from the sub-modal heterogeneous data m×p to obtain the missing sub-modal heterogeneous data s×p, from the sub-modal heterogeneous data m×e to obtain the missing sub-modal heterogeneous data d×e, and from the sub-modal heterogeneous data m×h to obtain the missing sub-modal heterogeneous data k×h, so that s < m, d < m, and k < m.

Step S4, extracting the features of the two complete sub-modal heterogeneous data and the three missing sub-modal heterogeneous data respectively by using low-rank representation, and then mathematically modeling the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction by using tensor fusion, to obtain commonality matrices of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the total number of commonality matrices being six. The specific operation steps of the process comprise:

Step S4.1, introducing the linear function of formula (1),

y = ωx + b        (1)

The commonality matrix Z^l contains all feature information of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data;

Step S4.4, repeating steps S4.2 to S4.3 until six commonality matrices are obtained.

In this embodiment, processing the complete sub-modal heterogeneous data m×q and the missing sub-modal heterogeneous data s×p through steps S4.2 to S4.3 yields the commonality matrix mp×qs; processing m×q and the missing sub-modal heterogeneous data d×e yields the commonality matrix me×qd; processing m×q and the missing sub-modal heterogeneous data k×h yields the commonality matrix mh×qk; processing the complete sub-modal heterogeneous data m×u and s×p yields the commonality matrix mp×us; processing m×u and d×e yields the commonality matrix me×ud; and processing m×u and k×h yields the commonality matrix mh×uk.
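The shape bookkeeping in this embodiment can be checked with a small sketch. The text names the result of pairing m×q with s×p as mp×qs but does not spell out the operator at this point; the Kronecker product of the first matrix with the transpose of the second is one reading that reproduces that shape, and it is used here purely as an assumption. The helpers `kron`, `transpose` and `shape`, and the sample sizes, are hypothetical.

```python
def kron(A, B):
    """Kronecker product: an (a x b) and a (c x d) matrix give an (ac x bd) matrix."""
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in A
        for b_row in B
    ]


def transpose(M):
    return [list(col) for col in zip(*M)]


def shape(M):
    return len(M), len(M[0])


# Hypothetical sizes: complete data m x q, missing data s x p with s < m.
m, q, s, p = 4, 2, 3, 2
A = [[1.0] * q for _ in range(m)]
B = [[2.0] * p for _ in range(s)]

Z = kron(A, transpose(B))  # (m x q) with (p x s) gives (m*p) x (q*s)
rows, cols = shape(Z)
```

The element count m·p·q·s is unchanged when such a matrix is later adjusted to m rows, which gives p·q·s columns.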

Step S5, splicing the six commonality matrices obtained in step S4, inputting the spliced matrix into an LSTM network, fusing the spliced commonality matrices by the LSTM network, and outputting the fusion matrix. The specific operations of this process comprise:

Step S5.1, arranging the six commonality matrices obtained in step S4 in modeling order;

Step S5.2, adjusting the six commonality matrices by using the reshape function of matlab, and splicing the adjusted six commonality matrices by row using matlab;

Step S5.3, inputting the spliced matrix into an LSTM network, the LSTM network screening the important information in the six spliced commonality matrices and outputting a two-dimensional fusion matrix that contains all feature information of the two complete sub-modal heterogeneous data and the three missing sub-modal heterogeneous data. The number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two of the six commonality matrices and larger than the number of columns/rows of any one of the six commonality matrices.

In this embodiment, the commonality matrices mp×qs, me×qd, mh×qk, mp×us, me×ud and mh×uk are arranged in sequence; the commonality matrix mp×qs is adjusted to the commonality matrix m×pqs, me×qd to m×eqd, mh×qk to m×hqk, mp×us to m×pus, me×ud to m×eud, and mh×uk to m×huk; the commonality matrices m×pqs, m×eqd, m×hqk, m×pus, m×eud and m×huk are then spliced in sequence by row and input to the LSTM network, which outputs a two-dimensional fusion matrix m×n, where n is smaller than the product of the two largest numbers among q, u, p, e, h, s, d, k and larger than the maximum of q, u, p, e, h, s, d, k.
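The constraint on the column count n of the fusion matrix in this embodiment (larger than the maximum of q, u, p, e, h, s, d, k and smaller than the product of the two largest of them) can be written as a small check. The function names and the numeric values below are hypothetical and only illustrate the inequality.

```python
def column_bounds(dims):
    """Return (lower, upper); a valid n satisfies lower < n < upper."""
    ordered = sorted(dims, reverse=True)
    return ordered[0], ordered[0] * ordered[1]


def is_valid_fusion_width(n, dims):
    lower, upper = column_bounds(dims)
    return lower < n < upper


# Hypothetical dimensions for the embodiment's symbols.
dims = {"q": 6, "u": 5, "p": 4, "e": 3, "h": 7, "s": 8, "d": 9, "k": 2}
lower, upper = column_bounds(dims.values())
```

With these sample values the two largest numbers are 9 and 8, so any n strictly between 9 and 72 satisfies the constraint.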

Embodiment two:

with reference to fig. 2, this embodiment proposes a feature fusion system based on tensor fusion and LSTM network, where the structure includes:

the acquisition module 1 is used for acquiring heterogeneous data which is processed into feature vectors, and the acquired heterogeneous data is called complete modal heterogeneous data;

the splitting module 7 is configured to split the complete modal heterogeneous data into A1+A2 pieces of sub-modal heterogeneous data, where after splitting each piece of sub-modal heterogeneous data includes all descriptions of the same thing, A1 and A2 are natural numbers greater than 0, the A1 pieces of sub-modal heterogeneous data contain no missing data and are called complete sub-modal heterogeneous data, and the A2 pieces of sub-modal heterogeneous data each contain missing data;

the preprocessing module 2 is used for preprocessing the A2 sub-modal heterogeneous data to obtain A2 missing sub-modal heterogeneous data;

the feature extraction module 3 is used for respectively extracting features of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data by using low-rank representation;

the tensor fusion module 4 is used for mathematically modeling the relation between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction by using tensor fusion, to obtain commonality matrices of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the total number of commonality matrices being A1×A2;

the judging and circulating module 8 is configured to judge whether the tensor fusion module 4 performs mathematical modeling on the relationship between any complete sub-modal heterogeneous data and any missing sub-modal heterogeneous data, if yes, input the output result of the tensor fusion module 4 into the splicing module 5, and if no, return to the tensor fusion module 4 to continue performing mathematical modeling;

the splicing module 5 is used for splicing the A1×A2 commonality matrices obtained by the tensor fusion module 4 in modeling order;

the LSTM network module 6 is configured to receive the spliced matrix output by the splicing module 5 and output a two-dimensional fusion matrix, where the two-dimensional fusion matrix contains all feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data.

In this embodiment, the preprocessing module 2 preprocesses the A2 sub-modal heterogeneous data, which specifically includes:

the preprocessing module 2 firstly finds the row where the missing value in the A2 sub-modal heterogeneous data is located by using the isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain the A2 missing sub-modal heterogeneous data.

In this embodiment, the related tensor fusion module 4 performs mathematical modeling on the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain the common matrix, which specifically includes:

Step 1, introducing the linear function of formula (1),

y = ωx + b        (1)

Step 2, converting the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data from scalars into vectors through the linear function of formula (1), to obtain matrices A₁, A₂, …, Aₙ representing the complete sub-modal heterogeneous data and matrices B₁, B₂, …, Bₘ representing the missing sub-modal heterogeneous data;

In this embodiment, the splicing module 5 splices the A1×A2 commonality matrices as follows:

In this embodiment, the LSTM network module 6 receives the matrix obtained by ordered splicing, screens the important information of the matrix, and outputs a two-dimensional fusion matrix. The number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two of the A1×A2 commonality matrices and larger than the number of columns/rows of any one of the A1×A2 commonality matrices.

The embodiment is specifically implemented as follows:

first, assume that the acquisition module 1 acquires complete modal heterogeneous data m×y, and that the splitting module 7 splits the complete modal heterogeneous data m×y into five pieces of sub-modal heterogeneous data, of which two contain no missing data, namely the complete sub-modal heterogeneous data m×q and m×u, and the remaining three contain missing data, namely the sub-modal heterogeneous data m×p, m×e and m×h;

then, the preprocessing module 2 deletes the row from the sub-mode heterogeneous data m×p containing the missing data to obtain the missing sub-mode heterogeneous data s×p, deletes the row from the sub-mode heterogeneous data m×e containing the missing data to obtain the missing sub-mode heterogeneous data d×e, deletes the row from the sub-mode heterogeneous data m×h containing the missing data to obtain the missing sub-mode heterogeneous data k×h, so that s < m, d < m, k < m;

subsequently, the feature extraction module 3 extracts features of the complete sub-modal heterogeneous data m×q and m×u and of the missing sub-modal heterogeneous data s×p, d×e and k×h by using low-rank representation, and the tensor fusion module 4 mathematically models each of the six pairs (m×q, s×p), (m×q, d×e), (m×q, k×h), (m×u, s×p), (m×u, d×e) and (m×u, k×h), obtaining the commonality matrix mp×qs of the complete sub-modal heterogeneous data m×q and the missing sub-modal heterogeneous data s×p, the commonality matrix me×qd of m×q and d×e, the commonality matrix mh×qk of m×q and k×h, the commonality matrix mp×us of the complete sub-modal heterogeneous data m×u and s×p, the commonality matrix me×ud of m×u and d×e, and the commonality matrix mh×uk of m×u and k×h;

then, the splicing module 5 first adjusts the commonality matrix mp×qs to the commonality matrix m×pqs, me×qd to m×eqd, mh×qk to m×hqk, mp×us to m×pus, me×ud to m×eud, and mh×uk to m×huk; the splicing module 5 then splices the commonality matrices m×pqs, m×eqd, m×hqk, m×pus, m×eud and m×huk in sequence by row, and inputs the spliced commonality matrix to the LSTM network module 6. The LSTM network module 6 outputs a two-dimensional fusion matrix m×n, where n is smaller than the product of the two largest numbers among q, u, p, e, h, s, d, k and larger than the maximum of q, u, p, e, h, s, d, k.

In the process of implementing the embodiment, the judging and circulating module 8 is also used for judging whether the tensor fusion module 4 carries out mathematical modeling on the relation between any complete sub-modal heterogeneous data and any missing sub-modal heterogeneous data, if so, six common matrixes output by the tensor fusion module 4 are input into the splicing module 5, and if not, the tensor fusion module 4 is returned to continue mathematical modeling until six common matrixes are obtained.

In summary, by adopting the feature fusion method and system based on tensor fusion and LSTM network, the relation among all sub-modes of heterogeneous data is fully utilized, the relation among all sub-modes of heterogeneous data is simulated by establishing a mathematical model, and the fusion matrix among the sub-modes is obtained through the LSTM network, so that the situation of data missing phenomenon in the heterogeneous data is effectively solved.

Based on the above-mentioned embodiments of the present invention, any improvements and modifications made by those skilled in the art without departing from the principles of the present invention should fall within the scope of the present invention.

Claims

1. A feature fusion method based on tensor fusion and LSTM network is characterized in that the implementation content of the method comprises the following steps:

2. The method of claim 1, wherein heterogeneous data processed into feature vectors is acquired in step S1.

3. The feature fusion method based on tensor fusion and LSTM network according to claim 2, wherein in step S3, the isnan function of MATLAB is used to find the rows in which missing values are located in the A2 sub-modal heterogeneous data, and the rows containing missing values are deleted from the A2 sub-modal heterogeneous data to obtain the A2 missing sub-modal heterogeneous data.
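For illustration only, the row deletion described in claim 3 can be sketched in Python, with `math.isnan` standing in for MATLAB's isnan (in MATLAB itself, an expression such as `X(any(isnan(X), 2), :) = []` would typically do the same). The function name `drop_rows_with_missing` and the sample data are hypothetical, and the sketch assumes that missing values are encoded as NaN.

```python
import math


def drop_rows_with_missing(rows):
    """Return only the rows that contain no NaN entries."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]


nan = float("nan")
sub_modal = [
    [1.0, 2.0],
    [nan, 4.0],  # contains a missing value, so this row is removed
    [5.0, 6.0],
]
print(drop_rows_with_missing(sub_modal))  # [[1.0, 2.0], [5.0, 6.0]]
```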

4. The feature fusion method based on tensor fusion and LSTM network according to claim 2, wherein, when step S4 is executed, the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data is mathematically modeled to obtain a commonality matrix of the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data, the specific operation steps comprising:

step S4.1, introducing a linear function of the formula (1),

y = ωx + b    (1)

5. The method of any one of claims 2-4, wherein the specific operations for outputting the fusion matrix when executing step S5 comprise:

step S5.3, inputting the spliced matrix into an LSTM network, the LSTM network screening important information in the spliced A1×A2 commonality matrices and outputting a two-dimensional fusion matrix, wherein the two-dimensional fusion matrix comprises all feature information of the A1 complete sub-modal heterogeneous data and the A2 missing sub-modal heterogeneous data;

the number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data, and the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two commonality matrices among the A1×A2 commonality matrices and larger than the number of columns/rows of any commonality matrix among the A1×A2 commonality matrices.

6. A feature fusion system based on tensor fusion and LSTM network, characterized in that the structure thereof comprises:

the LSTM network module is used for receiving the splicing matrix output by the splicing module and outputting a two-dimensional fusion matrix, wherein the two-dimensional fusion matrix comprises all characteristic information of A1 complete sub-modal heterogeneous data and A2 missing sub-modal heterogeneous data.

7. The feature fusion system based on tensor fusion and LSTM network of claim 6, wherein the preprocessing module performs preprocessing on the A2 sub-modal heterogeneous data, specifically:

the preprocessing module firstly finds a row where a missing value is located in the A2 sub-modal heterogeneous data by using an isnan function of matlab, and then deletes the row containing the missing value in the A2 sub-modal heterogeneous data to obtain the A2 missing sub-modal heterogeneous data.

8. The feature fusion system based on tensor fusion and LSTM network of claim 6, wherein the specific operation of the tensor fusion module for mathematically modeling the relationship between the complete sub-modal heterogeneous data and the missing sub-modal heterogeneous data after feature extraction to obtain the commonality matrix is as follows:

step 1, introducing a linear function of the formula (1),

y = ωx + b    (1)

step 3, mathematically modeling the matrix of the complete sub-modal heterogeneous data and the matrix of the missing sub-modal heterogeneous data by using a tensor outer product to obtain a commonality matrix Z^l,

9. The feature fusion system based on tensor fusion and LSTM network according to claim 6, 7 or 8, wherein the specific operation of the splicing module for splicing the A1×A2 commonality matrices is:

10. The feature fusion system based on tensor fusion and LSTM network as set forth in claim 9, wherein said LSTM network module receives the matrix obtained by ordered splicing, screens the important information of the matrix, and outputs a two-dimensional fusion matrix;

the number of rows of the two-dimensional fusion matrix is equal to the number of rows of the complete modal heterogeneous data/sub-modal heterogeneous data;

the number of columns of the two-dimensional fusion matrix is smaller than the maximum product of the numbers of columns of any two commonality matrices among the A1×A2 commonality matrices and larger than the number of columns/rows of any commonality matrix among the A1×A2 commonality matrices.