CN112528873A - Signal semantic recognition method based on multi-stage semantic representation and semantic calculation - Google Patents


Info

Publication number
CN112528873A
CN112528873A (application CN202011476389.9A)
Authority
CN
China
Prior art keywords
semantic
network
signal
equal
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011476389.9A
Other languages
Chinese (zh)
Other versions
CN112528873B (en)
Inventor
Shi Guangming
Yang Minxi
Gao Dahua
Xie Xuemei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202011476389.9A
Publication of CN112528873A
Application granted
Publication of CN112528873B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00: Aspects of pattern recognition specially adapted for signal processing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/088: Non-supervised learning, e.g. competitive learning

Abstract

The invention provides a signal semantic recognition method based on multi-level semantic representation and semantic computation, which mainly addresses the poor interpretability and weak generalization of signal semantic recognition in the prior art. The implementation scheme is as follows: obtain a training set and a test set; construct a signal semantic recognition network consisting of a cascaded multi-level semantic representation network and a semantic computation network, so as to produce a learnable multi-level semantic representation of the signal and compute the signal's semantic category from that representation; set a semantic representation loss function and a cross-entropy loss function, and train the multi-level semantic representation network and the semantic computation network in sequence to obtain a trained signal semantic recognition network; and obtain the semantic recognition result of the signal to be recognized from the trained network. The invention effectively improves the interpretability and generalization of signal semantic recognition. The method can be used for human-computer interaction and semantic information retrieval.

Description

Signal semantic recognition method based on multi-stage semantic representation and semantic calculation
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a signal semantic recognition method that can be used for human-computer interaction and semantic information retrieval.
Background
Signal semantic recognition refers to determining the semantic category to which a signal belongs according to the characteristics of the signal.
Before deep learning was applied, researchers generally performed semantic recognition of signals with methods such as bag-of-words feature matching or random forests. Bag-of-words feature matching first represents each semantic category by several feature sets obtained through feature engineering, then determines the semantic category by computing the overall similarity between a sample signal and those feature sets. A random forest first has several decision trees independently predict the semantic category of the sample signal, then votes over all predictions and elects the category with the most votes as the final result. However, both approaches require the user to design a large number of hand-crafted features using expertise in the target task's domain, which is time-consuming, labor-intensive, and hard to carry out. Moreover, since hand-crafted features are fixed once designed, they can only be used in specific scenarios and generalize poorly.
In recent years, deep-learning-based recognition methods for signals such as images and speech have exceeded human-level accuracy on common datasets and have entered practical use in applications such as face recognition. A deep learning method first builds a dataset with semantic labels, then designs a model and a loss function, trains the model end to end on the dataset, and finally feeds a sample signal into the trained model to obtain the recognition result. However, artificial intelligence is expected to serve people in more fields, and what is needed is more than higher accuracy: in human-computer interaction, computer-aided medical diagnosis, and automated driving, the interpretability of an AI model is a strong requirement. Current deep learning methods use deep neural networks as models, but a neural network has a huge number of parameters, lacks theoretical or intuitive explanation, and is vulnerable to adversarial attacks, so the learned features are hard to understand. In addition, because current deep learning methods generally train end to end, different datasets must be built for different problems and large amounts of time and compute spent on iteratively retraining the model, so generalization is poor. The patent application with publication number CN110059741A, titled "Image recognition method based on semantic capsule fusion network" and published on July 25, 2019, discloses a method for semantic recognition of image signals by fusing a semantic capsule network with a convolutional neural network.
That method first extracts image features with specific semantic meaning through manually designed semantic primitives, then further extracts semantic features through several parallel semantic capsules, and finally recognizes images from distinguishing features obtained by directly adding the semantic features to the image features extracted by a conventional convolutional neural network. Although it improves the interpretability and generalization of the network, the method has the following two shortcomings:
First, the distinguishing features on which recognition is based are obtained by directly adding the semantic features extracted by the semantic capsule network to the poorly interpretable image features extracted by the convolutional neural network, so the interpretability remains very poor;
Second, because the semantic capsule network uses fixed, unchangeable semantic primitives, semantic features can only be extracted after different primitives are designed for each new problem, and because the whole network is trained end to end, the features of the individual semantic capsules are not independent; the capsules are therefore hard to migrate to other problems, and the generalization of the method is insufficient.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art by providing a semantic recognition method based on multi-level semantic representation and computation, so as to improve the interpretability and generalization of signal semantic recognition.
In order to achieve the above purpose, the implementation scheme of the invention comprises the following steps:
(1) Randomly select M labeled signals from the signal semantic recognition dataset to form the training sample set S_a; the remaining signals form the test sample set S_b, where M ≥ 100;
(2) constructing a signal semantic recognition network H:
(2a) Build a semantic representation network W_r = {W_r^(l)} composed of N_l semantic representation sub-networks W_r^(l), used to represent the semantic features in the signal, where N_l ≥ 2 and l is the index of sub-network W_r^(l), 1 ≤ l ≤ N_l;
(2b) Build a semantic computation network W_c composed of several stacked graph convolution layers and a global graph average pooling layer, used to compute the semantic category of the signal from its semantic features;
(2c) Cascade the semantic representation network W_r and the semantic computation network W_c to form the signal semantic recognition network H;
(3) training a signal semantic recognition network H:
(3a) Input the training sample set S_a into the semantic representation network W_r, let l = 1, and take S_a as the input training sample set S_a^(l) of the semantic representation sub-network W_r^(l); iteratively train the sub-network as follows:
(3a1) Set the loss function of the semantic representation sub-network W_r^(l) to L_r = L_r1 + λL_r2, the maximum number of iterations T ≥ 10, and the initial iteration count t = 0, where L_r1 is the semantic representation independence loss, L_r2 is the semantic representation strength loss, and λ > 0 is the weight balancing L_r1 and L_r2;
(3a2) Input S_a^(l) into the semantic representation sub-network W_r^(l) to obtain its output O^(l), compute L_r from O^(l), and update W_r^(l) by gradient descent;
(3a3) Judge whether t ≥ T; if so, obtain the trained semantic representation sub-network W_r^(l)′ and go to (3a4); otherwise, let t = t + 1 and return to (3a2);
(3a4) Judge whether l ≥ N_l; if so, obtain the trained semantic representation network W_r′ and its output O′ and go to (3b); otherwise, let l = l + 1, take S_a^(l) to be O^(l-1)′, and return to (3a1);
(3b) Iteratively train the semantic computation network W_c as follows:
(3b1) Set the loss function of the semantic computation network W_c to the cross-entropy loss L_c, the maximum number of iterations Q ≥ 100, and the initial iteration count q = 0;
(3b2) Input the output O′ of the trained semantic representation network W_r′ into the semantic computation network W_c to obtain its output O_c, compute L_c from O_c, and update W_c by gradient descent;
(3b3) Judge whether q ≥ Q; if so, obtain the trained semantic computation network W_c′; otherwise, let q = q + 1 and return to (3b2);
(3c) Cascade the trained semantic representation network W_r′ and the trained semantic computation network W_c′ to form the trained signal semantic recognition network H′;
(4) Input the test sample set S_b into the trained signal semantic recognition network H′ to obtain the signal semantic recognition result.
Compared with the prior art, the invention has the following beneficial effects:
First, in the constructed semantic recognition model based on multi-level semantic representation and computation, a multi-level semantic representation network is built to represent the semantics of the signal, and a graph-convolution-based semantic computation network is built to recognize the semantics from that representation. This avoids the prior-art defect of directly adding interpretable semantic features to the poorly interpretable features extracted by a neural network when forming the distinguishing features, and thus effectively improves the interpretability of signal semantic recognition;
Second, a semantic representation loss function is set in the constructed model to train the semantic representation sub-networks without supervision, realizing a learnable semantic representation, and the multi-level semantic representation network and the computation network are trained in sequence. This avoids the prior-art defect of insufficient generalization caused by fixed, unchangeable semantic primitives and end-to-end training, and thus effectively improves the generalization of signal semantic recognition.
Drawings
FIG. 1 is a general flow chart of an implementation of the present invention;
FIG. 2 is a diagram of a signal semantic recognition network according to the present invention;
FIG. 3 is a sub-flowchart of training a signal semantic recognition network according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
Referring to fig. 1, the implementation steps of this example are as follows:
step 1, a training sample set and a testing sample set are obtained.
The training and test sample sets are obtained from a signal semantic recognition dataset. Existing signal semantic recognition datasets include the MNIST handwritten digit dataset, the CIFAR datasets, and the ImageNet dataset; this embodiment preferably, but not exclusively, uses the MNIST handwritten digit recognition dataset. The MNIST dataset comprises 70000 single-channel 28 × 28 handwritten digit image samples and 70000 one-hot label vectors of length 10;
Randomly select M = 60000 labeled image signals from the MNIST handwritten digit recognition dataset to form the training sample set S_a; the remaining 10000 labeled image signals form the test sample set S_b.
Step 2. Construct the signal semantic recognition network H.
Referring to FIG. 2, the signal semantic recognition network H constructed in this example is formed by cascading the semantic representation network W_r and the semantic computation network W_c; the construction steps are as follows:
2.1) Build the semantic representation network W_r:
2.1.1) Build the semantic transformation parameter generator G_j^(l), used to extract features from an input sample to obtain semantic transformation parameters:
The structure of G_j^(l) is, in order: first convolution layer → second convolution layer → third convolution layer → global average pooling layer; each of the three convolution layers has 4 convolution kernels of size 3 × 3, stride 1, and padding 1;
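The parameter generator described above can be sketched as follows. This is a minimal numpy illustration, not the patent's implementation: the random kernels and the 28 × 28 single-channel input are assumptions, and the four pooled outputs are reshaped into the 2 × 2 transformation matrix mentioned later in the text.

```python
import numpy as np

def conv2d_same(x, kernels):
    """3x3 convolution, stride 1, zero padding 1 ("same" output size).
    x: (C_in, H, W); kernels: (C_out, C_in, 3, 3)."""
    c_out, _, _, _ = kernels.shape
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, w))
    for o in range(c_out):
        for i in range(h):
            for j in range(w):
                out[o, i, j] = np.sum(xp[:, i:i + 3, j:j + 3] * kernels[o])
    return out

def parameter_generator(sample, k1, k2, k3):
    """Sketch of G_j^(l): three 3x3 conv layers (4 kernels each, stride 1,
    padding 1) followed by global average pooling over the spatial axes."""
    x = conv2d_same(sample, k1)
    x = conv2d_same(x, k2)
    x = conv2d_same(x, k3)
    return x.mean(axis=(1, 2))   # global average pooling -> 4 values

rng = np.random.default_rng(0)
s = rng.standard_normal((1, 28, 28))          # one-channel 28x28 sample
k1 = rng.standard_normal((4, 1, 3, 3)) * 0.1  # 4 kernels, 3x3
k2 = rng.standard_normal((4, 4, 3, 3)) * 0.1
k3 = rng.standard_normal((4, 4, 3, 3)) * 0.1
theta = parameter_generator(s, k1, k2, k3)    # 4 transformation parameters
```

The four pooled values can then be reshaped via `theta.reshape(2, 2)` into the 2 × 2 semantic transformation matrix θ_i,j^(l).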
2.1.2) Build the semantic converter T_j^(l), used to transform the semantic primitive p_j^(l) according to the semantic transformation parameters output by G_j^(l), so that the primitives can be adjusted dynamically according to the input sample, which effectively improves the generalization of signal semantic recognition;
the existing semantic transformation methods include inversion transformation, gray-scale gamma transformation, affine transformation and rotation scaling transformation. Because the handwritten digits are all composed of arc segments with different sizes and angles, the embodiment preferably but not limited to selects and selects the rotation expansion transformation to process the semantic elements to match the arc segments with different sizes and angles, and the generalization capability of signal semantic identification is improved, namely the semantic converter Tj (l)According to Gj (l)Generated semantic transformation parameter thetai,j (l)For semantic primitive pj (l)Semantic transformation is performed by the following formula:
Figure BDA0002835628860000041
wherein the content of the first and second substances,
Figure BDA0002835628860000051
for semantic elements pj (l)Dimension of the nth element is NmCoordinate vector of an(m) is
Figure BDA0002835628860000052
M is not less than 1 and not more than Nm
Figure BDA0002835628860000053
For transformed semantic elements pi,j (l)' dimension of the N-th element is NmCoordinate vector of bn(m) is
Figure BDA0002835628860000054
The m-th-dimensional coordinate of (a),
Figure BDA0002835628860000055
for semantic elements pj (l)The center coordinate is
Figure BDA0002835628860000056
The elements of (a) and (b),
Figure BDA0002835628860000057
for transformed semantic elements pi,j (l)' center coordinate is
Figure BDA0002835628860000058
The elements of (a) and (b),
Figure BDA0002835628860000059
θi,j (l)is Gj (l)The semantic transformation parameters generated from the ith input sample, i ≦ 1 ≦ M, N in this examplem=2;
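The rotation-scaling transform can be sketched as a resampling of the primitive under the 2 × 2 map θ about its center. The coordinate map a_n = θ(b_n − c) + c and the nearest-neighbour sampling are assumptions consistent with the surrounding definitions; the patent's exact formula is not reproduced in this text.

```python
import numpy as np

def transform_primitive(p, theta):
    """Resample primitive p under the linear map theta about its center:
    for each output coordinate b_n, read p at a_n = theta @ (b_n - c) + c
    (nearest neighbour, zero outside the support)."""
    h, w = p.shape
    c = np.array([(h - 1) / 2.0, (w - 1) / 2.0])
    out = np.zeros_like(p)
    for bi in range(h):
        for bj in range(w):
            a = theta @ (np.array([bi, bj]) - c) + c
            ai, aj = int(round(a[0])), int(round(a[1]))
            if 0 <= ai < h and 0 <= aj < w:
                out[bi, bj] = p[ai, aj]
    return out

p = np.zeros((11, 11))
p[5, 2:9] = 1.0                               # horizontal stroke primitive
theta = np.array([[0.0, 1.0], [-1.0, 0.0]])   # pure 90-degree rotation
p_rot = transform_primitive(p, theta)         # becomes a vertical stroke
```

Scaling is obtained the same way by multiplying θ by a factor; the identity matrix leaves the primitive unchanged.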
2.1.3) Cascade the semantic transformation parameter generator G_j^(l) built in 2.1.1) with the semantic converter T_j^(l) built in 2.1.2) to form the semantic representation module R_j^(l), where j is the index of the module R_j^(l), 1 ≤ j ≤ N_p; in this example, N_p = 4;
2.1.4) Place N_p semantic representation modules R_j^(l) in parallel to form the semantic representation sub-network W_r^(l), used to represent semantic features of the same level as N_p independent semantic representation modules R_j^(l). Because this matches the human habit of expressing knowledge as a combination of several knowledge points, it makes the represented semantic information easy for humans to understand, and the independent modules can be conveniently recombined, which markedly improves generalization;
2.1.5) Cascade N_l semantic representation sub-networks W_r^(l) to form the semantic representation network W_r, used to represent the semantics in an image signal as a multi-level structure, where l is the index of sub-network W_r^(l), 1 ≤ l ≤ N_l. In this example, N_l = 2. Because this semantic representation network W_r matches the human habit of dividing knowledge into several levels of representation, it not only makes the represented semantic information easy for humans to understand, but also makes the trained network structure easy to migrate to new problems, improving the interpretability and generalization of signal semantic recognition.
2.2) Build the semantic computation network W_c:
The semantic computation network W_c computes the semantic category of the signal from the semantic features extracted by the semantic representation network W_r. Its structure is, in order: first graph convolution layer → second graph convolution layer → global graph average pooling layer; the parameter matrices of the two graph convolution layers have sizes 4 × 8 and 8 × 10, respectively.
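The two stacked graph convolutions and the global graph average pooling can be sketched as follows. The patent does not specify which graph-convolution variant is used, so the standard degree-normalized propagation rule X′ = ReLU(D⁻¹(A + I)XW) is an assumption; the chain graph and random weights are illustrative only.

```python
import numpy as np

def graph_conv(X, A, W):
    """One graph convolution layer, assumed form X' = ReLU(D^-1 (A+I) X W).
    X: (N, F_in) node features, A: (N, N) adjacency, W: (F_in, F_out)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # degree normalization
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)

def semantic_computation_network(X, A, W1, W2):
    """Two stacked graph convolutions (parameter matrices 4x8 and 8x10)
    followed by global graph average pooling over the nodes."""
    h = graph_conv(X, A, W1)   # (N, 8)
    h = graph_conv(h, A, W2)   # (N, 10)
    return h.mean(axis=0)      # global graph average pooling -> 10 scores

rng = np.random.default_rng(1)
N = 6
X = rng.standard_normal((N, 4))   # 4 semantic features per node
A = np.zeros((N, N))
for i in range(N - 1):            # a simple chain graph for illustration
    A[i, i + 1] = A[i + 1, i] = 1.0
W1 = rng.standard_normal((4, 8)) * 0.1
W2 = rng.standard_normal((8, 10)) * 0.1
scores = semantic_computation_network(X, A, W1, W2)  # 10 class scores
```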
2.3) Cascade the semantic representation network W_r and the semantic computation network W_c to form the signal semantic recognition network H.
Step 3. Train the signal semantic recognition network H.
Referring to fig. 3, the steps of training the signal semantic recognition network H in this example are as follows:
3.1) Input the training sample set S_a into the semantic representation network W_r, and initialize the index l of the semantic representation sub-network W_r^(l) to l = 1;
3.2) Iteratively train the semantic representation sub-network W_r^(l):
3.2.1) Set the input training sample set S_a^(l) of the semantic representation sub-network W_r^(l):
Judge whether l = 1; if so, take S_a as the input training sample set S_a^(1) of the first-level sub-network W_r^(1); otherwise, take the output O^(l-1)′ of the previous trained sub-network W_r^(l-1)′ as the input training set S_a^(l) of W_r^(l);
3.2.2) Set the loss function of the semantic representation sub-network W_r^(l) to L_r = L_r1 + λL_r2, where λ > 0 is the weight balancing L_r1 and L_r2, and L_r1 and L_r2 are the semantic representation independence loss and the semantic representation strength loss, respectively. In this example, λ = 1;
Existing semantic representation independence loss functions L_r1 include the variance function and the average Euclidean distance function, and existing semantic representation strength loss functions L_r2 include the minimum function and the mean function. This embodiment preferably, but not exclusively, takes the average Euclidean distance function and the mean function as the independence loss L_r1 and the strength loss L_r2, respectively, computed over the following quantities:
G_a^(l) and G_b^(l) are the semantic transformation parameter generators of R_a^(l) and R_b^(l), T_a^(l) and T_b^(l) are their semantic converters, and p_a^(l) and p_b^(l) are their corresponding semantic primitives, where R_a^(l) and R_b^(l) are the a-th and b-th semantic representation modules of the sub-network W_r^(l), 1 ≤ a ≤ N_p, 1 ≤ b ≤ N_p, a ≠ b; s_i^(l) is the i-th training sample in the input training sample set S_a^(l) of W_r^(l), 1 ≤ i ≤ M;
G_a^(l)(s_i^(l)) and G_b^(l)(s_i^(l)) are the semantic transformation parameters obtained by G_a^(l) and G_b^(l) extracting features from s_i^(l); v_i,a^(l) is the feature map obtained by convolving the i-th signal s_i^(l) with the a-th transformed semantic primitive p_a^(l)′, and v_i,b^(l) is the feature map obtained by convolving s_i^(l) with the b-th transformed primitive p_b^(l)′; v_i,a^(l)(n) and v_i,b^(l)(n) are the n-th elements of v_i,a^(l) and v_i,b^(l), 1 ≤ n ≤ N_v, where N_v is the total number of elements of v_i,a^(l) or v_i,b^(l); p_a^(l)′ and p_b^(l)′ are the transformed semantic primitives obtained by semantically transforming p_a^(l) and p_b^(l) with the transformation parameters G_a^(l)(s_i^(l)) and G_b^(l)(s_i^(l)), respectively;
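A toy sketch of the two loss terms follows. The sign conventions are assumptions: the average Euclidean distance between two modules' feature maps is negated so that minimizing L_r1 pushes the modules apart (independence), and the strength term is the negated mean activation magnitude so that minimizing it encourages strong responses. The averaging of the strength term over the two modules and λ = 1 follow the example in the text.

```python
import numpy as np

def independence_loss(v_a, v_b):
    """Assumed form of L_r1: negated mean Euclidean distance between the
    feature maps of two semantic representation modules.
    v_a, v_b: (M, N_v) flattened feature maps for M samples."""
    return -np.mean(np.sqrt(np.sum((v_a - v_b) ** 2, axis=1)))

def strength_loss(v):
    """Assumed form of L_r2: negated mean activation magnitude, so that
    minimizing it encourages strong semantic responses."""
    return -np.mean(np.abs(v))

def representation_loss(v_a, v_b, lam=1.0):
    """L_r = L_r1 + lambda * L_r2, with the strength term averaged over
    the two modules (an assumption) and lambda = 1 as in the example."""
    return independence_loss(v_a, v_b) + lam * 0.5 * (
        strength_loss(v_a) + strength_loss(v_b))
```

With identical feature maps the independence term is 0 (maximally penalized); the further apart the two modules' responses are, the lower the loss.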
3.2.3) Set the maximum number of training iterations T ≥ 10 for the semantic representation sub-network W_r^(l) and the initial iteration count t = 0. In this example, T = 15;
3.2.4) Set the semantic primitives P^(l):
Existing methods for setting semantic primitives include randomly clipping signals to obtain primitives, manually designing primitives from common sense or domain expertise, and selecting classical kernel functions (e.g., the Gaussian kernel, the Laplace kernel, or wavelet kernels) as primitives;
This example preferably, but not exclusively, draws on the common-sense fact that handwritten digits are formed from strokes: 4 arcs of different lengths are designed manually and recorded in matrices of size 11 × 11 with 1 channel, as the semantic primitives P^(1) = {p_j^(1)} of the first-level semantic representation sub-network W_r^(1);
Likewise, following the way human retinal neurons extract image features, combinations of the zeroth- to third-order derivatives of the Gaussian kernel can approximate prior knowledge from neuroscience; this example preferably, but not exclusively, selects the zeroth- to third-order derivatives of the Gaussian kernel, recorded in matrices of size 11 × 11 with 4 channels, as the semantic primitives P^(2) = {p_j^(2)} of the second-level semantic representation sub-network W_r^(2);
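The second-level primitives can be sketched by generating zeroth- to third-order derivatives of a 2-D Gaussian. Here higher-order derivatives are approximated by repeated central differences along one axis, and σ = 2.0 is an assumed value not given in the text.

```python
import numpy as np

def gaussian_kernel(size=11, sigma=2.0):
    """Normalized 2-D Gaussian kernel on a size x size grid."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def gaussian_derivative_primitives(size=11, sigma=2.0):
    """Zeroth- to third-order horizontal derivatives of a 2-D Gaussian,
    approximated by repeated central differences. Returns shape
    (4, size, size), matching the 11x11, 4-channel second-level primitives."""
    g = gaussian_kernel(size, sigma)
    prims = [g]
    for _ in range(3):                       # first, second, third order
        prims.append(np.gradient(prims[-1], axis=1))
    return np.stack(prims)

P2 = gaussian_derivative_primitives()        # (4, 11, 11)
```

The zeroth-order channel is a blurring kernel; the odd-order channels are antisymmetric edge-like detectors, similar to the oriented receptive fields the text alludes to.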
3.2.5) Compute the semantic transformation parameters: input the training sample s_i^(l) into the semantic transformation parameter generator G_j^(l) and reshape the output into a matrix to obtain the semantic transformation parameter θ_i,j^(l). In this example, θ_i,j^(l) has size 2 × 2;
3.2.6) Semantically transform the semantic primitives:
The semantic converter T_j^(l) semantically transforms the primitive p_j^(l) by the transformation formula in 2.1.2), according to the parameters θ_i,j^(l) generated by G_j^(l), to obtain the transformed primitive p_i,j^(l)′;
3.2.7) Convolve the transformed semantic primitives with the input samples:
Convolve the transformed primitive p_i,j^(l)′ with the input sample s_i^(l) to obtain the feature map v_i,j^(l), and concatenate the feature maps v_i,j^(l) with the same i along the channel dimension to obtain the output O^(l) of W_r^(l);
3.2.8) Update the semantic representation sub-network W_r^(l) by gradient descent:
Compute the loss function L_r and update W_r^(l) by gradient descent;
3.2.9) Compare the current iteration count t of the semantic representation sub-network W_r^(l) with the maximum iteration count T:
If t ≥ T, obtain the trained semantic representation sub-network W_r^(l)′ and its output O^(l)′ and go to 3.3);
otherwise, let t = t + 1 and return to 3.2.5);
3.3) Compare the index l of the semantic representation sub-network W_r^(l) with the total number N_l:
If l ≥ N_l, cascade all trained sub-networks {W_r^(l)′} into the trained semantic representation network W_r′, take the output O^(N_l)′ of the last trained sub-network W_r^(N_l)′ as the output O′ of W_r′, and go to 3.4); otherwise, let l = l + 1 and return to 3.2).
3.4) Iteratively train the semantic computation network W_c:
3.4.1) Set the loss function and maximum number of iterations of the semantic computation network W_c:
Set the loss function of W_c to the cross-entropy loss L_c, the maximum number of iterations Q ≥ 100, and the initial iteration count q = 0, where the cross-entropy loss L_c is expressed as follows:
L_c = -(1/M) Σ_{i=1}^{M} y_i · log(ŷ_i)
where y_i is the label of the i-th training sample in the training sample set S_a, ŷ_i is the prediction of the semantic computation network W_c for y_i, and 1 ≤ i ≤ M. In this example, Q = 200;
3.4.2) Take the pixels of the i-th sample O_i′ in the output O′ of the trained semantic representation network W_r′ as vertices, and connect adjacent vertices with edges of weight 1 to obtain the graph sample g_i;
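The graph construction in this step can be sketched as building the adjacency matrix of a pixel grid; treating "adjacent" as the 4-neighbourhood is an assumption, since the text does not say whether diagonal pixels count as adjacent.

```python
import numpy as np

def grid_adjacency(h, w):
    """Build the graph sample: every pixel of an h x w feature map is a
    vertex, and each pair of 4-neighbouring pixels is joined by an edge
    of weight 1. Returns the (h*w, h*w) symmetric adjacency matrix."""
    n = h * w
    A = np.zeros((n, n))
    for i in range(h):
        for j in range(w):
            v = i * w + j
            if i + 1 < h:                       # edge to the pixel below
                A[v, v + w] = A[v + w, v] = 1.0
            if j + 1 < w:                       # edge to the pixel right
                A[v, v + 1] = A[v + 1, v] = 1.0
    return A

A = grid_adjacency(3, 3)   # 3x3 grid: 12 undirected edges
```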
3.4.3) Input the graph sample g_i into the semantic computation network W_c to obtain the prediction ŷ_i;
3.4.4) Compute the loss function L_c from the prediction ŷ_i and update the semantic computation network W_c by gradient descent;
3.4.5) Compare the current training iteration count q of the semantic computation network W_c with the maximum iteration count Q:
If q ≥ Q, obtain the trained semantic computation network W_c′ and go to 3.5); otherwise, let q = q + 1 and return to 3.4.3);
3.5) Cascade the trained semantic representation network W_r′ with the trained semantic computation network W_c′ to form the trained signal semantic recognition network H′.
Step 4. Perform signal semantic recognition on the test sample set S_b.
Input the test sample set S_b into the trained signal semantic recognition network H′ to obtain the signal semantic recognition result.
The foregoing description is only an example of the present invention and does not limit it; it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made without departing from the spirit and scope of the invention.

Claims (7)

1. A signal semantic identification method based on multilevel semantic representation and semantic computation is characterized by comprising the following steps:
(1) Randomly select M labeled signals from the signal semantic recognition dataset to form the training sample set S_a; the remaining signals form the test sample set S_b, where M ≥ 100;
(2) constructing a signal semantic recognition network H:
(2a) Build a semantic representation network W_r = {W_r^(l)} composed of N_l semantic representation sub-networks W_r^(l), used to represent the semantic features in the signal, where N_l ≥ 2 and l is the index of sub-network W_r^(l), 1 ≤ l ≤ N_l;
(2b) Build a semantic computation network W_c composed of several stacked graph convolution layers and a global graph average pooling layer, used to compute the semantic category of the signal from its semantic features;
(2c) Cascade the semantic representation network W_r and the semantic computation network W_c to form the signal semantic recognition network H;
(3) training a signal semantic recognition network H:
(3a) will train the sample set SaInput to a semantic representation network WrIn (1), and let l be 1, and SaCharacterizing a subnetwork W as a semanticr (l)Input training sample set Sa (l)It is iteratively trained as follows:
(3a1) setting a semantic representation subnetwork Wr (l)Has a loss function of Lr=Lr1+λLr2The maximum iteration time T is more than or equal to 10, and the initial iteration time T is 0, wherein Lr1Characterizing the independence loss function for semantics, Lr2For semantic characterisation of the intensity loss function, λ is Lf1And Lf2λ > 0;
(3a2) will Sa (l)Input into a semantic representation subnetwork Wr (l)In (1) obtaining Wr (l)Output of (2) O(l)According to O(l)Calculating LrAnd using a gradient descent method for Wr (l)Updating is carried out;
(3a3) judging whether T is more than or equal to T, if so, obtaining a trained semantic representation sub-network Wr (l)′,Executing (3a4), otherwise, making t equal to t +1, and returning to (3a 2);
(3a4) judging that l is more than or equal to NlIf yes, obtaining a well-trained characterization network Wr'and its output O', perform (3b), otherwise, let l ═ l +1, and let Sa (l)Is O(l-1)', return (3a 1);
(3b) for semantic computation network WcThe following iterative training is performed:
(3b1) setting a semantic computation network WcIs a cross entropy loss function LcThe maximum iteration frequency Q is more than or equal to 100, and the initial iteration frequency Q is 0;
(3b2) the well-trained characterization network Wr' output O ' of the ' is input to the semantic computation network WcIn (1) obtaining WcOutput of (2) OcAnd according to OcCalculating LcBy gradient descent of WcUpdating is carried out;
(3b3) judging whether Q is more than or equal to Q, if so, obtaining a trained semantic computation sub-network WcIf not, returning to (3b2) by making q + 1;
(3c) the well-trained characterization network Wr' AND trained semantic computation subnetwork WcCarrying out cascade connection to form a trained signal semantic recognition network H';
(4) set of test samples SbAnd inputting the signal into a trained signal semantic recognition network H' to obtain a signal semantic recognition result.
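The two-stage schedule of steps (3a)-(3b) can be sketched as plain loop structure. In the following Python sketch the actual gradient updates are replaced by counters, and every name (`training_schedule`, `rep_steps`, `comp_steps`) is illustrative rather than from the patent:

```python
def training_schedule(N_l=2, T=10, Q=100):
    """Structural sketch of claim 1, steps (3a)-(3b): each of the N_l
    semantic representation subnetworks W_r^(l) is trained for T
    iterations in turn (minimizing L_r = L_r1 + lambda*L_r2 by gradient
    descent), after which the semantic computation network W_c is
    trained for Q iterations (minimizing the cross-entropy loss L_c).
    Real parameter updates are replaced by step counters here."""
    rep_steps = 0
    for l in range(1, N_l + 1):      # stage 1: subnetworks trained one by one
        for t in range(T):           # (3a2)-(3a3): T gradient steps on W_r^(l)
            rep_steps += 1
        # (3a4): the output O^(l)' of W_r^(l)' feeds the next subnetwork
    comp_steps = 0
    for q in range(Q):               # (3b2)-(3b3): Q gradient steps on W_c
        comp_steps += 1
    return rep_steps, comp_steps
```

Note the key design point this makes visible: the representation stage and the computation stage are trained sequentially, not jointly.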
2. The method of claim 1, wherein the semantic representation subnetwork W_r^(l) in (2a) is composed of N_p parallel semantic representation modules R_j^(l), 1 ≤ j ≤ N_p, N_p ≥ 2; each semantic representation module R_j^(l) comprises a cascaded semantic transformation parameter generator G_j^(l) and a semantic converter T_j^(l); G_j^(l) comprises several stacked convolution layers and a global average pooling layer, and T_j^(l) performs a semantic transformation on the semantic primitive p_j^(l) according to the semantic transformation parameter θ_{i,j}^(l) generated by G_j^(l), where θ_{i,j}^(l) is the semantic transformation parameter generated by G_j^(l) from the i-th input sample, 1 ≤ i ≤ M.
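One parallel module R_j^(l) can be illustrated schematically as a generator-then-converter pipeline. In this sketch the generator and transform are caller-supplied callables, and the mean-valued parameter with an additive shift transform in the usage lines is purely an assumption for illustration:

```python
import numpy as np

def semantic_module(signal, primitive, generator, transform):
    """Schematic of one module R_j^(l) from claim 2: the cascaded
    parameter generator G_j^(l) maps the input signal to a
    transformation parameter theta_{i,j}^(l), and the semantic
    converter T_j^(l) applies it to the semantic primitive p_j^(l)."""
    theta = generator(signal)           # G_j^(l): signal -> parameter
    return transform(primitive, theta)  # T_j^(l): transformed primitive

# Illustrative only: mean-valued parameter, additive shift transform.
shifted = semantic_module(np.array([1.0, 3.0]), np.zeros(2),
                          generator=lambda s: s.mean(),
                          transform=lambda p, t: p + t)
```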
3. The method of claim 2, wherein the semantic transformation parameter generator G_j^(l) has the following structure in sequence:

first convolution layer → second convolution layer → third convolution layer → global average pooling layer.
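The claim-3 layout can be illustrated with a toy single-channel numpy sketch. The ReLU between layers and the specific kernels are assumptions; the claim itself fixes only the conv-conv-conv-pool sequence:

```python
import numpy as np

def conv1d(x, w):
    """'Valid' 1-D convolution (cross-correlation) of signal x with kernel w."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

def parameter_generator(x, kernels):
    """Claim-3 layout as a toy single-channel sketch:
    first conv -> second conv -> third conv -> global average pooling.
    A ReLU after each conv layer is an added assumption."""
    for w in kernels:
        x = np.maximum(conv1d(x, w), 0.0)
    return float(x.mean())   # global average pooling -> one scalar parameter
```

The global average pooling at the end is what lets the generator emit a fixed-size transformation parameter regardless of the input signal length.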
4. The method of claim 2, wherein the semantic converter T_j^(l) performs the semantic transformation on the semantic primitive p_j^(l) according to the semantic transformation parameter θ_{i,j}^(l) generated by G_j^(l) by a formula given in the original filing as an equation image (FDA0002835628850000021), whose symbols are described as follows:

a_n(m) is the m-th dimensional coordinate of the N_m-dimensional coordinate vector a_n of the n-th element of the semantic primitive p_j^(l), 1 ≤ m ≤ N_m;

b_n(m) is the m-th dimensional coordinate of the N_m-dimensional coordinate vector b_n of the n-th element of the transformed semantic primitive p_{i,j}^(l)′;

the remaining symbols (equation images FDA0002835628850000022 through FDA00028356288500000210) denote the elements of the center coordinate of the semantic primitive p_j^(l) and of the center coordinate of the transformed semantic primitive p_{i,j}^(l)′.
5. The method of claim 1, wherein the semantic computation network W_c in (2b) has the following structure: first graph convolution layer → second graph convolution layer → global graph average pooling layer.
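A minimal numpy sketch of the claim-5 layout, using the common H′ = ReLU(Â H W) form of a graph convolution. The normalization of Â and the ReLU are assumptions; the claim fixes only the two-graph-conv-plus-global-pool layout:

```python
import numpy as np

def graph_conv(A_hat, H, W):
    """One graph convolution layer: H' = ReLU(A_hat @ H @ W),
    with A_hat a (pre-)normalized adjacency matrix over the
    semantic-feature graph."""
    return np.maximum(A_hat @ H @ W, 0.0)

def semantic_computation(A_hat, H, W1, W2):
    """Claim-5 layout: first graph conv -> second graph conv ->
    global graph average pooling (mean over nodes), producing one
    score vector over the semantic categories."""
    H = graph_conv(A_hat, H, W1)
    H = graph_conv(A_hat, H, W2)
    return H.mean(axis=0)
```

The global graph average pooling makes the category scores invariant to the number of nodes, which is what lets W_c consume variable-size semantic-feature graphs.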
6. The method of claim 1, wherein the semantic representation independence loss function L_r1 and the semantic representation strength loss function L_r2 in (3a1) are given in the original filing as equation images (FDA0002835628850000031 and FDA0002835628850000032, respectively), in which:

G_a^(l) and G_b^(l) are the semantic transformation parameter generators of R_a^(l) and R_b^(l), respectively, and T_a^(l) and T_b^(l) are their semantic converters; p_a^(l) and p_b^(l) are the corresponding semantic primitives; R_a^(l) and R_b^(l) are the a-th and b-th semantic representation modules of the semantic representation subnetwork W_r^(l), 1 ≤ a ≤ N_p, 1 ≤ b ≤ N_p, a ≠ b, N_p ≥ 2; s_i^(l) is the i-th training sample of the input training sample set S_a^(l) of W_r^(l), 1 ≤ i ≤ M; G_a^(l)(s_i^(l)) and G_b^(l)(s_i^(l)) are the semantic transformation parameters obtained by feature extraction from s_i^(l) with G_a^(l) and G_b^(l), respectively; v_{i,a}^(l) is the feature map obtained by convolving the i-th signal s_i^(l) with the a-th transformed semantic primitive p_a^(l)′, and v_{i,b}^(l) is the feature map obtained by convolving s_i^(l) with the b-th transformed semantic primitive p_b^(l)′; v_{i,a}^(l)(n) and v_{i,b}^(l)(n) are the n-th elements of v_{i,a}^(l) and v_{i,b}^(l), 1 ≤ n ≤ N_v, where N_v is the total number of elements in v_{i,a}^(l) (or v_{i,b}^(l)).
7. The method of claim 1, wherein the cross-entropy loss function L_c in (3b1) is given in the original filing as an equation image (FDA0002835628850000033), where y_i is the label of the i-th training sample in the training sample set S_a, and the quantity shown as equation image FDA0002835628850000034 is the prediction of the semantic computation network W_c for y_i, 1 ≤ i ≤ M.
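The symbol descriptions match the standard mean cross-entropy over the M training samples; below is a numpy sketch under that assumption (the patent's exact equation is an image and may differ, e.g. in normalization):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Assumed form of L_c: -(1/M) * sum_i log y_pred[i, y_true[i]],
    where y_true[i] is the label y_i of the i-th sample and y_pred[i]
    holds the class probabilities predicted by the semantic
    computation network W_c. eps guards against log(0)."""
    M = len(y_true)
    picked = y_pred[np.arange(M), y_true]  # probability of the true class
    return float(-np.log(picked + eps).mean())
```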
CN202011476389.9A 2020-12-15 2020-12-15 Signal semantic recognition method based on multi-stage semantic representation and semantic calculation Active CN112528873B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011476389.9A CN112528873B (en) 2020-12-15 2020-12-15 Signal semantic recognition method based on multi-stage semantic representation and semantic calculation

Publications (2)

Publication Number Publication Date
CN112528873A true CN112528873A (en) 2021-03-19
CN112528873B CN112528873B (en) 2022-03-22

Family

ID=75000115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011476389.9A Active CN112528873B (en) 2020-12-15 2020-12-15 Signal semantic recognition method based on multi-stage semantic representation and semantic calculation

Country Status (1)

Country Link
CN (1) CN112528873B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119736A (en) * 2018-02-07 2019-08-13 浙江宇视科技有限公司 License plate location recognition method, device and electronic equipment
US20190294661A1 (en) * 2018-03-21 2019-09-26 Adobe Inc. Performing semantic segmentation of form images using deep learning
CN110309861A (en) * 2019-06-10 2019-10-08 浙江大学 Multi-modal human activity recognition method based on generative adversarial networks
CN110349215A (en) * 2019-07-10 2019-10-18 北京悉见科技有限公司 Camera pose estimation method and device
US20200020102A1 (en) * 2017-04-14 2020-01-16 Tusimple, Inc. Method and device for semantic segmentation of image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHI GUANGMING, LI YINGYU, XIE XUEMEI: "Semantic Communication: A Product of the Intelligent Era", Pattern Recognition and Artificial Intelligence *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113315972A (en) * 2021-05-19 2021-08-27 西安电子科技大学 Video semantic communication method and system based on hierarchical knowledge expression
CN113315972B (en) * 2021-05-19 2022-04-19 西安电子科技大学 Video semantic communication method and system based on hierarchical knowledge expression
CN113705245A (en) * 2021-09-01 2021-11-26 北京邮电大学 Semantic communication method, device, system, computer equipment and storage medium
CN114224343A (en) * 2022-01-13 2022-03-25 平安科技(深圳)有限公司 Cognitive disorder detection method, device, equipment and storage medium
CN114224343B (en) * 2022-01-13 2023-10-20 平安科技(深圳)有限公司 Cognitive disorder detection method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112528873B (en) 2022-03-22

Similar Documents

Publication Publication Date Title
CN112308158B (en) Multi-source field self-adaptive model and method based on partial feature alignment
CN112528873B (en) Signal semantic recognition method based on multi-stage semantic representation and semantic calculation
US11256960B2 (en) Panoptic segmentation
CN110020682B (en) Attention mechanism relation comparison network model method based on small sample learning
Bruggemann et al. Automated search for resource-efficient branched multi-task networks
CN109934261B (en) Knowledge-driven parameter propagation model and few-sample learning method thereof
CN107526785A (en) File classification method and device
CN111291556A (en) Chinese entity relation extraction method based on character and word feature fusion of entity meaning item
CN109508686B (en) Human behavior recognition method based on hierarchical feature subspace learning
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN111639719A (en) Footprint image retrieval method based on space-time motion and feature fusion
CN113378938B (en) Edge transform graph neural network-based small sample image classification method and system
CN113486190A (en) Multi-mode knowledge representation method integrating entity image information and entity category information
CN115995293A (en) Circular RNA and disease association prediction method
Liu et al. Image retrieval using CNN and low-level feature fusion for crime scene investigation image database
CN108805280A (en) A kind of method and apparatus of image retrieval
Elthakeb et al. Divide and conquer: Leveraging intermediate feature representations for quantized training of neural networks
CN114492581A (en) Method for classifying small sample pictures based on transfer learning and attention mechanism element learning application
Hou et al. A face detection algorithm based on two information flow block and retinal receptive field block
CN111695436B (en) High spatial resolution remote sensing image scene classification method based on target enhancement
Yildirim et al. REGP: A NEW POOLING ALGORITHM FOR DEEP CONVOLUTIONAL NEURAL NETWORKS.
CN112686306B (en) ICD operation classification automatic matching method and system based on graph neural network
Bodenhausen et al. Connectionist architectural learning for high performance character and speech recognition
Ilie-Ablachim et al. Sparse representations with cone atoms
Ko et al. Deep model compression and inference speedup of sum–product networks on tensor trains

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant