CN116108153B - Multi-task combined training machine reading and understanding method based on gating mechanism - Google Patents

Multi-task combined training machine reading and understanding method based on gating mechanism

Info

Publication number
CN116108153B
CN116108153B (application number CN202310112991.1A)
Authority
CN
China
Prior art keywords
article
question
attention
follows
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310112991.1A
Other languages
Chinese (zh)
Other versions
CN116108153A (en)
Inventor
王勇 (Wang Yong)
陈秋怡 (Chen Qiuyi)
张梅 (Zhang Mei)
王永明 (Wang Yongming)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Technology
Original Assignee
Chongqing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Technology filed Critical Chongqing University of Technology
Priority to CN202310112991.1A priority Critical patent/CN116108153B/en
Publication of CN116108153A publication Critical patent/CN116108153A/en
Application granted granted Critical
Publication of CN116108153B publication Critical patent/CN116108153B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention belongs to the technical field of natural language processing, and particularly relates to a multi-task combined training machine reading and understanding method based on a gating mechanism. The method comprises the following modules: an article and question coding module; an interaction module; a multi-level residual structure module; and an answer prediction module. The invention filters the interaction features through a gating mechanism, controlling the inflow of important information and the outflow of useless information so as to govern the information flow and feed it accurately into the output layer for answer prediction; a multi-level residual structure is built by introducing the idea of residual connections, fusing the representations obtained after article-question interaction with the original semantic information, so that the semantic information is richer, the understanding of the article is fuller, and network degradation is avoided; and an edge loss function is added for multi-task joint training, which ensures strong coupling between the classification task and the extraction task and further learns the feature differences between positive and negative examples.

Description

Multi-task combined training machine reading and understanding method based on gating mechanism
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a multi-task combined training machine reading and understanding method based on a gating mechanism.
Background
An important task in natural language processing is the question-answering system, and machine reading comprehension is a popular research topic within it. In this task, articles and questions are given, and a piece of text is extracted from the article as the answer to the question by reading and understanding. In real life, not every question has an answer, so to meet practical needs a machine reading comprehension model must both accurately extract the answer to a question from the article and judge whether an answer exists at all, which is an important challenge in the field of natural language processing. In terms of practical application, reading comprehension has penetrated many aspects of our lives. For example, in a common search engine, when a user inputs a keyword to be queried, related web pages must be found from massive amounts of website information, which takes a lot of time. If the related technology is applied to a search engine, the desired answer can be found more precisely. Other common application scenarios include customer-service dialogue systems such as those of e-commerce platforms, which can return answers to commonly asked questions and thus save enterprises manpower and material resources.
Pre-trained language models such as BERT and ALBERT are research hotspots of natural language processing in recent years and are commonly used in machine reading comprehension models. Many machine reading comprehension models, such as BiDAF, QANet and AoA, also employ an attention mechanism to simulate how humans read with a question in mind. FusionNet proposes an improved reading comprehension network based on word history and full attention, where full attention computes weighting coefficients over all the historical information of a word while reducing the dimensionality of high-dimensional features in the word history, thereby improving efficiency. The ASMI model addresses insufficient robustness, proposes a contextual attention mechanism to predict contextual answers, and also proposes a new negative-sample generation method. These models typically highlight the key information of the article and the question when computing attention, and obtain, by fusion, a semantic vector representation containing the interaction between the question and the article.
Approaches to question classification and answer extraction are divided into end-to-end models and two-stage models. The Retrospective Reader model adopts two stages, combining skimming and intensive reading, and obtains new improvements. The skimming module reads the article and the question to give a preliminary judgment, and the intensive-reading module verifies the answer and gives candidates; the outputs of the two modules are integrated to give the final classification result and the corresponding answer. S&I Reader is an end-to-end reading model that provides an intensive-reading module and a skimming module and simulates people's repeated reading behavior through multiple hops; a multi-granularity module is also added to enrich the important features of the text. The RMR + Answer Verifier model is an end-to-end model that proposes a read-then-verify structure: it not only uses a reader to extract candidate answers and produce no-answer probabilities, but also uses an answer verifier to determine whether the predicted answer is supported by the input segment, and adopts an auxiliary loss for further detection.
However, the above prior art has the following technical problems: (1) the extracted features are redundant: after associating article and question features, the flow of information is not controlled; (2) the semantic information is not comprehensive: the representation contains only the contextual semantic vectors obtained from the pre-trained language model, or only the key-information semantic vectors obtained by techniques such as the attention mechanism, so that little information can be expressed, and at the same time adding network layers may cause network degradation, so the characterization ability of the network is not strong; (3) the classification of questions and the extraction of answers are not strongly coupled, so the differences between answerable and unanswerable questions cannot be learned.
In order to solve the technical problem, the invention provides a multi-task combined training machine reading and understanding method based on a gating mechanism.
Disclosure of Invention
The invention aims to provide a multi-task combined training machine reading and understanding method based on a gating mechanism, which aims to solve the problems in the prior art pointed out in the background art.
In order to achieve the above purpose, the present invention provides the following technical solutions:
the invention provides a multi-task combined training machine reading and understanding method based on a gating mechanism, which comprises the following steps:
the method comprises the steps of performing context coding on input articles and questions through an article and question coding module;
through the interaction module, important characteristics of the context information are highlighted by adopting an attention mechanism and a gating mechanism, and the highlighted key characteristics are updated;
the original semantic information is respectively fused with the representation obtained through the attention mechanism and the representation obtained through the gating mechanism through a multi-level residual error structure module;
the answerability of each question and the answers of answerable questions are predicted by an answer prediction module.
Further, the context encoding of the inputted articles and questions by the article and question encoding module includes:
an article with m words is defined as P = {p_1, p_2, …, p_m}, and a question with n words as Q = {q_1, q_2, …, q_n};
Splicing the problem Q and the article P into a fixed-length sequence: the starting position is marked by [ CLS ] and is used as sentence vector of the whole sequence;
the question Q and the article P are separated by an identifier [ SEP ], and the end of the article P is also identified by [ SEP ];
for the length of the whole sequence, if the sequence exceeds a fixed length, cutting off, and generating a next sequence by adopting a sliding window; if the sequence does not reach the fixed length, the [ PAD ] is used for filling;
the generated sequence is sent as input to the encoder side, and E = {e_1, e_2, …, e_s} is used as the vector sequence with embedded features;
the vector E is sent into a multi-layer Transformer structure, wherein each layer comprises two parts, one part being multi-head attention and the other a feed-forward layer;
the encoder output finally obtained through the multi-layer Transformer is represented as H = {h_1, h_2, …, h_s}.
Further, the attention mechanism of the interaction module adopts a bidirectional attention flow model, and the working principle comprises:
the similarity score between the i-th article word and the j-th question word is calculated using the dot product, expressed as follows:
S_ij = p_i · q_j^T
where p_i represents the i-th article word, q_j represents the j-th question word, T is the transpose symbol, and the scores S_ij form the matrix S ∈ R^{m×n}, i.e., of dimension m×n;
building the attention of the article to the question and the attention of the question to the article to obtain a question-based article representation:
the similarity scores S_ij form a similarity matrix S, and row normalization of S yields the matrix S_1, expressed as follows:
S_1 = softmax(S)
calculating, for each article word, which of the question words is most relevant to it;
the article's attention to the question highlights the features of the question words, as follows:
A_pq = S_1 · Q
where A_pq represents the attention of the article to the question and Q is the question representation;
taking the maximum over each row and then normalizing yields the matrix S_2, expressed as follows:
S_2 = softmax(max(S))
which indicates which article word is most relevant to some word of the question, proving that this word is important for answering the question;
the attention of the question to the article highlights the features of the article words according to the article words associated with the question words, as follows:
A_qp = S_2 · P
where A_qp is the attention of the question to the article and P is the article representation;
the final question-based article representation is obtained by fusion, expressed as follows:
QP = [P; A_pq; P·A_pq; P·A_qp].
further, the operating principle of the gating mechanism of the interaction module includes:
the article words are spliced respectively with the attention of the article to the question and with the fused question-based article representation, and the weight values are obtained through an activation function, expressed as follows:
z = sigmoid(W_z[P; A_pq] + b_z)
r = sigmoid(W_r[P; QP'] + b_r)
where P represents the article words, A_pq represents the attention of the article to the question, and QP' represents the fused question-based article representation fed into the gating mechanism;
z and r are used respectively to determine the weights of the updated parts, and the extracted features are updated as follows:
hz = (1 - z) ⊙ A_pq + z ⊙ P
hr = (1 - r) ⊙ QP' + r ⊙ P
and taking the average value of the two updated vectors to obtain a final gating mechanism vector, wherein the final gating mechanism vector is expressed as follows:
G=mean(hz+hr)
where G represents the vector obtained after the gating mechanism.
Further, the fusing, by the multi-level residual structure module, the original semantic information with the representation obtained by the attention mechanism and the representation obtained by the gating mechanism respectively includes:
the fine-grained vector representation obtained through the attention mechanism and the gating mechanism simulates the effect of human intensive reading; the vector sequence obtained from the encoder side is used as the coarse-grained vector representation and simulates the result of human skimming;
the first-level residual structure of the article P and the attention representation QP is constructed using a skip connection, expressed as follows:
QP'=ReLU(P+QP)
wherein ReLU is an activation function;
the second-level residual structure of the context vector representation H and the updated representation G obtained through the gating mechanism is established using a skip connection, expressed as follows:
I=ReLU(H+G)
where ReLU is the activation function and I ∈ R^{s×h}, i.e., the dimension of I is s×h;
the resulting I is used to determine the probability of each word in the sequence as a start-stop position.
Further, the predicting, by the answer prediction module, of the answerability of the questions and the answers of the answerable questions includes:
an additional edge loss function is proposed to maximize the Euclidean distance between the answer prediction and the no-answer classification, and the final training loss function L contains three losses, expressed as follows:
L = Loss_ext + Loss_class + Loss_joint
in the reading process, the semantic vector representation I, which finally contains both coarse and fine granularity, is obtained and sent to a fully-connected layer to obtain the start- and end-position representations of each word respectively; during training, a cross-entropy loss function is adopted as the training target, expressed as follows:
where the label terms are the true start- and end-position labels of the i-th question respectively, and N is the number of questions;
for the answerability of questions, a classification task is trained using the context-based vector representation generated by the pre-trained language model; since answerability is a two-class decision, a two-class cross-entropy loss function is adopted in the training process, expressed as follows:
edge loss function joint training is adopted so that the sample moves closer to the answerability direction corresponding to its label and away from the opposite direction, further learning the feature difference between them and making the answer extraction task and the question classification task strongly coupled; the obtained start- and end-position representations are normalized to obtain the probabilities of the answer start and end positions, and their product is taken as the probability of the positive sample, expressed as follows:
P_has_ans = softmax(P'_start · P'_end)
after the vector representation generated by the pre-trained language model is classified, the probability that the question has no answer is obtained and used as the probability of the negative sample, expressed as follows:
P_no_ans = softmax(H)
the edge loss function calculates the distance between the label and the positive and negative sample probabilities, and the Euclidean distance between the label and the negative sample probability is calculated as follows:
d(x, y) = ||x - y||_2
the edge loss function maximizes the distance between the unanswered classification and the answered prediction during training as follows:
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a gating mechanism that filters the associated features after interaction and controls the inflow of important information and the outflow of useless information, so as to govern the information flow and feed it accurately into the output layer to predict the answer;
(2) The invention introduces the idea of the residual structure and builds a new multi-level residual structure module, fusing the representations obtained after article-question interaction with the original semantic information, so that the semantic information is more comprehensive, the understanding of the article is fuller, and network degradation is avoided;
(3) The invention proposes a multi-task joint training method by adding an edge loss function, ensuring the coupling of the classification task and the extraction task. A triplet of the label, the positive example and the negative example is constructed, and the distances between the label and the positive example and between the label and the negative example are calculated through a distance function, so that the distance between the label and the corresponding sample becomes smaller while the distance to the opposite sample becomes larger, further learning the feature differences between them.
Drawings
FIG. 1 is a diagram of an overall model framework of the present invention;
FIG. 2 is a triplet edge loss schematic.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
First, the system defines an article with m words as P = {p_1, p_2, …, p_m} and a question with n words as Q = {q_1, q_2, …, q_n}.
The goal is not only to correctly find the answer to question Q in article P, but also to accurately judge whether the question is answerable. For an answerable question, a start position and an end position are returned, representing that the answer is a continuous piece of text A = {p_start, …, p_end}; for an unanswerable question, a null string is assigned to mark that it has no answer, i.e., A = [].
Referring to fig. 1, the system model mainly includes four modules: the system comprises an article and problem encoding module, an interaction module, a multi-level residual error structure module and a multi-task joint training module.
(1) Article and question coding module: performs context coding on the input article and question; (2) interaction module: highlights the important features of the context information using the attention mechanism and the gating mechanism, and updates the highlighted key features; (3) multi-level residual structure module: fuses the original semantic information with the representation obtained through the attention mechanism and the representation obtained through the gating mechanism, respectively; (4) answer prediction module: predicts the answerability of the question and answers the answerable questions.
1. Article and question coding module
This module first concatenates the question Q and the article P into a fixed-length sequence. The starting position is identified by [CLS], which generally serves as the sentence vector of the whole sequence. Q and P are separated by the identifier [SEP], and the end of P is also marked by [SEP]. If the whole sequence is too long it is truncated, and a sliding window is used to generate the next sequence; if the sequence does not reach the fixed length, it is padded with [PAD]. The generated sequence is sent as input to the encoder side, and E = {e_1, e_2, …, e_s} serves as the vector sequence with embedded features.
The vector E is sent into a multi-layer Transformer structure, in which each layer comprises two parts: multi-head attention and a feed-forward layer. The encoder output finally obtained through the multi-layer Transformer is represented as H = {h_1, h_2, …, h_s}.
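As an illustration of the sequence construction just described, the following is a minimal sketch; the max_len and stride values, the literal token strings and the helper name are illustrative assumptions rather than details taken from the patent.

```python
def build_input_sequences(question_tokens, article_tokens, max_len=384, stride=128):
    """Assemble [CLS] question [SEP] article [SEP] sequences of a fixed length.

    Over-long articles are truncated and split with a sliding window; short
    sequences are padded with [PAD]. max_len and stride are illustrative values.
    """
    header = ["[CLS]"] + question_tokens + ["[SEP]"]
    room = max_len - len(header) - 1          # space left for article tokens plus the trailing [SEP]
    if room <= 0:
        raise ValueError("question is too long for the chosen max_len")
    sequences, start = [], 0
    while True:
        chunk = article_tokens[start:start + room]
        seq = header + chunk + ["[SEP]"]
        seq += ["[PAD]"] * (max_len - len(seq))   # pad short sequences to the fixed length
        sequences.append(seq)
        if start + room >= len(article_tokens):   # the window has reached the article end
            break
        start += stride                            # slide the window to generate the next sequence
    return sequences
```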
2. Interactive module
1) Attention mechanism
This module uses the bi-directional attention flow model proposed by Seo et al. [Seo M, Kembhavi A, Farhadi A, et al. Bidirectional attention flow for machine comprehension [C]// Proceedings of the 5th International Conference on Learning Representations, 2017] to highlight the focus of article and question understanding. First, the similarity score between the i-th article word and the j-th question word is calculated using the dot product:
S_ij = p_i · q_j^T    (1)
where p_i represents the i-th article word, q_j represents the j-th question word, T is the transpose symbol, and the scores S_ij form the matrix S ∈ R^{m×n}, i.e., of dimension m×n.
Then, the attention of the article to the question and the attention of the question to the article are constructed to obtain a question-based article representation. Row normalization of the similarity matrix S (the set of all S_ij) yields the matrix S_1, which computes, for each article word, which question word is most relevant to it, as in equation (2). The attention of the article to the question then highlights the features of the question words, as in equation (3). Similarly, taking the maximum over the rows and then normalizing yields the matrix S_2, as in equation (4), indicating which article word is most relevant to some word of the question and therefore critical for answering it. The attention of the question to the article highlights the features of the article words according to the article words related to the question words, as in equation (5). The final question-based article representation is obtained by fusion, as in equation (6).
S_1 = softmax(S)    (2)
A_pq = S_1 · Q    (3)
S_2 = softmax(max(S))    (4)
A_qp = S_2 · P    (5)
QP = [P; A_pq; P·A_pq; P·A_qp]    (6)
where A_pq represents the attention of the article to the question, Q is the question representation, A_qp is the attention of the question to the article, and P is the article representation.
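The attention computation of equations (1)-(6) can be sketched in PyTorch as follows; this is only an illustrative implementation, and tiling the question-to-article attention vector over the m article positions is our assumption, made so that the element-wise products in equation (6) are well defined.

```python
import torch
import torch.nn.functional as F

def bidirectional_attention(P, Q):
    """P: (m, d) article word vectors; Q: (n, d) question word vectors."""
    S = P @ Q.T                                   # (m, n) dot-product similarity, eq. (1)
    S1 = F.softmax(S, dim=1)                      # row-wise normalisation, eq. (2)
    A_pq = S1 @ Q                                 # article-to-question attention, eq. (3), shape (m, d)
    S2 = F.softmax(S.max(dim=1).values, dim=0)    # max over each row, then normalise, eq. (4), shape (m,)
    a_qp = S2 @ P                                 # question-to-article attention, eq. (5), shape (d,)
    A_qp = a_qp.unsqueeze(0).expand_as(P)         # tile over the m article positions (assumption)
    QP = torch.cat([P, A_pq, P * A_pq, P * A_qp], dim=-1)   # fusion, eq. (6), shape (m, 4d)
    return A_pq, QP
```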
2) Gating mechanism
In order to simulate the forgetting and memory updating behaviors of people during reading, a gating mechanism is adopted to update the characteristics after the attention mechanism.
The article words are spliced respectively with the attention of the article to the question and with the fused question-based article representation, and the weights are obtained through an activation function, expressed as follows:
z = sigmoid(W_z[P; A_pq] + b_z)
r = sigmoid(W_r[P; QP'] + b_r)    (7)
where P represents the article words, A_pq represents the attention of the article to the question, and QP' represents the fused question-based article representation fed into the gating mechanism.
z and r are used respectively to determine the weights of the updated parts, and the extracted features are updated as follows:
hz = (1 - z) ⊙ A_pq + z ⊙ P
hr = (1 - r) ⊙ QP' + r ⊙ P    (8)
and taking the average value of the two updated vectors to obtain a final gating mechanism vector, wherein the final gating mechanism vector is expressed as follows:
G=mean(hz+hr) (9)
where G represents the vector obtained after the gating mechanism.
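A minimal sketch of the gating update of equations (7)-(9) follows; holding W_z, b_z, W_r, b_r in nn.Linear layers and assuming that P, A_pq and QP' all have shape (m, d) are illustrative choices rather than details fixed by the patent.

```python
import torch
import torch.nn as nn

class GatedUpdate(nn.Module):
    """Gating update: weights z and r, updates hz and hr, and their average G."""

    def __init__(self, d):
        super().__init__()
        self.Wz = nn.Linear(2 * d, d)   # holds W_z and b_z
        self.Wr = nn.Linear(2 * d, d)   # holds W_r and b_r

    def forward(self, P, A_pq, QP_prime):
        z = torch.sigmoid(self.Wz(torch.cat([P, A_pq], dim=-1)))        # eq. (7)
        r = torch.sigmoid(self.Wr(torch.cat([P, QP_prime], dim=-1)))
        hz = (1 - z) * A_pq + z * P                                      # eq. (8)
        hr = (1 - r) * QP_prime + r * P
        G = (hz + hr) / 2    # average of the two updated vectors, eq. (9)
        return G
```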
3. Multistage residual structure module
When people read, they usually adopt two reading modes: skimming and intensive reading. Therefore, the fine-grained vector representation obtained through the attention mechanism and the gating mechanism is used to simulate the effect of human intensive reading, while the vector sequence obtained from the encoder side is used as the coarse-grained vector representation to simulate the result of human skimming. In order to ensure the integrity of the information, guarantee that it is smoothly transmitted to the next layer, and avoid network degradation, a multi-level residual structure is proposed to connect the attention mechanism and the gating mechanism respectively.
First, the first-level residual structure of the article P and the attention representation QP is constructed using a skip connection, as in equation (10). Then, through the gating mechanism, the second-level residual structure of the context vector representation H and the updated representation G is established using a skip connection, as in equation (11). The resulting I is used to determine the probability of each word in the sequence being the start or end position. This differs from previous methods that obtain the probabilities only from the question-based article representation. The method better integrates the original information, obtains the semantic information of the key parts, and helps locate and accurately extract the answer from both coarse and fine granularities.
QP' = ReLU(P + QP)    (10)
I = ReLU(H + G)    (11)
where ReLU is the activation function and I ∈ R^{s×h}, i.e., the dimension of I is s×h.
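The two residual fusions of equations (10) and (11) can be sketched as follows; the linear projection that brings the 4d-dimensional QP of equation (6) back to d dimensions before the addition with P is our assumption, since the patent does not spell out how the dimensions are matched.

```python
import torch
import torch.nn as nn

class MultiLevelResidual(nn.Module):
    """First-level residual over (P, QP) and second-level residual over (H, G)."""

    def __init__(self, d):
        super().__init__()
        self.proj = nn.Linear(4 * d, d)   # assumed projection so that P + QP is well defined

    def first_level(self, P, QP):
        return torch.relu(P + self.proj(QP))   # QP' = ReLU(P + QP), eq. (10)

    def second_level(self, H, G):
        return torch.relu(H + G)               # I = ReLU(H + G), eq. (11)
```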
4. Answer prediction module
The objective function typically includes an extraction task and a classification task. On this basis, the model proposes an additional edge loss function that maximizes the Euclidean distance between answer predictions and no answer classifications. The final trained loss function contains three losses, as shown in equation (12), each of which is explained in detail below.
L = Loss_ext + Loss_class + Loss_joint    (12)
where Loss_ext is the answer-extraction loss, Loss_class is the answerability-classification loss, and Loss_joint is the edge loss used for joint training.
1) Answer extraction
Through the reading process, the semantic vector representation I, which finally contains both coarse and fine granularity, is obtained and sent to a fully-connected layer to obtain the start- and end-position representations of each word. During training, a cross-entropy loss function is used as the training target, as in equation (13), where the label terms are the true start- and end-position labels of the i-th question, respectively, and N is the number of questions.
2) Problem classification
For the answerability of questions, a classification task is trained using the context-based vector representation generated by the pre-trained language model. Since answerability is a binary decision, the two-class cross-entropy loss function is used during training, as in equation (14), where y'_i is the predicted answerability of the i-th question, y_i is the labeled answerability of the i-th question, and N is the number of questions.
3) Joint training
Referring to fig. 2, in order to keep the answers given by the answer extraction task and the question classification task logically consistent, edge-loss-function joint training is adopted, so that the sample moves closer to the answerability direction corresponding to its label and away from the opposite direction, further learning the feature differences between them and making the answer extraction task and the question classification task strongly coupled. The obtained start- and end-position representations are normalized to obtain the probabilities of the answer start and end positions, and their product is taken as the probability of the positive sample, as in equation (15). Meanwhile, after the vector representation generated by the pre-trained language model is classified, the probability that the question has no answer is obtained and used as the probability of the negative sample, as in equation (16). The edge loss function calculates the distances between the label and the positive- and negative-sample probabilities using the Euclidean distance, as in equation (17), with margin defaulting to 1. During training, the edge loss function maximizes the distance between the no-answer classification and the answer prediction, as in equation (18).
P_has_ans = softmax(P'_start · P'_end)    (15)
P_no_ans = softmax(H)    (16)
d(x, y) = ||x - y||_2    (17)
where x and y do not refer to any particular values; the formula merely defines the Euclidean distance used for these comparisons.
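The three-part loss of equations (12)-(18) can be sketched as follows. Because the equation images for (13), (14) and (18) are not reproduced in this text, the exact forms below, in particular the triplet-style margin term and the convention that unanswerable questions use position 0 and class index 0, are assumptions consistent with the description rather than the patent's exact formulas.

```python
import torch
import torch.nn.functional as F

def joint_loss(start_logits, end_logits, start_pos, end_pos,
               cls_logits, has_answer, margin=1.0):
    """L = Loss_ext + Loss_class + Loss_joint, a sketch of eqs. (12)-(18).

    start_logits, end_logits: (N, s) position scores from the fully-connected layer;
    start_pos, end_pos: (N,) gold positions (assumed to be 0, the [CLS] slot, when unanswerable);
    cls_logits: (N, 2) answerability scores (class 0 assumed to mean "no answer");
    has_answer: (N,) 0/1 answerability labels.
    """
    # Eq. (13): cross entropy over the start and end positions (answer extraction).
    loss_ext = F.cross_entropy(start_logits, start_pos) + F.cross_entropy(end_logits, end_pos)

    # Eq. (14): two-class cross entropy for answerability classification.
    loss_class = F.cross_entropy(cls_logits, has_answer)

    # Eqs. (15)-(16): probability of the has-answer outcome (product of the start and
    # end probabilities) and probability of the no-answer outcome.
    p_start = F.softmax(start_logits, dim=-1).gather(1, start_pos.unsqueeze(1)).squeeze(1)
    p_end = F.softmax(end_logits, dim=-1).gather(1, end_pos.unsqueeze(1)).squeeze(1)
    p_has_ans = p_start * p_end
    p_no_ans = F.softmax(cls_logits, dim=-1)[:, 0]

    # Eqs. (17)-(18): triplet-style margin ("edge") loss. The outcome matching the label
    # plays the positive role, the opposite outcome the negative role, and the 1-D
    # Euclidean distance to the target probability 1 reduces to an absolute value.
    pos = torch.where(has_answer.bool(), p_has_ans, p_no_ans)
    neg = torch.where(has_answer.bool(), p_no_ans, p_has_ans)
    d_pos = torch.abs(1.0 - pos)
    d_neg = torch.abs(1.0 - neg)
    loss_joint = F.relu(d_pos - d_neg + margin).mean()

    return loss_ext + loss_class + loss_joint
```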
To verify the effect of the model of the invention, we performed comparative verification as shown in the following table:
thus, in combination with the above, it can be seen that:
the invention provides a multi-stage residual error structure module which is built by adding an edge loss function to perform multi-task combined training. The most important are the following three points:
(1) The invention provides a gating mechanism that filters the associated features after interaction and controls the inflow of important information and the outflow of useless information, so as to govern the information flow and feed it accurately into the output layer to predict the answer;
(2) The invention introduces the idea of the residual structure and builds a new multi-level residual structure module, fusing the representations obtained after article-question interaction with the original semantic information, so that the semantic information is more comprehensive, the understanding of the article is fuller, and network degradation is avoided;
(3) The invention proposes a multi-task joint training method by adding an edge loss function, ensuring the coupling of the classification task and the extraction task. A triplet of the label, the positive example and the negative example is constructed, and the distances between the label and the positive example and between the label and the negative example are calculated through a distance function, so that the distance between the label and the corresponding sample becomes smaller while the distance to the opposite sample becomes larger, further learning the feature differences between them.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A multi-task joint training machine reading and understanding method based on a gating mechanism, characterized by comprising the following steps:
the method comprises the steps of performing context coding on input articles and questions through an article and question coding module;
through the interaction module, important characteristics of the context information are highlighted by adopting an attention mechanism and a gating mechanism, and the highlighted key characteristics are updated;
the original semantic information is respectively fused with the representation obtained through the attention mechanism and the representation obtained through the gating mechanism through a multi-level residual error structure module;
predicting the answerability of the questions and the answers of the answerable questions through an answer prediction module;
the method for fusing the original semantic information with the representation obtained by the attention mechanism and the representation obtained by the gating mechanism through the multi-level residual error structure module comprises the following steps:
the fine-grained vector representation obtained through the attention mechanism and the gating mechanism simulates the effect of human intensive reading; the vector sequence obtained from the encoder side is used as the coarse-grained vector representation and simulates the result of human skimming;
the first-level residual structure of the article P and the attention representation QP is constructed using a skip connection, expressed as follows:
QP′=ReLU(P+QP)
wherein ReLU is an activation function;
through the gating mechanism, the second-level residual structure of the context vector representation H and the vector G obtained through the gating mechanism is established using a skip connection, expressed as follows:
I=ReLU(H+G)
where ReLU is the activation function and I ∈ R^{s×h}, i.e., the dimension of I is s×h;
the finally obtained I is used for determining the probability of each word in the sequence as a start-stop position;
the method for predicting the answerability of the questions and the answers of the answerable questions comprises the following steps:
an additional edge loss function is proposed to maximize the Euclidean distance between the answer prediction and the no-answer classification, and the final training loss function L contains three losses, expressed as follows:
L = Loss_ext + Loss_class + Loss_joint
in the reading process, the semantic vector representation I, which finally contains both coarse and fine granularity, is obtained and sent to a fully-connected layer to obtain the start- and end-position representations of each word respectively; during training, a cross-entropy loss function is adopted as the training target, expressed as follows:
where the label terms are the true start- and end-position labels of the i-th question respectively, and N is the number of questions;
for the answerability of questions, a classification task is trained using the context-based vector representation generated by the pre-trained language model; since answerability is a two-class decision, a two-class cross-entropy loss function is adopted in the training process, expressed as follows:
edge loss function joint training is adopted so that the sample moves closer to the answerability direction corresponding to its label and away from the opposite direction, further learning the feature difference between them and making the answer extraction task and the question classification task strongly coupled; the obtained start- and end-position representations are normalized to obtain the probabilities of the answer start and end positions, and their product is taken as the probability of the positive sample, expressed as follows:
P_has_ans = softmax(P'_start · P'_end)
after the vector representation generated by the pre-trained language model is classified, the probability that the question has no answer is obtained and used as the probability of the negative sample, expressed as follows:
P_no_ans = softmax(H)
the edge loss function calculates the distance between the label and the positive and negative sample probabilities, and the Euclidean distance between the label and the negative sample probability is calculated as follows:
d(x, y) = ||x - y||_2
the edge loss function maximizes the distance between the unanswered classification and the answered prediction during training as follows:
2. The multi-task joint training machine reading and understanding method based on a gating mechanism according to claim 1, wherein the context encoding of the input articles and questions by the article and question encoding module comprises:
an article with m words is defined as P = {p_1, p_2, …, p_m}, and a question with n words as Q = {q_1, q_2, …, q_n};
Splicing the problem Q and the article P into a fixed-length sequence: the starting position is marked by [ CLS ] and is used as sentence vector of the whole sequence;
the question Q and the article P are separated by an identifier [ SEP ], and the end of the article P is also identified by [ SEP ];
for the length of the whole sequence, if the sequence exceeds a fixed length, cutting off, and generating a next sequence by adopting a sliding window; if the sequence does not reach the fixed length, the [ PAD ] is used for filling;
the generated sequence is sent as input to the encoder side, and E = {e_1, e_2, …, e_s} is used as the vector sequence with embedded features;
the vector E is sent into a multi-layer Transformer structure, wherein each layer comprises two parts, one part being multi-head attention and the other a feed-forward layer;
the encoder output finally obtained through the multi-layer Transformer is represented as H = {h_1, h_2, …, h_s}.
3. The multi-task joint training machine reading and understanding method based on a gating mechanism according to claim 1, wherein the attention mechanism of the interaction module adopts a bidirectional attention flow model, and its working principle comprises:
the similarity score between the i-th article word and the j-th question word is calculated using the dot product, expressed as follows:
S_ij = p_i · q_j^T
where p_i represents the i-th article word, q_j represents the j-th question word, T is the transpose symbol, and the scores S_ij form the matrix S ∈ R^{m×n}, i.e., of dimension m×n;
building the attention of the article to the question and the attention of the question to the article to obtain a question-based article representation:
the similarity scores S_ij form a similarity matrix S, and row normalization of S yields the matrix S_1, expressed as follows:
S_1 = softmax(S)
calculating, for each article word, which of the question words is most relevant to it;
the article's attention to the question highlights the features of the question words, as follows:
A_pq = S_1 · Q
where A_pq represents the attention of the article to the question and Q is the question representation;
taking the maximum over each row and then normalizing yields the matrix S_2, expressed as follows:
S_2 = softmax(max(S))
which indicates which article word is most relevant to some word of the question, proving that this article word is important for answering the question;
the attention of the question to the article highlights the features of the article words according to the article words associated with the question words, as follows:
A_qp = S_2 · P
where A_qp is the attention of the question to the article and P is the article representation;
the final question-based article representation is obtained by fusion, expressed as follows:
QP = [P; A_pq; P·A_pq; P·A_qp].
4. The multi-task joint training machine reading and understanding method based on a gating mechanism according to claim 1, wherein the working principle of the gating mechanism of the interaction module comprises the following steps:
the article words are spliced respectively with the attention of the article to the question and with the fused question-based article representation, and the weight values are obtained through an activation function, expressed as follows:
z = sigmoid(W_z[P; A_pq] + b_z)
r = sigmoid(W_r[P; QP′] + b_r)
where P represents the article words, A_pq represents the attention of the article to the question, and QP′ represents the fused question-based article representation fed into the gating mechanism;
z and r are used respectively to determine the weights of the updated parts, and the extracted features are updated as follows:
hz = (1 - z) ⊙ A_pq + z ⊙ P
hr = (1 - r) ⊙ QP′ + r ⊙ P
and taking the average value of the two updated vectors to obtain a final gating mechanism vector, wherein the final gating mechanism vector is expressed as follows:
G=mean(hz+hr)
where G represents the vector obtained after the gating mechanism.
CN202310112991.1A 2023-02-14 2023-02-14 Multi-task combined training machine reading and understanding method based on gating mechanism Active CN116108153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310112991.1A CN116108153B (en) 2023-02-14 2023-02-14 Multi-task combined training machine reading and understanding method based on gating mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310112991.1A CN116108153B (en) 2023-02-14 2023-02-14 Multi-task combined training machine reading and understanding method based on gating mechanism

Publications (2)

Publication Number Publication Date
CN116108153A CN116108153A (en) 2023-05-12
CN116108153B true CN116108153B (en) 2024-01-23

Family

ID=86253963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310112991.1A Active CN116108153B (en) 2023-02-14 2023-02-14 Multi-task combined training machine reading and understanding method based on gating mechanism

Country Status (1)

Country Link
CN (1) CN116108153B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807512B (en) * 2020-06-12 2024-01-23 株式会社理光 Training method and device for machine reading understanding model and readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112269868A (en) * 2020-12-21 2021-01-26 中南大学 Use method of machine reading understanding model based on multi-task joint training
CN114398976A (en) * 2022-01-13 2022-04-26 福州大学 Machine reading understanding method based on BERT and gate control type attention enhancement network
CN114861627A (en) * 2022-04-08 2022-08-05 清华大学深圳国际研究生院 Method and model for automatically generating interference item of choice question based on deep learning
CN115080715A (en) * 2022-05-30 2022-09-20 重庆理工大学 Span extraction reading understanding method based on residual error structure and bidirectional fusion attention

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yiming Cui et al. Interactive Gated Decoder for Machine Reading Comprehension. 2022. Full text. *
Zhilin Yang et al. Words or Characters? Fine-grained Gating for Reading Comprehension. https://arxiv.org/pdf/1611.01724.pdf, 2017. Full text. *

Also Published As

Publication number Publication date
CN116108153A (en) 2023-05-12

Similar Documents

Publication Publication Date Title
CN111554268B (en) Language identification method based on language model, text classification method and device
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
CN112270193A (en) Chinese named entity identification method based on BERT-FLAT
CN112347268A (en) Text-enhanced knowledge graph joint representation learning method and device
CN111475655B (en) Power distribution network knowledge graph-based power scheduling text entity linking method
CN116450796B (en) Intelligent question-answering model construction method and device
Zhang et al. Deep relation embedding for cross-modal retrieval
WO2023024412A1 (en) Visual question answering method and apparatus based on deep learning model, and medium and device
CN111753207B (en) Collaborative filtering method for neural map based on comments
CN109933792A (en) Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method
CN115080715B (en) Span extraction reading understanding method based on residual structure and bidirectional fusion attention
CN114037945A (en) Cross-modal retrieval method based on multi-granularity feature interaction
CN114385802A (en) Common-emotion conversation generation method integrating theme prediction and emotion inference
CN112232086A (en) Semantic recognition method and device, computer equipment and storage medium
CN113705191A (en) Method, device and equipment for generating sample statement and storage medium
CN113901188A (en) Retrieval type personalized dialogue method and system
CN114492460A (en) Event causal relationship extraction method based on derivative prompt learning
CN114360584A (en) Phoneme-level-based speech emotion layered recognition method and system
CN115905187B (en) Intelligent proposition system oriented to cloud computing engineering technician authentication
CN116108153B (en) Multi-task combined training machine reading and understanding method based on gating mechanism
CN112651225A (en) Multi-item selection machine reading understanding method based on multi-stage maximum attention
CN111414466A (en) Multi-round dialogue modeling method based on depth model fusion
CN116362242A (en) Small sample slot value extraction method, device, equipment and storage medium
CN115130461A (en) Text matching method and device, electronic equipment and storage medium
CN115238077A (en) Text analysis method, device and equipment based on artificial intelligence and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant