CN116521857A - Question-driven abstractive multi-text answer summarization method and device based on graph enhancement - Google Patents

Question-driven abstractive multi-text answer summarization method and device based on graph enhancement

Info

Publication number
CN116521857A
Authority
CN
China
Prior art keywords
answer
abstract
representation
model
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310346860.XA
Other languages
Chinese (zh)
Inventor
杨鹏
李冰
赵翰林
孙元康
易梦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202310346860.XA
Publication of CN116521857A
Current legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/316Indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a graph-enhancement-based, question-driven abstractive multi-text answer summarization method comprising the following steps. Step 1: collect and clean community question-answering data; Step 2: normalize the data; Step 3: construct the model; Step 4: test the model and generate answer summaries. The method addresses the lack of question constraints and the difficulty of capturing and modeling relations among answers in conventional multi-text answer summary generation, while the graph-encoded information improves the readability, fluency and conciseness of the generated summaries.

Description

Question-driven abstractive multi-text answer summarization method and device based on graph enhancement
Technical Field
The invention relates to a graph-enhancement-based, question-driven abstractive multi-text answer summarization method, and belongs to the technical field of the Internet and artificial intelligence.
Background
Text summarization has long been a core problem of natural language processing and is an effective technique for alleviating the information overload that people face today. In non-factoid community question answering in particular, when a question has multiple correct answers it is difficult for a reader to quickly grasp the complete answer. The answers therefore need to be aggregated by analyzing, refining, correlating and integrating the multiple correct answers under the same question, so that readers can quickly and clearly understand the summarized content. This is a more challenging task than query-based multi-document summarization, because the diversity and redundancy of the answer information must be fully considered under the guidance of the question.
While single-text summarization exhibits strong performance under sequence-to-sequence models, and neural models have been widely applied to news summarization and headline generation, multi-document summarization is different: it requires identifying important information from multiple input sources, filtering redundant information, and integrating what remains to produce a summary that is comprehensive, salient and coherent overall. Simply transferring a high-performance single-text summarization model to multi-document summarization is therefore not effective.
Multi-document summarization is mainly divided into extractive and abstractive methods. Extractive methods rank and select important sentences directly from the original text. Although the results are encouraging, these methods lack a hierarchical structure that accounts for document clustering and cannot effectively model the deep associations between hierarchical features of the original passages, so the resulting summaries lack relevance and fluency. An extractive model only needs to identify information spans and select passages independently, whereas an abstractive model must combine multiple evidence segments from long and noisy multi-document inputs and, on the basis of understanding the documents, use freely chosen words and expressions to generate a correct, concise and convincing summary. With the success of neural approaches, abstractive summarization with an encoder-decoder structure has shown excellent performance, and some researchers have applied it to long-document summarization with significant results. With the emergence of large-scale multi-document summarization datasets, researchers have begun to explore multi-document summarization models, and abstractive summarization remains a research hotspot in this field. The model proposed in this method also belongs to the abstractive category, but unlike conventional abstractive methods it introduces graph information in both the encoder and decoder stages, so that the model learns the relations between documents under the guidance of the inter-document graph, thereby improving performance.
Therefore, the invention introduces a novel graph-enhancement-based, question-driven abstractive multi-text answer summarization method that builds on a pre-trained model for learning language features and studies the role of the question in guiding answer generation during encoding. In the encoding stage, a dual encoding mechanism is designed: in text encoding, the question is used to explicitly constrain the feature information of each answer, ensuring that the model accurately identifies the information in each answer that is closely related to the question; in graph encoding, a multi-answer similarity graph is constructed and encoded to capture and model the relations between answers and eliminate information redundancy. Finally, in the decoding stage, the text-encoded and graph-encoded information are combined and applied to guide summary generation, ensuring the quality of the answer summary.
Disclosure of Invention
In order to solve the problems and shortcomings of the prior art, the invention provides a graph-enhancement-based, question-driven abstractive multi-text answer summarization method, in which the question is used to explicitly constrain each individual answer, ensuring that the model accurately identifies the information in the answers that is closely related to the question. In addition, in the encoding stage, a similarity graph over the multiple answers is constructed, and the graph-encoded information is fused with the text-encoded features to capture and model the relations between answers and eliminate information redundancy. Finally, the graph-encoded information is carried into the decoding stage, and the question and the graph encoding are used to guide the summary generation process, ensuring the informativeness and fluency of the generated summary.
In order to achieve the above object, the technical scheme of the present invention is as follows. The graph-enhancement-based, question-driven abstractive multi-text answer summarization method is characterized by comprising the following steps:
step 1: collecting and cleaning community type question-answering data;
step 2: normalizing the data;
step 3: constructing a model;
step 4: model testing and answer abstract generation.
As an improvement of the present invention, step 1: collect and clean community question-answering data. Web-page HTML text is first crawled from a real community question-answering platform with a crawler, and the data is then cleaned to remove web-page tags and non-text data such as emoticons, symbols and formulas. A minimal cleaning sketch is given below.
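A minimal sketch of step 1, assuming BeautifulSoup is used for tag stripping; the patent does not name a specific library or platform, and the cleaning rule and sample string below are purely illustrative.

```python
import re
from bs4 import BeautifulSoup

def clean_qa_html(html: str) -> str:
    """Strip web-page tags and keep plain answer text (illustrative cleaning rule)."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove script/style blocks that carry no answer content.
    for tag in soup(["script", "style"]):
        tag.decompose()
    text = soup.get_text(separator=" ")
    # Drop emoticons / stray symbols outside basic CJK, ASCII and common punctuation.
    text = re.sub(r"[^\u4e00-\u9fffA-Za-z0-9\s.,;:?!，。；：？！]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

if __name__ == "__main__":
    sample = "<div>答案一：<b>保持规律作息</b> 😀 <script>x=1</script></div>"
    print(clean_qa_html(sample))
```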
As an improvement of the present invention, step 2: data normalization. The cleaned text is normalized by question and answer. Specifically, a rule-based method is used to sort the questions item by item, and the multiple answers corresponding to each question are normalized. Finally, an answer summary is manually written for each question and used as the reference answer summary for computing the model loss during training and testing.
As an improvement of the present invention, step 3: model construction. Using the normalized data processed in step 2, the graph-enhancement-based, question-driven abstractive multi-text answer summarization model is constructed. The question and its multiple corresponding answers are first cleaned and encoded with RoBERTa to generate embedding vectors; a question-driven dual encoding mechanism is then designed for the encoding stage, consisting of a text-encoded representation and a graph-encoded representation. In the text-encoded representation, a Transformer encoder encodes the question and each corresponding answer text to obtain contextual semantic information. In the graph encoding stage, a similarity graph is constructed over the multiple answer passages to eliminate redundant information; on the basis of the Transformer architecture, the graph encoding layer integrates this explicit graph representation into the encoding process through a graph attention mechanism, forming graph-encoded features. This ensures that question-driven answer features are fully mined in the encoding stage and lays the foundation for decoding. In the decoding stage, the text-encoded and graph-encoded representations are integrated into an end-to-end decoding process, and a multi-head hierarchical attention mechanism guides the summary generation process; the components of the decoding module are similar to the Transformer architecture, except that a hierarchical operation is incorporated into the multi-head attention mechanism. Finally, the model is trained with a training loss function. The implementation of this step can be divided into the following sub-steps:
Sub-step 3-1: construct the model input layer. Using pre-trained RoBERTa, each word sequence of the normalized question and its multiple correct answers is converted into a word-vector representation, yielding the mapped question word-vector sequence E^Q and, for each correct answer A_i ∈ {A_1, A_2, …, A_N}, the word-vector sequence E^{A_i}. A minimal embedding sketch follows.
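A minimal sketch of sub-step 3-1. The Hugging Face checkpoint "hfl/chinese-roberta-wwm-ext" is only assumed here as a stand-in for the pre-trained RoBERTa mentioned in the text; variable names and the sample question are illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
roberta = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

def embed(text: str) -> torch.Tensor:
    """Map a question or answer to its word-vector sequence E (seq_len x hidden)."""
    tokens = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        return roberta(**tokens).last_hidden_state.squeeze(0)

question = "怎样才能更快入睡？"
answers = ["保持规律作息，固定起床时间。", "睡前避免摄入咖啡因。"]
E_Q = embed(question)               # question word-vector sequence E^Q
E_A = [embed(a) for a in answers]   # answer word-vector sequences E^{A_i}
```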
Sub-step 3-2: construct the text-encoded representation layer. A multi-layer Transformer encoder is used to extract semantic encodings from the question word-vector sequence E^Q and each answer word-vector sequence E^{A_i}; with E^Q and E^{A_i} as inputs to the encoding layer, contextual semantic representations are obtained by learning.
Each Transformer block consists of two sub-layers, a multi-head attention mechanism and a position-wise feed-forward network, each wrapped with a residual connection and layer normalization:
h' = LayerNorm(x + MHAtt(x)), h = LayerNorm(h' + FFN(h')), FFN(x) = W_2 ReLU(W_1 x + b_1) + b_2,
where LayerNorm is layer normalization, MHAtt denotes the multi-head attention mechanism, FFN is the position-wise feed-forward network, and ReLU is the hidden activation function.
After L Transformer encoder layers, the question and each answer are encoded into their respective outputs, and the encoded outputs of all answers together form the overall answer representation.
Because the proportion of question-related feature information contained in each answer passage is inconsistent, the core answer information needs to be mined further, and since the attention mechanism effectively represents the importance between vectors, multi-head attention is adopted, with multiple "scaled dot-product attention" layers running in parallel:
MultiHead(Q, K, V) = [head_1; head_2; …; head_h] W^O, head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V),
where Attention(Q, K, V) is scaled dot-product attention and the final context representation MultiHead(Q, K, V) is constructed by concatenating the outputs of the different heads. W_i^Q, W_i^K, W_i^V and W^O are learnable parameters. The number of attention heads is h = 8, and the resulting contextual answer representation encodes, at the token level, the key information of each answer passage as driven by the question. A minimal sketch of this text encoder is given below.
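A minimal sketch of sub-step 3-2, assuming PyTorch's built-in Transformer encoder stands in for the L-layer text encoder; the hidden size 768 and h = 8 heads follow values given later in the text, and feeding the question together with the answer is only one simple way to let the question constrain the answer features, not necessarily the patent's exact fusion.

```python
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 768, 8, 6

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
    batch_first=True, activation="relu",
)
text_encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

# E_Q / E_A1 stand for the word-vector sequences from sub-step 3-1 (random placeholders here).
E_Q = torch.randn(1, 32, d_model)       # question word vectors
E_A1 = torch.randn(1, 120, d_model)     # one answer's word vectors

H_Q = text_encoder(E_Q)                 # contextual question representation
# Question-conditioned answer encoding: prepend the question, then keep the answer positions.
H_A1 = text_encoder(torch.cat([E_Q, E_A1], dim=1))[:, E_Q.size(1):]
```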
Sub-step 3-3: construct the graph-encoded representation layer. In the graph-encoded representation, a similarity graph is constructed over the multiple answer passages, which removes redundant information. On top of the Transformer architecture, the graph encoding layer uses a graph attention mechanism to incorporate the explicit graph representation into the encoding process and form graph-encoded features, ensuring that question-driven answer features are fully explored in the encoding stage and laying the foundation for decoding.
Let G denote the graph representation matrix of the input documents, where G[i][j] is the relation weight between passages A_i and A_j. M^A is built on the output of the text-encoded representation; the graph-encoded representation consists of a multi-head pooling attention mechanism, whose structure is shown in Fig. 3, and a graph encoding module.
For each head, the attention score and value projection of the input are first calculated, and an attention probability distribution over all tokens in the passage is then computed for each head, yielding the final passage representation M^A of all answers.
The graph encoding module consists of several graph encoding layers, each composed of a graph multi-head attention mechanism and a two-layer feed-forward network. The answer passage representations M^A and the inter-passage similarity graph matrix G serve as the input of the first graph encoding layer; the output of layer G^{l-1} (l ∈ (1, G_L)) is the input of layer G^l, whose output is computed in turn.
The output of the last layer is the final representation obtained by the graph encoding mechanism, where d_G = d_model / G_h and θ is the standard deviation of the relation representation of the graph structure. The graph similarity matrix coefficients G are added to the weight relations computed between passages so as to establish latent dependencies between answers and identify redundant information, enabling the model to learn improved context-aware passage representations. A small sketch of a graph-weighted attention layer follows.
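A minimal sketch of sub-step 3-3. Reading "adding the graph similarity coefficients to the weight relations" as biasing the inter-passage attention logits with G is an assumption; class and variable names are illustrative, not the patent's formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.h, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, M_A: torch.Tensor, G: torch.Tensor) -> torch.Tensor:
        # M_A: (N, d_model) passage representations; G: (N, N) similarity graph.
        N = M_A.size(0)
        q = self.q_proj(M_A).view(N, self.h, self.d_head).transpose(0, 1)
        k = self.k_proj(M_A).view(N, self.h, self.d_head).transpose(0, 1)
        v = self.v_proj(M_A).view(N, self.h, self.d_head).transpose(0, 1)
        logits = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        logits = logits + G.unsqueeze(0)        # inject inter-passage graph weights
        attn = F.softmax(logits, dim=-1)
        ctx = (attn @ v).transpose(0, 1).reshape(N, -1)
        x = self.norm1(M_A + self.out(ctx))     # graph multi-head attention sub-layer
        return self.norm2(x + self.ffn(x))      # two-layer feed-forward sub-layer

layer = GraphAttentionLayer()
out = layer(torch.randn(4, 768), torch.rand(4, 4))   # 4 answer passages
```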
Sub-step 3-4: construct the decoding layer. The text-encoded and graph-encoded representations are explicitly integrated into an end-to-end decoding process, and a multi-head hierarchical attention mechanism guides the summary generation process. The components of the decoding module are similar to the Transformer architecture, except that a hierarchical operation is added to the multi-head attention mechanism. The specific implementation is as follows:
let the initial input of the decoding layer beWherein Y is l Is the length of the digest generated by decoding. Decoding layer D l The output of (2) is +.>And the former layer->The output of (2) isD l The input of the layer is made,
then, the key information of paragraph level and word level is obtained through the multi-head part level attention mechanism, and the key weight of paragraph level is utilized to normalize the information weight of word level, so as to obtain the core information representation of words under different paragraph levels,
the multi-header hierarchical attention mechanism consists of two main components: paragraph-level multi-headed attention mechanisms and word-level multi-headed attention mechanisms,
the paragraph level multi-head attention mechanism focuses on the paragraph level representation obtained in graphic encoding By using an explicit graph structure to regularize the attention distribution to guide the word generation process of the current stage,
the main purpose of the word-level multi-headed attention mechanism is to guide the word generation process in the decoding of information-based representations, normalizing the weight distribution of the words in each paragraph by the weight distribution obtained at the paragraph level used in the current decoding stage, in which process the words are locally normalized, the decoding stage captures token context information in each paragraph,
wherein the method comprises the steps of d W =d model /W h ,W h The number of heads in the multi-head attention mechanism representing word level,/->Weight sum representing multiple words in all paragraphs, executing each head +.>In the context vector representation of (c), normalize the paragraph-level weight coefficients with the word-level weight coefficients to provide dual guidance for the digest generation process using the graphical encoded representation and the text encoded representation,
finally, parallel and linear operations are performed on the word-level context vector and paragraph-level context vector representation, and then the output of the final decoding layer is obtained through feed-forward network and layer normalization
Wherein the method comprises the steps ofFor the post-concatenation vector representation,
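A minimal sketch of sub-step 3-4. Reading "multi-head hierarchical attention" as paragraph-level attention weights re-scaling word-level attention weights is an assumption; all names (class, tensors, shapes) are illustrative rather than the patent's exact decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalCrossAttention(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.word_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.para_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, dec_state, word_mem, para_mem, para_of_word):
        # dec_state: (B, T, d) decoder states; word_mem: (B, W, d) token-level text encoding;
        # para_mem: (B, P, d) graph-encoded paragraph representations;
        # para_of_word: (B, W) index of the paragraph each token belongs to.
        para_ctx, para_w = self.para_attn(dec_state, para_mem, para_mem)  # paragraph-level guidance
        _, word_w = self.word_attn(dec_state, word_mem, word_mem)
        # Re-scale each token's weight by the weight of its paragraph, then renormalize locally.
        gather = para_w.gather(-1, para_of_word.unsqueeze(1).expand_as(word_w))
        word_w = F.normalize(word_w * gather, p=1, dim=-1)
        word_ctx = word_w @ word_mem
        fused = self.out(torch.cat([word_ctx, para_ctx], dim=-1))  # parallel + linear combination
        return self.norm(dec_state + fused)

B, T, W, P, d = 1, 5, 40, 4, 768
layer = HierarchicalCrossAttention()
y = layer(torch.randn(B, T, d), torch.randn(B, W, d), torch.randn(B, P, d),
          torch.randint(0, P, (B, W)))
```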
in the substep 3-5, training model loss is performed, all parameters are initialized in a random initialization mode, gradient counter propagation is performed by adopting an Adam optimizer to update model parameters, an initial learning rate is set to 0.002, a warm-up step length is set to 10000, a decoding part learning rate is set to 0.2, a warm-up step length is set to 8000, training epoch of the model is 100, dropout of all linear layers is 0.1, the maximum norm of the gradient is set to 2, hidden dimension in the model is set to 768, the number of layers of coding and decoding modules is 8, the number of text coding layers in a coding stage is set to 6 layers, the number of graphics coding layers is set to 2 layers, 5 optimal checkpoints are selected according to the performance of a verification set, the average result of the test set is reported, in a generation stage, beam search with a beam size of 5 is performed, model training is finished, and the model with the best performance on the verification set is saved.
In step 4, model testing and answer summary generation, the model with the minimum loss value from step 3 is used as the best validated model to generate summaries for the data whose answers are to be summarized. The generated answer summaries are objectively compared against the reference answer summaries: ROUGE is used to compute the word overlap between the reference and generated summaries, and the longest common subsequence is used to verify the accuracy of the generated summaries. An evaluation sketch follows.
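An evaluation sketch for step 4, assuming the `rouge-score` package; the patent only specifies ROUGE word overlap and the longest common subsequence, so ROUGE-1/2 and ROUGE-L are computed here on illustrative strings.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)

reference = "take a warm shower and keep a fixed bedtime to fall asleep faster"
generated = "keep a fixed bedtime and take a warm shower before sleeping"

scores = scorer.score(reference, generated)
for name, s in scores.items():
    print(f"{name}: precision={s.precision:.3f} recall={s.recall:.3f} f1={s.fmeasure:.3f}")
```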
A device for the graph-enhancement-based, question-driven abstractive multi-text answer summarization method comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements the above graph-enhancement-based, question-driven abstractive multi-text answer summarization method.
Compared with the prior art, the invention has the following advantages:
(1) The invention introduces a graph-enhancement-based, question-driven abstractive multi-text answer summarization technique that is fully combined with a pre-trained model for learning language features through encoding, and studies the role of the question in guiding answer generation during the encoding stage. In the encoding stage, a dual encoding mechanism is designed: in text encoding, the question explicitly constrains the feature information of each answer, ensuring that the model accurately identifies the information in the answers that is closely related to the question; in graph encoding, a multi-answer similarity graph is constructed and encoded to capture and model the relations between answers and eliminate information redundancy.
(2) The method carries the graph-encoded information into the decoding stage and uses the question and the graph encoding to guide the summary generation process, ensuring the informativeness and fluency of the generated summary.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is an overall model diagram of an embodiment of the present invention;
FIG. 3 is a diagram of the multi-head pooling attention mechanism in accordance with an embodiment of the present invention;
fig. 4 is a diagram illustrating key information in a multi-text answer abstract according to an embodiment of the invention.
Detailed Description
The invention is further illustrated below in conjunction with specific embodiments in order to enhance the understanding and appreciation of the invention.
Example 1: the method is a graph-enhancement-based, question-driven abstractive multi-text answer summarization method. First, community question-answering data from the Internet is collected and cleaned. The input is then represented with a pre-trained model that learns language features through encoding. In the encoding stage, the role of the question in guiding answer generation is studied and a dual encoding mechanism is designed: in the text encoding mechanism, the question explicitly constrains the feature information of each answer, ensuring that the model accurately identifies the information in the answers that is closely related to the question; in the graph encoding mechanism, a multi-answer similarity graph is constructed and encoded to capture and model the relations between answers, thereby eliminating information redundancy. Finally, in the decoding stage, the text-encoded and graph-encoded information are combined and applied to guide summary generation, ensuring the quality of the answer summary. Referring to fig. 2, the detailed implementation steps are as follows:
step 1: the method comprises the steps of collecting and cleaning community type question and answer data, firstly crawling webpage html text from a real community type question and answer platform according to a crawler technology, and cleaning the data to remove webpage labels and non-text data including expressions, symbols, formulas and the like.
Step 2: data normalization. The cleaned text is normalized by question and answer. Specifically, a rule-based method is used to sort the questions item by item, and the multiple answers corresponding to each question are normalized. Finally, an answer summary is manually written for each question and used as the reference answer summary for computing the model loss during training and testing.
Step 3: model construction. Using the normalized data from step 2, the graph-enhancement-based, question-driven abstractive multi-text answer summarization model is constructed. The implementation of this step is divided into the following sub-steps:
Sub-step 3-1: construct the model input layer. Using pre-trained RoBERTa, each word sequence of the normalized question and its multiple correct answers is converted into a word-vector representation, yielding the mapped question word-vector sequence E^Q and, for each correct answer A_i ∈ {A_1, A_2, …, A_N}, the word-vector sequence E^{A_i}.
Sub-step 3-2: construct the text-encoded representation layer. This implementation uses a multi-layer Transformer encoder to extract semantic encodings from the question word-vector sequence E^Q and each answer word-vector sequence E^{A_i}; with E^Q and E^{A_i} as inputs to the encoding layer, contextual semantic representations are obtained by learning.
Each Transformer block consists of two sub-layers, a multi-head attention mechanism and a position-wise feed-forward network, each wrapped with a residual connection and layer normalization, where LayerNorm is layer normalization, MHAtt denotes the multi-head attention mechanism, FFN is the position-wise feed-forward network, and ReLU is the hidden activation function.
After L Transformer encoder layers, the question and each answer are encoded into their respective outputs, and the encoded outputs of all answers together form the overall answer representation.
Because the proportion of question-related feature information contained in each answer passage is inconsistent, the core answer information needs to be mined further, and since the attention mechanism effectively represents the importance between vectors, multi-head attention is adopted, with multiple "scaled dot-product attention" layers running in parallel:
MultiHead(Q, K, V) = [head_1; head_2; …; head_h] W^O,
where Attention(Q, K, V) is scaled dot-product attention, the final context representation MultiHead(Q, K, V) is constructed by concatenating the outputs of the different heads, and W_i^Q, W_i^K, W_i^V and W^O are learnable parameters. The number of attention heads is h = 8, and the resulting contextual answer representation encodes, at the token level, the key information of each answer passage as driven by the question.
Sub-step 3-3: construct the graph-encoded representation layer. In the graph-encoded representation, a similarity graph is constructed over the multiple answer passages, which removes redundant information. On top of the Transformer architecture, the graph encoding layer uses a graph attention mechanism to incorporate the explicit graph representation into the encoding process and form graph-encoded features. This ensures that question-driven answer features are fully explored in the encoding stage and lays the foundation for decoding.
Let G denote the graph representation matrix of the input documents, where G[i][j] is the relation weight between passages A_i and A_j. M^A is built on the output of the text-encoded representation. The graph-encoded representation consists of a multi-head pooling attention mechanism, whose structure is shown in fig. 3, and a graph encoding module.
For each head, the attention score and value projection of the input are first calculated, and an attention probability distribution over all tokens in the passage is then computed for each head, yielding the final passage representation M^A of all answers.
The graph encoding module is composed of several graph encoding layers, each consisting of a graph multi-head attention mechanism and a two-layer feed-forward network. The answer passage representations M^A and the inter-passage similarity graph matrix G serve as the input of the first graph encoding layer; the output of layer G^{l-1} (l ∈ (1, G_L)) is the input of layer G^l, whose output is computed in turn.
The output of the last layer is the final representation obtained by the graph encoding mechanism, where d_G = d_model / G_h and θ is the standard deviation of the relation representation of the graph structure. The graph similarity matrix coefficients G are added to the weight relations computed between passages so as to establish latent dependencies between answers and identify redundant information, enabling the model to learn improved context-aware passage representations.
Sub-step 3-4: construct the decoding layer. The text-encoded and graph-encoded representations are explicitly integrated into an end-to-end decoding process, and a multi-head hierarchical attention mechanism guides the summary generation process. The components of the decoding module are similar to the Transformer architecture, except that a hierarchical operation is added to the multi-head attention mechanism. The specific implementation is as follows:
let the initial input of the decoding layer beWherein Y is l Is the length of the digest generated by decoding. Decoding layer D l The output of (2) is +.>And the former layer->The output of (2) is D l Layer input.
Then, it is necessary to obtain the key information of paragraph level and word level through the multi-head part level attention mechanism, and normalize the information weight of word level by using the key weight of paragraph level, so as to obtain the core information representation in words at different paragraph levels.
The multi-head hierarchical attention mechanism consists of two main components: a paragraph-level multi-head attention mechanism and a word-level multi-head attention mechanism.
The paragraph-level multi-head attention mechanism attends to the paragraph-level representations obtained from graph encoding, using the explicit graph structure to regularize the attention distribution and guide the word generation process of the current step.
The main purpose of the word-level multi-head attention mechanism is to guide the word generation process during decoding over the informative representations. The weight distribution of the words in each paragraph is normalized by the paragraph-level weight distribution used at the current decoding step, so the words are locally normalized within each paragraph and the decoder captures token-level context information in each paragraph.
Here d_W = d_model / W_h, where W_h represents the number of heads in the word-level multi-head attention mechanism, and the paragraph-level weights are combined with the word weights across all paragraphs. Within each head's context-vector representation, the paragraph-level weight coefficients normalize the word-level weight coefficients, providing dual guidance for summary generation from the graph-encoded and text-encoded representations.
Finally, the word-level and paragraph-level context-vector representations are combined in parallel through a linear operation, and the output of the final decoding layer is obtained via the feed-forward network and layer normalization applied to the concatenated (spliced) vector representation.
Sub-step 3-5: train the model loss. All parameters are initialized randomly, and an Adam optimizer updates the model parameters via gradient back-propagation; the initial learning rate is set to 0.002 with a warm-up of 10,000 steps, and the learning rate of the decoding part is set to 0.2 with a warm-up of 8,000 steps. The model is trained for 100 epochs. Dropout for all linear layers is 0.1. The maximum gradient norm is set to 2, and the hidden dimension of the model is set to 768. The number of encoding and decoding layers is 8, with 6 text-encoding layers and 2 graph-encoding layers in the encoding stage. The 5 best checkpoints are selected according to performance on the validation set and the average result on the test set is reported. In the generation stage, beam search with a beam size of 5 is used. Model training is completed and the best-performing model on the validation set is saved.
Step 4: model testing and answer summary generation. The model with the minimum loss value from step 3 is used as the best validated model to generate summaries for the data whose answers are to be summarized. The generated answer summaries are objectively compared against the reference answer summaries: ROUGE is used to compute the word overlap between the reference and generated summaries, and the longest common subsequence is used to verify the accuracy of the generated summaries.
Based on the same inventive concept, the device for the graph-enhancement-based, question-driven abstractive multi-text answer summarization method of the invention comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; when loaded into the processor, the computer program implements the above graph-enhancement-based, question-driven abstractive multi-text answer summarization method.
It will be appreciated by those skilled in the art that the embodiments described herein are intended to help the reader understand the principles of the invention; the embodiments are merely illustrative and do not limit the scope of the invention, and various equivalent modifications made after reading this disclosure will fall within the scope of the claims of this application.

Claims (6)

1. A graph-enhancement-based, question-driven abstractive multi-text answer summarization method, characterized by comprising the following steps:
step 1: collecting and cleaning community type question-answering data;
step 2: normalizing the data;
step 3: constructing a model;
step 4: model testing and answer abstract generation.
2. The graph-enhancement-based, question-driven abstractive multi-text answer summarization method according to claim 1, wherein step 1: collect and clean community question-answering data; web-page HTML text is first crawled from a real community question-answering platform with a crawler, and the data is then cleaned to remove web-page tags and non-text data including emoticons, symbols and formulas.
3. The graph-enhancement-based, question-driven abstractive multi-text answer summarization method according to claim 1, wherein step 2: data normalization; the cleaned text is normalized by question and answer; specifically, a rule-based method is used to sort the questions item by item and normalize the multiple answers corresponding to each question; finally, an answer summary corresponding to each question is manually written and used as the reference answer summary for computing the model loss during training and testing.
4. The graph-enhancement-based, question-driven abstractive multi-text answer summarization method according to claim 1, wherein step 3: model construction; using the normalized data from step 2, the graph-enhancement-based, question-driven abstractive multi-text answer summarization model is constructed, the implementation of this step being divided into the following sub-steps:
sub-step 3-1, construct the model input layer: using pre-trained RoBERTa, each word sequence of the normalized question and its multiple correct answers is converted into a word-vector representation, yielding the mapped question word-vector sequence E^Q and, for each correct answer A_i ∈ {A_1, A_2, …, A_N}, the word-vector sequence E^{A_i};
sub-step 3-2, construct the text-encoded representation layer: a multi-layer Transformer encoder extracts semantic encodings from the question word-vector sequence E^Q and each answer word-vector sequence E^{A_i}; with E^Q and E^{A_i} as inputs to the encoding layer, contextual semantic representations are obtained by learning;
each Transformer block consists of two sub-layers, a multi-head attention mechanism and a position-wise feed-forward network, each wrapped with a residual connection and layer normalization, where LayerNorm is layer normalization, MHAtt denotes the multi-head attention mechanism, FFN is the position-wise feed-forward network, and ReLU is the hidden activation function;
after L Transformer encoder layers, the question and each answer are encoded into their respective outputs, and the encoded outputs of all answers together form the overall answer representation;
because the proportion of question-related feature information contained in each answer passage is inconsistent, the core answer information needs to be mined at a deeper level, and since the attention mechanism effectively represents the importance between vectors, multi-head attention is used, with multiple "scaled dot-product attention" layers running in parallel:
MultiHead(Q, K, V) = [head_1; head_2; …; head_h] W^O,
where Attention(Q, K, V) is scaled dot-product attention, the final context representation MultiHead(Q, K, V) is constructed by concatenating the outputs of the different heads, W_i^Q, W_i^K, W_i^V and W^O are learnable parameters, the number of attention heads is h = 8, and the resulting contextual answer representation encodes, at the token level, the key information of each answer passage as driven by the question;
sub-step 3-3, construct the graph-encoded representation layer: on the basis of the Transformer architecture, an explicit graph representation is combined into the encoding process by a graph attention mechanism to form graph-encoded features;
let G denote the graph representation matrix of the input documents, where G[i][j] is the relation weight between passages A_i and A_j; M^A is built on the output of the text-encoded representation, and the graph-encoded representation consists of a multi-head pooling attention mechanism and a graph encoding module; for the multi-head pooling attention mechanism, the attention score and value projection of the input are first calculated for each head, and an attention probability distribution over all tokens in the passage is then computed for each head, yielding the final passage representation M^A of all answers;
the graph encoding module consists of several graph encoding layers, each composed of a graph multi-head attention mechanism and a two-layer feed-forward network; the answer passage representations M^A and the inter-passage similarity graph matrix G serve as the input of the first graph encoding layer, the output of layer G^{l-1} (l ∈ (1, G_L)) is the input of layer G^l, and the output of G^l is then computed;
the output of the last layer is the final representation obtained by the graph encoding mechanism, where d_G = d_model / G_h and θ is the standard deviation of the relation representation of the graph structure; the graph similarity matrix coefficients G are added to the weight relations computed between passages to establish latent dependencies between answers and identify redundant information, enabling the model to learn improved context-aware passage representations;
sub-step 3-4, construct the decoding layer: the text-encoded and graph-encoded representations are explicitly integrated into an end-to-end decoding process, and a multi-head hierarchical attention mechanism guides the summary generation process, implemented as follows:
let the initial input of the decoding layer be the embedded summary sequence, where Y_l is the length of the summary generated by decoding; the output of decoding layer D_l is its hidden representation, and the output of the previous layer D_{l-1} serves as the input of layer D_l;
then, paragraph-level and word-level key information is obtained through the multi-head hierarchical attention mechanism, and the paragraph-level key weights are used to normalize the word-level information weights, yielding the core information representation of words under different paragraph levels;
the multi-head hierarchical attention mechanism consists of two main components: a paragraph-level multi-head attention mechanism and a word-level multi-head attention mechanism;
the paragraph-level multi-head attention mechanism attends to the paragraph-level representations obtained from graph encoding, using the explicit graph structure to regularize the attention distribution and guide the word generation process of the current step;
the main purpose of the word-level multi-head attention mechanism is to guide the word generation process during decoding over the informative representations: the weight distribution of the words in each paragraph is normalized by the paragraph-level weight distribution used at the current decoding step, so the words are locally normalized within each paragraph and the decoder captures token-level context information in each paragraph;
here d_W = d_model / W_h, where W_h is the number of heads in the word-level multi-head attention mechanism, and the paragraph-level weights are combined with the word weights across all paragraphs; within each head's context-vector representation, the paragraph-level weight coefficients normalize the word-level weight coefficients, providing dual guidance for summary generation from the graph-encoded and text-encoded representations;
finally, the word-level and paragraph-level context-vector representations are combined in parallel through a linear operation, and the output of the final decoding layer is obtained via the feed-forward network and layer normalization applied to the concatenated vector representation;
sub-step 3-5, train the model loss: all parameters are initialized randomly, and an Adam optimizer performs gradient back-propagation to update the model parameters; the initial learning rate is set to 0.002 with a warm-up of 10,000 steps, the learning rate of the decoding part is set to 0.2 with a warm-up of 8,000 steps, the model is trained for 100 epochs, the dropout of all linear layers is set to 0.1, the maximum gradient norm is set to 2, the hidden dimension of the model is set to 768, and the number of encoding and decoding layers is 8, with 6 text-encoding layers and 2 graph-encoding layers in the encoding stage; the 5 best checkpoints are selected according to performance on the validation set and the average result on the test set is reported; in the generation stage, beam search with a beam size of 5 is used; model training is then finished and the best-performing model on the validation set is saved.
5. The graph-enhancement-based, question-driven abstractive multi-text answer summarization method according to claim 1, wherein in step 4, model testing and answer summary generation, the minimum-loss model trained in step 3 is used as the best validated model to generate summaries for the data whose answers are to be summarized; the generated answer summaries are objectively compared against the reference answer summaries, ROUGE is used to compute the word overlap between the reference and generated summaries, and the longest common subsequence is used to verify the accuracy of the generated summaries.
6. A device for the graph-enhancement-based, question-driven abstractive multi-text answer summarization method according to any one of claims 1 to 5, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements the above graph-enhancement-based, question-driven abstractive multi-text answer summarization method.
CN202310346860.XA 2023-04-03 2023-04-03 Question-driven abstractive multi-text answer summarization method and device based on graph enhancement Pending CN116521857A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310346860.XA CN116521857A (en) 2023-04-03 2023-04-03 Question-driven abstractive multi-text answer summarization method and device based on graph enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310346860.XA CN116521857A (en) 2023-04-03 2023-04-03 Question-driven abstractive multi-text answer summarization method and device based on graph enhancement

Publications (1)

Publication Number Publication Date
CN116521857A true CN116521857A (en) 2023-08-01

Family

ID=87402038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310346860.XA Pending CN116521857A (en) 2023-04-03 2023-04-03 Method and device for abstracting multi-text answer abstract of question driven abstraction based on graphic enhancement

Country Status (1)

Country Link
CN (1) CN116521857A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116975206A (en) * 2023-09-25 2023-10-31 华云天下(南京)科技有限公司 Vertical field training method and device based on AIGC large model and electronic equipment
CN116975206B (en) * 2023-09-25 2023-12-08 华云天下(南京)科技有限公司 Vertical field training method and device based on AIGC large model and electronic equipment
CN118505842A (en) * 2024-07-17 2024-08-16 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN109840287B (en) Cross-modal information retrieval method and device based on neural network
CN110222188B (en) Company notice processing method for multi-task learning and server
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN111241816B (en) Automatic news headline generation method
CN109992669B (en) Keyword question-answering method based on language model and reinforcement learning
CN111858944A (en) Entity aspect level emotion analysis method based on attention mechanism
CN111143563A (en) Text classification method based on integration of BERT, LSTM and CNN
CN111125333B (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN110990555B (en) End-to-end retrieval type dialogue method and system and computer equipment
CN114969304B (en) Method for generating abstract of case public opinion multi-document based on element diagram attention
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN115098634B (en) Public opinion text emotion analysis method based on semantic dependency relationship fusion characteristics
CN114662476B (en) Character sequence recognition method integrating dictionary and character features
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN116450796A (en) Intelligent question-answering model construction method and device
CN115759119B (en) Financial text emotion analysis method, system, medium and equipment
CN116303977B (en) Question-answering method and system based on feature classification
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN116521857A (en) Question-driven abstractive multi-text answer summarization method and device based on graph enhancement
CN113807079A (en) End-to-end entity and relation combined extraction method based on sequence-to-sequence
CN113743095B (en) Chinese problem generation unified pre-training method based on word lattice and relative position embedding
CN117932066A (en) Pre-training-based 'extraction-generation' answer generation model and method
CN114218921A (en) Problem semantic matching method for optimizing BERT
CN116432637A (en) Multi-granularity extraction-generation hybrid abstract method based on reinforcement learning
CN114691836B (en) Text emotion tendentiousness analysis method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination