CN112464657B - Hybrid text abstract generation method, system, terminal and storage medium - Google Patents

Hybrid text abstract generation method, system, terminal and storage medium

- Publication number: CN112464657B (application CN202011429791.1A)
- Authority: CN (China)
- Prior art keywords: sentence, vector, text, word, network
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates (G06F40/20—Natural language analysis)
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars (G06F40/205—Parsing)
- G06N3/045—Combinations of networks (G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08—Learning methods (G06N3/02—Neural networks)
Abstract
The invention provides a hybrid text abstract generation method and system that perform sentence segmentation and word segmentation on an input text and record the index of the sentence in which each word is located; characterize words and then sentences from the segmentation results to obtain sentence vectors; duplicate each sentence vector and tag the two copies with copy and rewrite vector marks, respectively; extract important sentence vectors and decide, according to the vector marks, whether to copy or rewrite; and edit and modify the sentences that need rewriting to obtain the text abstract. The neural networks employed are trained to complete the gradient update of their parameters. A corresponding terminal and a storage medium are also provided. The invention is the first to mix extracted sentences and rewritten sentences in one abstract, distinguishing sentences used directly in the abstract from rewritten sentences through a copy-or-rewrite mechanism. The hierarchical reinforcement learning training method hands the extracted sentences as tasks from a manager to a worker, improving the collaboration between the two networks.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a hybrid text abstract generation method, system, terminal and storage medium based on hierarchical reinforcement learning.
Background
The goal of text summarization is to rewrite a long document into a short, fluent version while preserving its most salient content. With the successful application of neural networks to Natural Language Processing (NLP) tasks, two data-driven branches, extractive and abstractive summarization, stand out among the various approaches. Extractive methods generally select the most salient sentences from the source article as the abstract; their content selection is accurate and the results carry a large amount of information, but redundancy is high because the sentences are not rewritten. In contrast, abstractive methods can generate a more concise abstract through compression and paraphrasing, but existing models are weak at content selection and easily lose key information. The two branches are therefore complementary, which motivates combining their advantages to form an informative and concise summary. Several techniques for merging these two branches are known in the art; most use an extract-then-abstract framework that first extracts the sentences worth summarizing and then abstracts each one. However, because all sentences are indiscriminately compressed and pruned, these techniques suffer from information loss during the abstraction phase: when an entire sentence is crucial, important content may be deleted by mistake, resulting in serious information loss. In addition, lacking an effective reinforcement learning framework to connect the two modules, the training methods of these techniques are not end-to-end.
At present, no description or report of technology similar to the present invention has been found, nor has similar material been collected at home or abroad.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a hybrid text abstract generation method, system, terminal and storage medium based on hierarchical reinforcement learning.
The invention is realized by the following technical scheme.
According to an aspect of the present invention, there is provided a hybrid text summary generation method, including:
performing sentence and word segmentation on an input text, and recording an index of a sentence where each word is located;
carrying out word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector;
copying each sentence vector, and adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks respectively;
extracting important sentence vectors, and making a decision of copying or rewriting according to the vector marks;
editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and training the neural network adopted in the process of extracting important sentence vectors and editing and modifying the sentences to be rewritten to finish gradient updating of all neural network parameters.
Preferably, the sentence and word segmentation of the input text and recording the index of the sentence in which each word is located includes:
for the input text, punctuation marks are used as sentence ending marks to divide sentences;
performing word segmentation on each sentence obtained by sentence segmentation;
and recording the position information of each word in the sentences obtained by word segmentation, wherein the position information indicates which sentence of the input text each word is in.
Preferably, the characterizing words and sentences in sequence on the result of the sentence segmentation and the word segmentation to obtain a sentence vector includes:
using a pre-training language model to express words obtained by word segmentation as vectors;
and averaging the word vectors of each sentence to obtain a sentence vector.
Preferably, the pre-training language model employs a BERT model.
Preferably, the extracting of the important sentence vector includes:
using a pointer network to carry out context expression on the sentence vectors;
calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
and transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
Preferably, the editing and modifying the sentence needing to be rewritten includes:
acquiring corresponding sentence original text from the input text according to the index of the sentence in which each word is positioned;
and encoding and decoding by using a pointer generation network to realize the rewriting of the text.
Preferably, the extracting of the important sentence vector and the editing and modifying of the sentence to be rewritten respectively adopt a pointer network and a pointer generation network; training the neural network adopted in the process by using layered reinforcement learning, wherein the training comprises the following steps:
the operation of two steps of extraction and rewriting is completed in sequence;
evaluating the extraction and rewriting results by using an automatic evaluation index;
and constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
Preferably, the constructing an objective function by using the evaluation as a return, and performing uniform gradient update on parameters of the pointer network and the pointer generation network includes:
the objective function L (θ) is constructed as:
L(θ) = −(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) − b_t)·log p(a_t | c_t; θ)

wherein a_t, c_t, r, R_t, y_t and b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t), the return of the pointer network, represents the current impact of behavior a_t on summary quality; R_t(a_{t+1}), the feedback function of behavior a_t, represents the long-term impact of behavior a_t on summary quality, and λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of r_w(y_t); the behavior function indicates the next behavior of the network, namely which sentence is extracted; the state function represents the current state of the model; the return function evaluates the value of the current behavior a_t; the feedback function evaluates the long-term effect of the current behavior a_t on the subsequent behavior of the model; the edited text composes the sentences of the output summary; the baseline evaluates the value of the current state and reduces the fluctuation of the return function;
iterating the objective function L (θ) until convergence.
Preferably, the baseline is generated using a synchronous actor-critic (A2C) algorithm.
According to another aspect of the present invention, there is provided a hybrid text summary generation system, including:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
the sentence vector acquisition module is used for representing words and sentences in sequence according to the sentence segmentation result and the word segmentation result to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and the updating module trains the process to finish gradient updating of all parameters in the decision module and the text abstract generating module.
According to a third aspect of the present invention, there is provided a terminal comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the computer program, is operable to perform any of the methods described above.
According to a fourth aspect of the invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any of the above.
Due to the adoption of the technical scheme, compared with the prior art, the invention has at least one of the following beneficial effects:
the hybrid text abstract generation method, the hybrid text abstract generation system, the hybrid text abstract generation terminal and the hybrid text abstract generation storage medium can flexibly switch between the copied sentences and the rewritten sentences according to the redundancy, so that the advantages of two branches of the abstract can be effectively combined, and both informativeness and simplicity are considered. In addition, based on layered reinforcement learning, an end-to-end reinforcement method is provided, an extraction module and a rewriting module are connected, the collaboration between the extraction module and the rewriting module is enhanced, and the extraction module and the rewriting module are dynamically adaptive to each other in the training process.
The invention provides a hybrid text abstract generation method, system, terminal and storage medium that construct the framework in two steps: first, salient sentences are extracted from the input article, and a copy-or-rewrite decision mechanism distinguishes the sentences according to redundancy; the final summary is then generated by copying or rewriting the selected sentences accordingly.
Compared with existing common models, the generated abstract is richer in information and simpler in language.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flowchart illustrating a method for generating a hybrid text abstract according to an embodiment of the present invention;
FIG. 2 is a flow chart of a hybrid text summarization generation method according to a preferred embodiment of the present invention;
FIG. 3 is a flow chart of a hybrid text summarization generation method according to another preferred embodiment of the present invention;
FIG. 4 is a diagram illustrating the working process of the hybrid text summarization generation method according to a preferred embodiment of the present invention;
fig. 5 is a schematic diagram illustrating components of a hybrid text summarization system according to an embodiment of the present invention.
Detailed Description
The following examples illustrate the invention in detail. The embodiments are implemented on the premise of the technical scheme of the present invention, and detailed implementation modes and specific operation processes are given. It should be noted that those skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the scope of the present invention.
Fig. 1 is a flowchart of a hybrid text summarization generation method according to an embodiment of the present invention.
As shown in fig. 1, the hybrid text summarization generating method provided in this embodiment may include the following steps:
s100, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence where each word is located;
s200, performing word and sentence characterization on the sentence and word segmentation results in sequence to obtain a sentence vector;
s300, copying each sentence vector, and adding a copy vector and a rewrite vector to the original sentence vector and the copied sentence vector respectively to serve as vector marks;
s400, extracting important sentence vectors, and making a copy or rewrite decision according to the vector marks;
s500, editing and modifying the sentence needing to be rewritten to obtain a text abstract;
s600, training the neural network adopted in the process of extracting important sentence vectors and editing and modifying the sentences needing to be rewritten, and finishing gradient updating of all neural network parameters.
In S100 of this embodiment, performing sentence segmentation and word segmentation on the input text, and recording an index of a sentence in which each word is located preferably includes:
s101, for an input text, taking punctuation marks as sentence ending marks to perform sentence division;
s102, performing word segmentation on each sentence obtained by sentence segmentation;
and S103, recording the position information of each word in the sentences obtained by word segmentation, the position information indicating which sentence of the input text each word is in.
In S200 of this embodiment, performing word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector, preferably including:
s201, representing words obtained by word segmentation as vectors by using a pre-training language model;
s202, averaging the word vectors of each sentence to obtain a sentence vector.
In a specific application example of this embodiment, the pretrained language model preferably uses a BERT model.
In S400 of this embodiment, extracting important sentence vectors preferably includes:
s401, context expression is carried out on the sentence vectors by using a pointer network;
s402, calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
and S403, transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
In S500 of this embodiment, editing and modifying the sentence to be rewritten preferably includes:
s501, acquiring corresponding sentence original texts from input texts according to the index of the sentence where each word is located;
and S502, encoding and decoding by using a pointer generation network to realize the rewriting of the text.
In S600 of this embodiment, extracting important sentence vectors and editing and modifying sentences to be rewritten respectively employ a pointer network and a pointer generation network; preferably, the neural network used in the above process is trained by using hierarchical reinforcement learning, and preferably includes:
s601, sequentially finishing the operations of the two steps of extraction and rewriting;
s602, evaluating the extraction and rewriting results by using an automatic evaluation index;
s603, constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
In a specific application example of this embodiment, constructing an objective function with the evaluation as a return, and performing uniform gradient update on parameters of the pointer network and the pointer generation network preferably includes:
s6031, constructing an objective function L (θ) as:
L(θ) = −(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) − b_t)·log p(a_t | c_t; θ)

wherein a_t, c_t, r, R_t, y_t and b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t), the return of the pointer network, represents the current impact of behavior a_t on summary quality; R_t(a_{t+1}), the feedback function of behavior a_t, represents the long-term impact of behavior a_t on summary quality, and λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of r_w(y_t); the behavior function indicates the next behavior of the network, namely which sentence is extracted; the state function represents the current state of the model; the return function evaluates the current impact of behavior a_t on summary quality; the feedback function evaluates the long-term effect of the current behavior a_t on the subsequent behavior of the model; the edited text composes the sentences of the output summary; the baseline evaluates the value of the current state and reduces the fluctuation of the return function;
s6032, the objective function L (θ) is iterated until convergence.
In a specific application of this embodiment, the baseline is preferably generated using a synchronous actor-critic (A2C) algorithm.
Fig. 2 is a flowchart of a hybrid text summarization generation method according to a preferred embodiment of the present invention.
As shown in fig. 2, the hybrid text summary generation method provided by the preferred embodiment may include the following steps:
step 1, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence where each word is located;
step 2, performing word and sentence characterization on the sentence and word segmentation results in sequence to obtain a sentence vector;
step 3, copying each sentence vector, and adding a copy vector and a rewrite vector as marks to the two copied vectors respectively;
step 4, extracting important sentence vectors by using a Pointer Network (Pointer Network), and making a decision of copying or rewriting according to the vector marks;
step 5, editing and modifying the sentence needing to be rewritten by using a Pointer-generated network (Pointer-Generator) to obtain a text abstract;
and 6, training the pointer network and the pointer generation network used in the process by using layered reinforcement learning to finish gradient updating of all neural network parameters.
As a preferred embodiment, in step 1, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence in which each word is located, includes:
step 1.1, for an input text, carrying out sentence division according to a sentence ending symbol;
step 1.2, for each sentence obtained by sentence segmentation, performing word segmentation based on a blank space;
and 1.3, recording the position information of each word in the sentences obtained by word segmentation. The position information indicates which sentence of the input text each word is in.
As a preferred embodiment, in step 2, performing word and sentence characterization in sequence to obtain a sentence vector, including:
step 2.1, representing words obtained by word segmentation as vectors by using a pre-training language model;
and 2.2, averaging the word vectors of each sentence to obtain a sentence vector.
As a preferred embodiment, the pre-trained language model employs a BERT model.
As a preferred embodiment, in step 4, extracting important sentence vectors by using a Pointer Network (Pointer Network) includes:
step 4.1, using a Pointer Network (Pointer Network) to carry out context expression on the sentence vectors;
step 4.2, calculating the relation between each sentence vector by using an attention model, and calculating the weight which each sentence vector should obtain;
and 4.3, transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
In step 5, as a preferred embodiment, the editing and modifying the sentence to be rewritten using a Pointer-Generator network (Pointer-Generator) includes:
step 5.1, acquiring corresponding sentence original text from the input text according to the index of the sentence where each word is located;
and 5.2, encoding and decoding by using the pointer generation network to realize the rewriting of the text.
As a preferred embodiment, in step 6, the above process is trained using hierarchical reinforcement learning, which includes:
step 6.1, the operation of the two steps of extraction and rewriting is completed in sequence;
step 6.2, evaluating the extraction and rewriting results by using an automatic evaluation index (such as ROUGE);
and 6.3, constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the adopted pointer network and the pointer generation network.
As a preferred embodiment, in step 6.3, an objective function is constructed by taking the evaluation as a return, and the uniform gradient update is performed on the parameters of the adopted pointer network and the pointer generation network, including:
step 6.31, constructing an objective function L (theta) as follows:
L(θ) = −(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) − b_t)·log p(a_t | c_t; θ)

wherein a_t, c_t, r, R_t, y_t and b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t), the return of the pointer network, represents the current impact of behavior a_t on summary quality; R_t(a_{t+1}), the feedback function of behavior a_t, represents the long-term impact of behavior a_t on summary quality, and λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of r_w(y_t); the behavior function indicates the next behavior of the network, namely which sentence is extracted; the state function represents the current state of the model; the return function evaluates the current impact of behavior a_t on summary quality; the feedback function evaluates the long-term effect of the current behavior a_t on the subsequent behavior of the model; the edited text composes the sentences of the output summary; the baseline evaluates the value of the current state and reduces the fluctuation of the return function;
s6.32, iterate the objective function L (θ) until convergence.
As a preferred embodiment, the baseline is generated using a synchronous actor-critic (A2C) algorithm.
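As an illustration, such a baseline can be produced by a small critic (value) network that scores the current state; the PyTorch sketch below is one workable form under assumed dimensions and architecture, not the patent's design:

```python
import torch.nn as nn

class Critic(nn.Module):
    """Value head used as the baseline b_t in a synchronous actor-critic
    (A2C) setup: it estimates the value of the current state c_t.
    The two-layer architecture and dimension are illustrative assumptions."""
    def __init__(self, dim=768):
        super().__init__()
        self.value = nn.Sequential(
            nn.Linear(dim, dim), nn.Tanh(),
            nn.Linear(dim, 1))

    def forward(self, state):                  # state: (batch, dim)
        # b_t; subtracting it from the return reduces gradient variance.
        return self.value(state).squeeze(-1)
```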
Fig. 3 is a flowchart of a hybrid text summarization generation method according to another preferred embodiment of the present invention.
As shown in fig. 3, the hybrid text summarization generating method provided by the preferred embodiment may include the following steps:
step S101: sentence and word segmentation are carried out on the input text, and the index of the sentence where each word is located is recorded:
firstly, for the input text, sentences are divided according to sentence-ending symbols, and each sentence is segmented into words based on spaces. In this process we record, for each word, the position information of which sentence it belongs to.
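A minimal Python sketch of this step (the sentence-ending rule and all names are illustrative assumptions, not the patent's exact procedure):

```python
import re

def split_and_index(text):
    """Split text into sentences and space-separated words, recording
    for each word the index of the sentence it belongs to."""
    # Sentence-ending punctuation marks act as sentence boundaries.
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    words, sent_index = [], []
    for i, sent in enumerate(sentences):
        for w in sent.split():        # space-based word segmentation
            words.append(w)
            sent_index.append(i)      # which sentence this word is in
    return sentences, words, sent_index

sentences, words, sent_index = split_and_index(
    "The cat sat. It was warm! Everyone smiled.")
# sent_index -> [0, 0, 0, 1, 1, 1, 2, 2]
```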
Step S102: utilizing a layered BERT representation mechanism, words and then sentences are characterized in sequence on the sentence segmentation and word segmentation results of the input text to obtain sentence vectors:
In this embodiment a BERT model is used, and this step provides a hierarchical BERT representation method in which sentence coding is based on a two-level BERT network. First, the entire article is fed into a pre-trained BERT model (i.e., a pre-trained language model) so that every word has a document-wide context, which helps represent its meaning more accurately. Then, word representations are obtained by merging the hidden vectors of the last four layers of the BERT network and passing them into a multi-layer perceptron. This step injects the context and word positions of the whole article into the word vectors.
A preliminary representation of each sentence is obtained by performing an average pooling operation on its word vectors.

Then, to embed the sentence position information and the sentence-level context into the representation, the preliminary representations are further input into a single-layer BERT, resulting in the final sentence vector h_i.
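The sketch below illustrates this hierarchical representation in PyTorch with the Hugging Face transformers library, under several stated assumptions: the last four hidden layers are merged by summation, the "single-layer BERT" is stood in for by one Transformer encoder layer, and token-to-sentence alignment ignores truncation:

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
mlp = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 768))
# Stand-in for the single-layer sentence-level BERT.
sent_encoder = nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True)

def sentence_vectors(sentences):
    # Word-level pass: the whole article goes through pre-trained BERT,
    # so every token sees document-wide context.
    enc = tok(" ".join(sentences), return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = bert(**enc)
    # Merge the hidden vectors of the last four layers (summation assumed)
    # and pass them through a multi-layer perceptron.
    tokens = mlp(torch.stack(out.hidden_states[-4:]).sum(0).squeeze(0))
    # Average-pool token vectors per sentence for a preliminary
    # sentence representation (position 0 is [CLS], hence start = 1).
    vecs, start = [], 1
    for s in sentences:
        n = len(tok(s, add_special_tokens=False)["input_ids"])
        vecs.append(tokens[start:start + n].mean(0))
        start += n
    prelim = torch.stack(vecs).unsqueeze(0)     # (1, num_sent, 768)
    # Sentence-level pass injects sentence positions and context.
    return sent_encoder(prelim).squeeze(0)      # final sentence vectors h_i
```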
Step S103: each sentence vector is duplicated, and the two resulting vectors are tagged with copy and rewrite vector marks respectively:

After the sentence vector representation, each sentence vector is copied and two different mark vectors are added respectively: a copy vector h_c and a rewrite vector h_r.

The mark vector is a trainable parameter that helps the model distinguish the two different operations for each sentence. Each sentence now has two different versions of its vector. When the pointer network selects the copy version of a sentence, the sentence is added directly to the summary without any editing. Conversely, if the rewrite version is selected, the sentence is rewritten (compressed or paraphrased) to reduce redundancy.
Step S104: important sentences are extracted using a pointer network and a decision to copy or rewrite is made based on the vector tags.
Each sentence now has two different versions of its vector. The vectors from step S103 are selected using a pointer network, which uses an attention mechanism to select important sentences, so the pointer network has two choices when selecting each sentence. When the copy version of a sentence is selected, the sentence is added directly to the summary without any editing. Conversely, if the rewrite version is selected, the sentence is rewritten (compressed or paraphrased) to reduce redundancy. In this way the two action spaces are successfully merged into one, making the action space suitable for the present reinforcement learning.
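The following sketch shows such an attention-based pointer over the interleaved copy/rewrite candidates; the LSTM state transfer and the additive-attention form are common pointer-network choices assumed here, and masking of already-selected candidates is omitted for brevity:

```python
import torch
import torch.nn as nn

class SentencePointer(nn.Module):
    """Pointer that repeatedly selects one candidate vector; with the
    interleaving above, an even index means 'copy sentence i' and an
    odd index means 'rewrite sentence i'."""
    def __init__(self, dim=768):
        super().__init__()
        self.lstm = nn.LSTMCell(dim, dim)           # carries the pointer state
        self.W_h = nn.Linear(dim, dim, bias=False)  # projects candidates
        self.W_s = nn.Linear(dim, dim, bias=False)  # projects pointer state
        self.v = nn.Linear(dim, 1, bias=False)

    def forward(self, candidates, steps):           # candidates: (2*num_sent, dim)
        dim = candidates.size(1)
        state = (torch.zeros(1, dim), torch.zeros(1, dim))
        inp, picks = candidates.mean(0, keepdim=True), []
        for _ in range(steps):
            h, c = self.lstm(inp, state)
            state = (h, c)                          # state transfer between steps
            # Additive attention assigns each candidate a weight.
            scores = self.v(torch.tanh(self.W_h(candidates) + self.W_s(h)))
            probs = torch.softmax(scores.squeeze(-1), dim=-1)
            idx = torch.multinomial(probs, 1).item()
            picks.append((idx // 2, "copy" if idx % 2 == 0 else "rewrite"))
            inp = candidates[idx].unsqueeze(0)      # feed the pick back in
        return picks
```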
Step S105: editing and modifying the sentence needing to be rewritten by using a coder-decoder to make the sentence simple and smooth:
This step copies or rewrites the corresponding sentence according to the decision of step S104, generating the final summary. The copy operation preserves all information when the extracted sentence is already sufficiently concise, while the rewrite operation simplifies or paraphrases redundant sentences. The sentences that need rewriting are processed using an encoder-aligner-decoder network with a copy mechanism.
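The copy mechanism in such a network can be illustrated by a single decoding step that mixes generating a word from the vocabulary with copying a source token through the attention distribution; this is a sketch in the spirit of a pointer-generator, not the patent's exact network:

```python
import torch
import torch.nn.functional as F

def pointer_generator_step(vocab_logits, attn_weights, p_gen, src_token_ids):
    """One decoding step for a single example (all arguments are tensors).

    vocab_logits : (vocab_size,) decoder scores over the vocabulary
    attn_weights : (src_len,)    attention over source tokens (sums to 1)
    p_gen        : scalar in (0, 1), probability of generating vs copying
    src_token_ids: (src_len,)    vocabulary ids of the source tokens (long)
    """
    p_vocab = F.softmax(vocab_logits, dim=-1)
    final = p_gen * p_vocab
    # Scatter the copy probabilities onto the ids of the source tokens,
    # so words can be reproduced verbatim from the sentence being rewritten.
    final = final.scatter_add(0, src_token_ids, (1.0 - p_gen) * attn_weights)
    return final   # distribution over the vocabulary for the next word
```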
Step S106: training the pointer network and the pointer generation network adopted in the process by using layered reinforcement learning to finish gradient updating of all neural network parameters:
In the hierarchical reinforcement learning (HRL) method, the extraction module, i.e. the Pointer Network, is regarded as a manager operating at the sentence level, and the sentence editing module, i.e. the Pointer-Generator network, is regarded as a worker operating at the word level. The task handed down is the selected sentence together with its copy-or-rewrite decision. The worker's return is also considered when estimating the manager's return, which more accurately describes the impact of the manager's behavior on the summary. In each round, the manager performs parameter updates using the objective function

L(θ_m) = −(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) − b_t)·log p(a_t | c_t; θ_m)

where r(a_t) is the manager's (the extraction module's) own return and r_w(y_t) is the return of the worker (the editing module).
For the worker, we perform parameter updates using the following objective function:

L(θ_w) = −(r_w(y_t) − b_t)·log p(y_t; θ_w)

where b_t is the baseline function, which may be generated in any way.
The basic idea of the hybrid text abstract generation method provided by the embodiments of the present invention is as follows: as shown in fig. 4, a two-step method is used to construct the framework. Salient (important) sentences are first extracted from the input text, and a copy-or-rewrite mechanism distinguishes the sentences according to redundancy. The final summary is then generated by copying or rewriting the selected sentences accordingly. In addition, the embodiments of the present invention provide an end-to-end neural network training method based on hierarchical reinforcement learning, which connects the two independent steps of extraction and editing, enhances the collaboration between them, and lets them dynamically adapt to each other during training.
Another embodiment of the present invention provides a hybrid text summarization generating system, as shown in fig. 5, which may include: the sentence segmentation and word segmentation system comprises a sentence segmentation and word segmentation module, a sentence vector acquisition module, a vector marking module, a decision module, a text abstract generation module and an updating module.
Wherein:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
the sentence vector acquisition module is used for representing words and sentences in sequence according to the sentence segmentation result and the word segmentation result to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and the updating module trains the process to finish gradient updating of all parameters in the decision module and the text abstract generating module.
A third embodiment of the present invention provides a terminal, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and the processor, when executing the computer program, can be configured to perform the method of any one of the above embodiments.
Optionally, a memory is used to store the program. The memory may include volatile memory, such as random access memory (RAM), static random access memory (SRAM), or double data rate synchronous dynamic random access memory (DDR SDRAM); the memory may also include non-volatile memory, such as flash memory. The memories are used to store computer programs (e.g., applications and functional modules implementing the above methods), computer instructions, and the like, which may be stored in one or more memories in partitions. The above computer programs, computer instructions, data, and the like may be invoked by a processor.
A processor for executing the computer program stored in the memory to implement the steps of the method according to the above embodiments. Reference may be made in particular to the description relating to the preceding method embodiment.
The processor and the memory may be separate structures or may be an integrated structure integrated together. When the processor and the memory are separate structures, the memory and the processor may be coupled by a bus.
A fourth embodiment of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any one of the above-described embodiments of the invention.
The hybrid text abstract generation method, system, terminal and storage medium provided by the embodiments of the invention use a large-scale pre-trained language model for vector representation of the text; copy the sentence vectors and attach copy or rewrite vector marks to assist the copy-or-rewrite decision for each sentence; use a pointer network to extract the sentences most critical to the abstract and decide whether to copy or rewrite them; use a Pointer-Generator network to rewrite the sentences that need rewriting, making them concise and fluent; and construct hierarchical reinforcement learning training for the two networks, the Pointer Network and the Pointer-Generator network. The hybrid automatic text summarization method and terminal based on hierarchical reinforcement learning combine copy and rewrite operations in summary generation, retain important information to the greatest extent, avoid unnecessary grammatical errors, improve generation quality, and optimize the cooperation between the extraction and rewriting neural networks through hierarchical reinforcement learning.
In summary, the hybrid text summarization generation method, system, terminal and storage medium provided by the above embodiments of the present invention constitute a new hybrid summarization framework that, for the first time, mixes extracted sentences and rewritten sentences in one summary. The above embodiments design a copy-or-rewrite mechanism to distinguish sentences that can be used directly in the summary from sentences that need to be rewritten. In addition, the embodiments provide an end-to-end hierarchical reinforcement learning method for training the two-step extract-and-edit model, which hands the extracted sentences as tasks from the manager to the worker, greatly improving the collaboration between the Pointer Network as the extraction network and the Pointer-Generator network as the editing network.
Those skilled in the art will appreciate that the modules and methods and steps described herein can be implemented on any hardware basis, operating system, programming language, and deep learning framework. The carrier with which the solution is implemented depends on the specific application of the solution and design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application scenario or hardware carrier, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may implement the composition of the system by referring to the technical solution of the method, that is, the embodiment in the method may be understood as a preferred example for constructing the system, and will not be described herein again.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices provided by the present invention purely as computer-readable program code, the method steps can equally be implemented in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like, so that the system and its various devices realize the same functions. Therefore, the system and its various devices provided by the present invention can be regarded as hardware components, and the devices included therein for realizing various functions can also be regarded as structures within the hardware components; the means for realizing various functions can likewise be regarded as software modules implementing the method or as structures within the hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.
Claims (10)
1. A method for generating a hybrid text abstract, comprising:
performing sentence and word segmentation on an input text, and recording an index of a sentence where each word is located;
carrying out word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector;
copying each sentence vector, and adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks respectively;
extracting important sentence vectors, and making a decision of copying or rewriting according to the vector marks;
editing and modifying the sentences needing to be rewritten to obtain a text abstract;
the editing and modifying the sentence needing to be rewritten comprises the following steps:
acquiring corresponding sentence original text from the input text according to the index of the sentence in which each word is positioned;
encoding and decoding operation is carried out by using a pointer generation network, and rewriting of the text is realized;
training a neural network adopted in the process of extracting important sentence vectors and editing and modifying sentences to be rewritten to finish gradient updating of all neural network parameters;
the extracting of the important sentence vectors comprises:
using a pointer network to carry out context expression on the sentence vectors;
calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
and transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
2. The method of generating a hybrid text abstract of claim 1, wherein the segmenting the input text into sentences and words and recording an index of the sentence in which each word is located comprises:
for the input text, punctuation marks are used as sentence ending marks to divide sentences;
performing word segmentation on each sentence obtained by sentence segmentation;
and recording the position information of each word in the sentences obtained by word segmentation, wherein the position information indicates which sentence of the input text each word is in.
3. The method for generating a hybrid text abstract according to claim 1, wherein the characterizing the words and sentences in sequence on the sentence division and word division results to obtain a sentence vector comprises:
using a pre-training language model to express words obtained by word segmentation as vectors;
and averaging the word vectors of each sentence to obtain a sentence vector.
4. The hybrid text summarization generation method of claim 3 wherein the pre-trained language model employs a BERT model.
5. The hybrid text summarization generation method of claim 1 wherein the extracting of important sentence vectors and the editing and modifying of the sentences to be rewritten employ a pointer network and a pointer generation network, respectively; training the neural network adopted in the process by using layered reinforcement learning, wherein the training comprises the following steps:
the operation of two steps of extraction and rewriting are completed in sequence;
evaluating the extraction and rewriting results by using an automatic evaluation index;
and constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
6. The hybrid text summarization generation method of claim 5 wherein the step of constructing an objective function in return for the evaluation and performing a uniform gradient update on the parameters of the pointer network and the pointer generation network comprises:
the objective function L (θ) is constructed as:
L(θ) = −(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) − b_t)·log p(a_t | c_t; θ)

wherein a_t, c_t, r, R_t, y_t and b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t), the return of the pointer network, represents the current impact of behavior a_t on summary quality; R_t(a_{t+1}), the feedback function of behavior a_t, represents the long-term impact of behavior a_t on summary quality, and λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of r_w(y_t); the behavior function indicates the next behavior of the network, namely which sentence is extracted; the state function represents the current state of the model; the return function evaluates the value of the current behavior a_t; the feedback function evaluates the long-term effect of the current behavior a_t on the subsequent behavior of the model; the edited text composes the sentences of the output summary; the baseline evaluates the value of the current state;
iterating the objective function L (θ) until convergence.
7. The hybrid text summary generation method of claim 6, wherein the baseline is generated using a synchronous actor-critic (A2C) algorithm.
8. A hybrid text summarization system, comprising:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
a sentence vector acquisition module, which is used for representing words and sentences in sequence according to the results of the sentence segmentation and the word segmentation to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the extracting of the important sentence vectors comprises:
using a pointer network to carry out context expression on the sentence vectors;
calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
transferring state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
the editing and modifying the sentence needing to be rewritten comprises the following steps:
acquiring corresponding sentence original text from the input text according to the index of the sentence in which each word is positioned;
encoding and decoding operation is carried out by using a pointer generation network, and rewriting of the text is realized;
and the updating module finishes gradient updating of all parameters in the decision module and the text abstract generating module through training.
9. A terminal, comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the computer program, is operable to perform the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011429791.1A CN112464657B (en) | 2020-12-07 | 2020-12-07 | Hybrid text abstract generation method, system, terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011429791.1A CN112464657B (en) | 2020-12-07 | 2020-12-07 | Hybrid text abstract generation method, system, terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112464657A (en) | 2021-03-09
CN112464657B (en) | 2022-07-08
Family
ID=74800445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011429791.1A Active CN112464657B (en) | 2020-12-07 | 2020-12-07 | Hybrid text abstract generation method, system, terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112464657B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113127632B (en) * | 2021-05-17 | 2022-07-26 | 同济大学 | Text summarization method and device based on heterogeneous graph, storage medium and terminal |
CN113362858B (en) * | 2021-07-27 | 2023-10-31 | 中国平安人寿保险股份有限公司 | Voice emotion classification method, device, equipment and medium |
TWI847696B (en) * | 2023-05-15 | 2024-07-01 | 中國信託商業銀行股份有限公司 | Summary generation method based on prompt engineering and its computing device |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103617158A (en) * | 2013-12-17 | 2014-03-05 | 苏州大学张家港工业技术研究院 | Method for generating emotion abstract of dialogue text |
CN109471933A (en) * | 2018-10-11 | 2019-03-15 | 平安科技(深圳)有限公司 | A kind of generation method of text snippet, storage medium and server |
CN109657051A (en) * | 2018-11-30 | 2019-04-19 | 平安科技(深圳)有限公司 | Text snippet generation method, device, computer equipment and storage medium |
CN111723194A (en) * | 2019-03-18 | 2020-09-29 | 阿里巴巴集团控股有限公司 | Abstract generation method, device and equipment |
CN110348016A (en) * | 2019-07-15 | 2019-10-18 | 昆明理工大学 | Text snippet generation method based on sentence association attention mechanism |
CN110705313A (en) * | 2019-10-09 | 2020-01-17 | 沈阳航空航天大学 | Text abstract generation method based on feature extraction and semantic enhancement |
CN111177366A (en) * | 2019-12-30 | 2020-05-19 | 北京航空航天大学 | Method, device and system for automatically generating extraction type document abstract based on query mechanism |
CN111666764A (en) * | 2020-06-02 | 2020-09-15 | 南京优慧信安科技有限公司 | XLNET-based automatic summarization method and device |
CN114139497A (en) * | 2021-12-13 | 2022-03-04 | 国家电网有限公司大数据中心 | Text abstract extraction method based on BERTSUM model |
Also Published As
Publication number | Publication date |
---|---|
CN112464657A (en) | 2021-03-09 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |