CN112464657A - Hybrid text abstract generation method, system, terminal and storage medium - Google Patents

Hybrid text abstract generation method, system, terminal and storage medium

Info

Publication number
CN112464657A
CN112464657A
Authority
CN
China
Prior art keywords
sentence
vector
text
word
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011429791.1A
Other languages
Chinese (zh)
Other versions
CN112464657B (en)
Inventor
金耀辉
何浩
肖力强
陈文清
田济东
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202011429791.1A priority Critical patent/CN112464657B/en
Publication of CN112464657A publication Critical patent/CN112464657A/en
Application granted granted Critical
Publication of CN112464657B publication Critical patent/CN112464657B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F40/00 Handling natural language data
                    • G06F40/20 Natural language analysis
                        • G06F40/279 Recognition of textual entities
                            • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
                        • G06F40/205 Parsing
                            • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N3/00 Computing arrangements based on biological models
                    • G06N3/02 Neural networks
                        • G06N3/04 Architecture, e.g. interconnection topology
                            • G06N3/045 Combinations of networks
                        • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a hybrid text abstract generation method and system, which perform sentence segmentation and word segmentation on an input text and record the index of the sentence in which each word is located; characterize words and then sentences from the segmentation results to obtain sentence vectors; duplicate each sentence vector and tag the two copies with copy and rewrite marker vectors respectively; extract the important sentence vectors and decide whether to copy or rewrite each according to its vector mark; edit and modify the sentences that need rewriting to obtain the text abstract; and train the neural networks employed to complete gradient updates of their parameters. A corresponding terminal and a storage medium are also provided. The invention is the first to mix extracted sentences and abstracted sentences in one summary, distinguishing the sentences used directly in the summary from the rewritten ones through a copy-or-rewrite mechanism; the hierarchical reinforcement learning training method passes the extracted sentences as tasks from a manager to a worker, improving the collaboration between the two networks.

Description

Hybrid text abstract generation method, system, terminal and storage medium
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a hybrid text abstract generation method, system, terminal and storage medium based on hierarchical reinforcement learning.
Background
The goal of summarization is to rewrite a long document into a short, fluent version while preserving its most salient content. With the successful application of neural networks to Natural Language Processing (NLP) tasks, two data-driven branches, extractive and abstractive summarization, have stood out among the various approaches. Extractive methods generally select the most salient sentences of the source article as the summary; their content selection is accurate and the result is informative, but redundancy is high because the sentences are not rewritten. In contrast, abstractive methods can generate a more concise summary through compression and paraphrasing, but existing models are weak at content selection and easily lose key information. The two branches are thus complementary, which motivates combining their advantages to form a summary that is both informative and concise. Several known techniques merge the two branches, most using an extract-then-abstract framework that first extracts the sentences worth summarizing and then abstracts each one. However, because all sentences are indiscriminately compressed and pruned, these techniques suffer information loss during the abstraction phase: when an entire sentence is crucial, important content can be deleted by mistake, causing serious information loss. In addition, their training is not end-to-end, owing to the lack of an effective reinforcement learning framework connecting the two modules.
At present, no description or report of any technology similar to that of the present invention has been found, and no similar data has been collected at home or abroad.
Disclosure of Invention
In order to overcome the above defects of the prior art, the invention provides a hybrid text abstract generation method, system, terminal and storage medium based on hierarchical reinforcement learning.
The invention is realized by the following technical scheme.
According to an aspect of the present invention, there is provided a hybrid text summary generation method, including:
performing sentence and word segmentation on an input text, and recording an index of a sentence where each word is located;
carrying out word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector;
copying each sentence vector, and adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks respectively;
extracting important sentence vectors, and making a decision of copying or rewriting according to the vector marks;
editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and training the neural network adopted in the process of extracting important sentence vectors and editing and modifying the sentences to be rewritten to finish gradient updating of all neural network parameters.
Preferably, the sentence and word segmentation of the input text and recording the index of the sentence in which each word is located includes:
for the input text, punctuation marks are used as sentence ending marks to divide sentences;
performing word segmentation on each sentence obtained by sentence segmentation;
and recording the position information of each word obtained by word segmentation, wherein the position information indicates which sentence of the input text each word belongs to.
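As an illustrative sketch only (the claim above does not prescribe an implementation), the sentence segmentation, word segmentation and sentence-index recording could be realized as follows; the function name and the punctuation set are assumptions, not part of the patent:

```python
import re

def split_and_index(text):
    """Split text into sentences at sentence-ending punctuation, tokenize
    each sentence on whitespace, and record for every word the index of
    the sentence it belongs to (names here are illustrative)."""
    # Sentence-ending punctuation acts as the sentence boundary marker.
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    words, sent_index = [], []
    for i, sent in enumerate(sentences):
        for w in sent.split():        # whitespace-based word segmentation
            words.append(w)
            sent_index.append(i)      # which sentence this word came from
    return sentences, words, sent_index
```

For Chinese input, the whitespace-based word segmentation would presumably be replaced by a dedicated segmenter; the recorded indices are what later allow the original sentence text to be retrieved for rewriting.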
Preferably, the characterizing words and sentences in sequence on the result of the sentence segmentation and the word segmentation to obtain a sentence vector includes:
using a pre-training language model to express words obtained by word segmentation as vectors;
and averaging the word vectors of each sentence to obtain a sentence vector.
Preferably, the pre-training language model employs a BERT model.
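A minimal sketch of the word-to-sentence averaging described above, with stand-in word vectors in place of the pretrained BERT embeddings (obtaining real embeddings from a language model is outside this sketch):

```python
import numpy as np

def sentence_vectors(word_vecs, sent_index, num_sents):
    """Average the word vectors of each sentence to get one sentence
    vector. word_vecs stands in for pretrained (e.g. BERT) embeddings.
    Assumes every sentence contains at least one word."""
    dim = word_vecs.shape[1]
    sums = np.zeros((num_sents, dim))
    counts = np.zeros(num_sents)
    for vec, i in zip(word_vecs, sent_index):
        sums[i] += vec                # accumulate per-sentence sums
        counts[i] += 1
    return sums / counts[:, None]     # elementwise mean per sentence
```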
Preferably, the extracting of the important sentence vector includes:
using a pointer network to produce a contextual representation of the sentence vectors;
calculating the relations among the sentence vectors with an attention model, and computing the corresponding weight of each sentence vector;
and propagating the state variables of the pointer network, sequentially selecting several important sentence vectors according to the weight of each sentence vector.
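The attention-weighted selection above can be sketched as follows; a full pointer network would use learned attention parameters and carry its decoder state from one extraction step to the next, which this toy version omits:

```python
import numpy as np

def pointer_select(sent_vecs, query, k):
    """Score each sentence vector against a decoder query with dot-product
    attention, softmax the scores into weights, and pick the k
    highest-weight sentences, most important first."""
    scores = sent_vecs @ query                   # attention logits
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax weights
    order = np.argsort(-weights)                 # descending by weight
    return list(order[:k]), weights
```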
Preferably, the editing and modifying the sentence needing to be rewritten includes:
acquiring corresponding sentence original text from the input text according to the index of the sentence in which each word is positioned;
and encoding and decoding by using a pointer generation network to realize the rewriting of the text.
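The decoding step of a pointer generation network can be illustrated with the standard pointer-generator output mixture, in which a generation probability p_gen interpolates between the vocabulary distribution and attention-based copying from the source; this is the conventional formulation, not necessarily the patent's exact one:

```python
import numpy as np

def final_distribution(p_gen, p_vocab, attn, src_ids, vocab_size):
    """Pointer-generator mixture: with probability p_gen emit from the
    vocabulary distribution, otherwise copy a source word in proportion
    to the attention it received (illustrative sketch)."""
    dist = p_gen * p_vocab
    copy = np.zeros(vocab_size)
    for a, wid in zip(attn, src_ids):
        copy[wid] += a                # pool attention per source word id
    return dist + (1.0 - p_gen) * copy
```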
Preferably, the extraction of important sentence vectors and the editing and modification of sentences to be rewritten respectively employ a pointer network and a pointer generation network; the neural networks adopted in this process are trained using hierarchical reinforcement learning, the training comprising the following steps:
completing the operations of the two steps, extraction and rewriting, in sequence;
evaluating the extraction and rewriting results by using an automatic evaluation index;
and constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
Preferably, the constructing an objective function by using the evaluation as a return, and performing uniform gradient update on parameters of the pointer network and the pointer generation network includes:
the objective function L(θ) is constructed as:
L(θ) = -E[(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) - b_t) · log π(a_t | c_t)]
wherein a_t, c_t, r, R_t, y_t, b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t) is the return of the pointer network and represents the current influence of behavior a_t on summary quality; R_t(a_{t+1}) is the feedback function of behavior a_t and represents its long-term influence on summary quality; λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of that return; the behavior function indicates the next behavior of the network, i.e. which sentence to extract; the state function represents the current state of the model; the return function evaluates the current influence of behavior a_t on summary quality; the feedback function evaluates the long-term influence of behavior a_t on the subsequent behavior of the model; the edited text forms the sentences of the output abstract; and the baseline evaluates the value of the current state, reducing the fluctuation of the return function;
iterating the objective function L(θ) until convergence.
Preferably, the baseline is generated using the synchronous Advantage Actor-Critic (A2C) algorithm.
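A hedged sketch of how the return, feedback and baseline terms described above could be combined into a policy-gradient surrogate loss; `lam` and `beta` stand for the weighting coefficients λ and β, and all names and values are illustrative rather than taken from the patent:

```python
import numpy as np

def manager_loss(log_probs, r, R_next, r_w, b, lam=0.5, beta=0.5):
    """REINFORCE-style surrogate for the extractor ("manager"):
    advantage = own return + lam * long-term feedback
                + beta * rewriter return - baseline;
    loss = -advantage * log-prob of the chosen action, summed over steps."""
    adv = r + lam * R_next + beta * r_w - b
    return -np.sum(adv * log_probs)
```

In an actual training loop the log-probabilities would come from the pointer network's softmax over sentences, and the gradient of this scalar would drive the parameter update.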
According to another aspect of the present invention, there is provided a hybrid text summary generation system, including:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
a sentence vector acquisition module, which is used for representing words and sentences in sequence according to the results of the sentence segmentation and the word segmentation to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and an updating module, which trains the neural networks used by the decision module and the text abstract generation module to complete gradient updates of all their parameters.
According to a third aspect of the present invention, there is provided a terminal comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the computer program, is operable to perform any of the methods described above.
According to a fourth aspect of the invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any of the above.
Due to the adoption of the technical scheme, compared with the prior art, the invention has at least one of the following beneficial effects:
the hybrid text abstract generation method, the hybrid text abstract generation system, the hybrid text abstract generation terminal and the hybrid text abstract generation storage medium can flexibly switch between the copied sentences and the rewritten sentences according to the redundancy, so that the advantages of two branches of the abstract can be effectively combined, and both informativeness and simplicity are considered. In addition, based on layered reinforcement learning, an end-to-end reinforcement method is provided, an extraction module and a rewriting module are connected, the collaboration between the extraction module and the rewriting module is enhanced, and the extraction module and the rewriting module are dynamically adaptive to each other in the training process.
The invention provides a hybrid text abstract generating method, a system, a terminal and a storage medium, which adopt a two-step method to construct a framework: firstly, extracting a salient sentence from an input article, and distinguishing the sentence according to redundancy by using a copy rewriting decision mechanism; the final summary is then generated by copying or rewriting the selected sentence accordingly.
Compared with the existing common model information, the generated abstract is richer and the language is simpler.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flowchart illustrating a method for generating a hybrid text abstract according to an embodiment of the present invention;
FIG. 2 is a flow chart of a hybrid text summarization generation method according to a preferred embodiment of the present invention;
FIG. 3 is a flow chart of a hybrid text summarization generation method according to another preferred embodiment of the present invention;
FIG. 4 is a diagram illustrating the working process of the hybrid text summarization generation method according to a preferred embodiment of the present invention;
fig. 5 is a schematic diagram illustrating components of a hybrid text summarization system according to an embodiment of the present invention.
Detailed Description
The following examples illustrate the invention in detail: the embodiment is implemented on the premise of the technical scheme of the invention, and a detailed implementation mode and a specific operation process are given. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.
Fig. 1 is a flowchart of a hybrid text summarization generation method according to an embodiment of the present invention.
As shown in fig. 1, the hybrid text summarization generating method provided in this embodiment may include the following steps:
s100, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence where each word is located;
s200, performing word and sentence characterization on the sentence and word segmentation results in sequence to obtain a sentence vector;
s300, copying each sentence vector, and adding a copy vector and a rewrite vector to the original sentence vector and the copied sentence vector respectively to serve as vector marks;
s400, extracting important sentence vectors, and making a copy or rewrite decision according to the vector marks;
s500, editing and modifying the sentence needing to be rewritten to obtain a text abstract;
s600, training the neural network adopted in the process of extracting important sentence vectors and editing and modifying the sentences needing to be rewritten, and finishing gradient updating of all neural network parameters.
In S100 of this embodiment, performing sentence segmentation and word segmentation on the input text, and recording an index of a sentence in which each word is located preferably includes:
s101, for an input text, taking punctuation marks as sentence ending marks to perform sentence division;
s102, performing word segmentation on each sentence obtained by sentence segmentation;
and S103, recording the position information of each word obtained by word segmentation, wherein the position information indicates which sentence of the input text each word belongs to.
In S200 of this embodiment, performing word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector, preferably including:
s201, representing words obtained by word segmentation as vectors by using a pre-training language model;
s202, averaging the word vectors of each sentence to obtain a sentence vector.
In a specific application example of this embodiment, the pretrained language model preferably uses a BERT model.
In S400 of this embodiment, extracting important sentence vectors preferably includes:
s401, using a pointer network to represent the context of the sentence vectors;
s402, calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
and S403, transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
In S500 of this embodiment, editing and modifying the sentence to be rewritten preferably includes:
s501, acquiring corresponding sentence original texts from input texts according to the index of the sentence where each word is located;
and S502, encoding and decoding the text by using the pointer generation network to realize the rewriting of the text.
In S600 of this embodiment, extracting important sentence vectors and editing and modifying sentences to be rewritten respectively employ a pointer network and a pointer generation network; training these neural networks using hierarchical reinforcement learning preferably includes:
s601, finishing the operations of the two steps of extraction and rewriting in sequence;
s602, evaluating the extraction and rewriting results by using an automatic evaluation index;
s603, constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
In a specific application example of this embodiment, constructing an objective function with the evaluation as a reward, and performing uniform gradient update on parameters of the pointer network and the pointer generation network preferably includes:
s6031, constructing an objective function L (θ) as:
Figure BDA0002820234730000061
wherein, at、ct、r、Rt、yt、btRespectively a behavior function, a state function, a return function, a feedback function, an edited text and a reference; r (a)t) Representing behavior a for the return of the pointer networktCurrent impact on the quality of the summary, Rt(at+1) Is an action atIs a feedback function of (a) represents the behavior atThe long-term influence on the quality of the summary, λ is the weighting coefficient of the feedback function, rw(yt) Generating network returns for pointers, beta generating network returns r for pointersw(yt) The weighting coefficient of (2); behavioral function for indicatingThe next action of the network is to extract which sentence; the state function is used for representing the current state of the model; a reward function for evaluating the current behavior atCurrent impact on summary quality; feedback function for evaluating current behavior atLong term effects on the subsequent behavior of the model; the edited text is used for forming a sentence of the output abstract; the benchmark is used for evaluating the value of the current state, and the fluctuation of the return function can be reduced;
s6032, the objective function L (θ) is iterated until convergence.
In a specific example of this embodiment, the baseline is preferably generated using the synchronous Advantage Actor-Critic (A2C) algorithm.
Fig. 2 is a flowchart of a hybrid text summarization generation method according to a preferred embodiment of the present invention.
As shown in fig. 2, the hybrid text summary generation method provided by the preferred embodiment may include the following steps:
step 1, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence where each word is located;
step 2, performing word and sentence characterization on the sentence and word segmentation results in sequence to obtain a sentence vector;
step 3, copying each sentence vector, and adding a copy vector and a rewrite vector as marks to the two copied vectors respectively;
step 4, extracting important sentence vectors by using a Pointer Network (Pointer Network), and making a decision of copying or rewriting according to the vector marks;
step 5, editing and modifying the sentence needing to be rewritten by using a Pointer-generated network (Pointer-Generator) to obtain a text abstract;
and 6, training the pointer network and the pointer generation network used in the above process with hierarchical reinforcement learning to complete gradient updating of all neural network parameters.
As a preferred embodiment, in step 1, performing sentence segmentation and word segmentation on an input text, and recording an index of a sentence in which each word is located, includes:
step 1.1, for an input text, carrying out sentence division according to a sentence ending symbol;
step 1.2, for each sentence obtained by sentence segmentation, performing word segmentation based on a space;
and 1.3, recording the position information of each word obtained by word segmentation; the position information indicates which sentence of the input text each word belongs to.
As a preferred embodiment, in step 2, performing word and sentence characterization in sequence to obtain a sentence vector, including:
step 2.1, representing words obtained by word segmentation as vectors by using a pre-training language model;
and 2.2, averaging the word vectors of each sentence to obtain a sentence vector.
As a preferred embodiment, the pre-trained language model employs a BERT model.
As a preferred embodiment, in step 4, extracting important sentence vectors by using a Pointer Network (Pointer Network) includes:
step 4.1, using a Pointer Network (Pointer Network) to carry out context expression on the sentence vectors;
step 4.2, calculating the relation between each sentence vector by using an attention model, and calculating the weight which each sentence vector should obtain;
and 4.3, transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
In step 5, as a preferred embodiment, the editing and modifying the sentence to be rewritten using a Pointer-Generator network (Pointer-Generator) includes:
step 5.1, acquiring corresponding sentence original text from the input text according to the index of the sentence where each word is located;
and 5.2, encoding and decoding by using the pointer generation network to realize the rewriting of the text.
As a preferred embodiment, in step 6, the above process is trained using hierarchical reinforcement learning, which includes:
step 6.1, the operation of the two steps of extraction and rewriting is completed in sequence;
step 6.2, evaluating the extraction and rewriting results by using an automatic evaluation index (such as ROUGE);
and 6.3, constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the adopted pointer network and the pointer generation network.
As a preferred embodiment, in step 6.3, an objective function is constructed by taking the evaluation as a return, and the uniform gradient update is performed on the parameters of the adopted pointer network and the pointer generation network, including:
step 6.31, constructing the objective function L(θ) as:
L(θ) = -E[(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) - b_t) · log π(a_t | c_t)]
wherein a_t, c_t, r, R_t, y_t, b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text and the baseline; r(a_t) is the return of the pointer network and represents the current influence of behavior a_t on summary quality; R_t(a_{t+1}) is the feedback function of behavior a_t and represents its long-term influence on summary quality; λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer generation network, and β is the weighting coefficient of that return; the behavior function indicates the next behavior of the network, i.e. which sentence to extract; the state function represents the current state of the model; the return function evaluates the current influence of behavior a_t on summary quality; the feedback function evaluates the long-term influence of behavior a_t on the subsequent behavior of the model; the edited text forms the sentences of the output abstract; and the baseline evaluates the value of the current state, reducing the fluctuation of the return function;
step 6.32, iterating the objective function L(θ) until convergence.
As a preferred embodiment, the baseline is generated using the synchronous Advantage Actor-Critic (A2C) algorithm.
Fig. 3 is a flowchart of a hybrid text summarization generation method according to another preferred embodiment of the present invention.
As shown in fig. 3, the hybrid text summarization generating method provided by the preferred embodiment may include the following steps:
step S101: sentence and word segmentation are carried out on the input text, and the index of the sentence where each word is located is recorded:
firstly, for input text, a sentence is divided according to a sentence ending symbol. And segmenting words based on blank spaces for each sentence. In this process we record the position information of which sentence each word belongs to.
Step S102: using a hierarchical BERT representation mechanism, characterize words and then sentences from the sentence and word segmentation results to obtain sentence vectors:
In this embodiment a BERT model is used, and this step provides a hierarchical BERT representation method: sentence coding based on a two-level BERT network. The whole article is first fed into a pre-trained BERT model (i.e., a pre-trained language model) so that each word has broad context, which helps represent its meaning more accurately. Word representations are then obtained by merging the hidden vectors of the last four layers of the BERT network and passing them through a multilayer perceptron. This step injects the context and word positions of the whole article into the word vectors.
A preliminary representation of each sentence is obtained by average pooling over its word vectors.
Then, to embed sentence position information and sentence-level context into the representation, the preliminary sentence representations are further input into a single-layer BERT, yielding the final sentence vector h_i.
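The second-stage, sentence-level encoding can be caricatured as follows: add a position signal to each pooled sentence vector, then mix the vectors with one self-attention pass. This is a stand-in for the single-layer sentence BERT; a real implementation would use learned projections and embeddings:

```python
import numpy as np

def sentence_level_encode(sent_vecs):
    """Toy sentence-level contextualizer: inject a position signal and
    apply one unparameterised self-attention pass so every sentence
    vector sees every other sentence (illustrative only)."""
    n, d = sent_vecs.shape
    pos = np.arange(n)[:, None] / max(n - 1, 1)   # toy position signal
    x = sent_vecs + pos
    scores = x @ x.T / np.sqrt(d)                 # scaled dot-product
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)             # row-wise softmax
    return w @ x                                  # context-mixed vectors
```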
Step S103: duplicate each sentence vector and tag the two copies with copy and rewrite vector marks respectively:
After sentence vector representation, each sentence vector is copied and two different marker vectors are added separately, giving a copy version h_i^c and a rewrite version h_i^r:
h_i^c = h_i + e_c
h_i^r = h_i + e_r
where h_i^c is the copy version, h_i^r is the rewrite version, and e_c and e_r are the copy and rewrite marker vectors.
The marker vectors are trainable parameters that help the model distinguish the two different operations for each sentence. Each sentence now has two different versions of its vector. When the pointer network selects the copy version of a sentence, the sentence is added directly to the summary without any editing. Conversely, if the rewrite version is selected, the sentence is rewritten (compressed or paraphrased) to reduce redundancy.
Step S104: extract important sentences using a pointer network and make a copy-or-rewrite decision according to the vector marks:
Each sentence now has two candidate vectors. The pointer network, which selects important sentences with an attention mechanism, chooses among the vectors produced in step S103, so it has two choices for every sentence. When it selects the copy version h_i^c of a sentence, the sentence is added directly to the summary without any editing. Conversely, if the rewrite version h_i^r is selected, the sentence is rewritten (compressed or paraphrased) to reduce redundancy. In this way the two action spaces are successfully merged into one, making the action space suitable for the reinforcement learning used here.
Step S105: edit and modify the sentences that need rewriting using an encoder-decoder, making them concise and fluent:
In this step, each sentence is copied or rewritten according to the decision of step S104, generating the final summary. The copy operation retains all information when the extracted sentence is already sufficiently concise, while the rewrite operation simplifies or paraphrases redundant sentences. Sentences to be rewritten are processed with an encoder-aligner-decoder network equipped with a copy mechanism.
Step S106: train the pointer network and the pointer-generator network used above with hierarchical reinforcement learning, completing the gradient update of all neural network parameters:
in the hierarchical reinforcement learning (HRL) method, the extraction module, i.e., the pointer network (Pointer Network), is regarded as a manager operating at the sentence level, and the sentence-editing module, i.e., the pointer-generator network (Pointer-Generator), is regarded as a worker operating at the word level. The task is the selection of sentences and the decision to copy or rewrite. The worker's reward is also taken into account when estimating the manager's reward, which describes more accurately the impact of the manager's behavior on the summary. In each round, the manager performs its parameter update with the objective function

L(θ) = -E[(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) - b_t)·log π(a_t | c_t)]

where r(a_t) is the manager's (the extraction module's) own return and r_w(y_t) is the worker's (the editing module's) return.
For the worker, parameters are updated with the following objective function:

L_w(θ) = -E[(r_w(y_t) - b_t)·log p(y_t)]

where b_t is a baseline function that may be generated by any method.
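The two updates above can be illustrated numerically as follows. This is a sketch of the reward terms only; the function names and the weighting values lam and beta are hypothetical, and during training each returned scalar would multiply the gradient of the log-probability of the chosen action.

```python
def manager_reward(r_at, feedback, rw_yt, baseline, lam=0.5, beta=0.5):
    """Scalar weighting the manager's (pointer network's) policy
    gradient: its own return r(a_t), the long-term feedback R_t, and
    the worker's return r_w(y_t), minus a baseline b_t."""
    return r_at + lam * feedback + beta * rw_yt - baseline

def worker_reward(rw_yt, baseline):
    """Scalar weighting the worker's (pointer-generator's) policy
    gradient: its own return minus the baseline b_t."""
    return rw_yt - baseline

# example: each scalar stands in for a ROUGE-style evaluation score
adv_m = manager_reward(r_at=0.4, feedback=0.2, rw_yt=0.6, baseline=0.3)
adv_w = worker_reward(rw_yt=0.6, baseline=0.3)
print(round(adv_m, 10), round(adv_w, 10))  # 0.5 0.3
```

Folding the worker's return into the manager's reward is what couples the two levels: a sentence choice that leads to a poorly edited sentence lowers the manager's update signal as well.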
The basic idea of the hybrid text summary generation method provided by the above embodiment of the present invention is as follows: as shown in fig. 4, a two-step method is used to construct the framework. Salient (important) sentences are first extracted from the input text, and a copy-or-rewrite mechanism distinguishes the sentences according to their redundancy. The final summary is then generated by copying or rewriting the selected sentences accordingly. In addition, the embodiment of the invention provides an end-to-end neural network training method based on hierarchical reinforcement learning, which connects the two independent steps of extraction and editing, strengthens the collaboration between them, and lets the two steps adapt to each other dynamically during training.
Another embodiment of the present invention provides a hybrid text summary generation system, as shown in fig. 5, which may include: a sentence and word segmentation module, a sentence vector acquisition module, a vector marking module, a decision module, a text summary generation module, and an update module.
Wherein:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
a sentence vector acquisition module, which is used for representing words and sentences in sequence according to the results of the sentence segmentation and the word segmentation to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and the updating module trains the process to finish gradient updating of all parameters in the decision module and the text abstract generating module.
A third embodiment of the present invention provides a terminal, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor, when executing the computer program, is capable of performing the method of any one of the above embodiments.
Optionally, a memory for storing a program. The memory may include volatile memory, such as random-access memory (RAM), e.g., static RAM (SRAM) or double data rate synchronous dynamic RAM (DDR SDRAM); the memory may also include non-volatile memory, such as flash memory. The memory is used to store computer programs (e.g., applications and functional modules implementing the above methods), computer instructions, and the like, which may be stored in partitions in one or more memories. The computer programs, computer instructions, data, and the like may be invoked by a processor.
A processor for executing the computer program stored in the memory to implement the steps of the method according to the above embodiments. Reference may be made in particular to the description relating to the preceding method embodiment.
The processor and the memory may be separate structures or may be an integrated structure integrated together. When the processor and the memory are separate structures, the memory, the processor may be coupled by a bus.
A fourth embodiment of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any one of the above-described embodiments of the invention.
The hybrid text summary generation method, system, terminal, and storage medium provided by the embodiments of the present invention use a large-scale pre-trained language model to obtain vector representations of the text; copy each sentence vector and mark it with a copy or rewrite tag to assist the copy-or-rewrite decision; extract the sentences most critical to the summary with a pointer network and decide whether to copy or rewrite them; rewrite the sentences to be rewritten with a pointer-generator network (Pointer-Generator) to make them concise and fluent; and construct hierarchical reinforcement training for the two networks, the pointer network (Pointer Network) and the pointer-generator network (Pointer-Generator). The hybrid automatic text summarization method and terminal based on hierarchical reinforcement learning provided by the embodiments of the present invention combine the two summary-generation operations of copying and rewriting, retain important information to the greatest extent, avoid unnecessary syntax errors, improve generation quality, and optimize the cooperation between the extraction and rewriting networks through hierarchical reinforcement learning.
In summary, the hybrid text summary generation method, system, terminal, and storage medium provided by the above embodiments of the present invention constitute a new hybrid summarization framework in which extracted sentences and rewritten sentences are mixed in the summary for the first time. The above embodiments design a copy-or-rewrite mechanism to distinguish sentences that can be used directly in the summary from sentences that need to be rewritten. In addition, the embodiments provide an end-to-end hierarchical reinforcement learning method for training the extract-then-edit two-step model, which passes the extracted sentences as tasks from the manager to the worker, greatly improving the collaboration between the pointer network (Pointer Network) serving as the extraction network and the pointer-generator network (Pointer-Generator) serving as the editing network.
Those skilled in the art will appreciate that the modules and methods and steps described herein can be implemented on any hardware basis, operating system, programming language, and deep learning framework. The carrier with which the solution is implemented depends on the specific application of the solution and design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application scenario or hardware carrier, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may implement the composition of the system by referring to the technical solution of the method, that is, the embodiment in the method may be understood as a preferred example for constructing the system, and will not be described herein again.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices provided by the present invention in purely computer-readable program code, the same functions can be realized by implementing the system and its various devices in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its various devices provided by the present invention may be regarded as hardware components, and the devices included within them for realizing the various functions may be regarded as structures within the hardware components; means for performing the functions may also be regarded both as software modules implementing the method and as structures within the hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (12)

1. A hybrid text summary generation method, comprising:
performing sentence and word segmentation on an input text, and recording an index of a sentence where each word is located;
carrying out word and sentence characterization on the sentence segmentation and the word segmentation result in sequence to obtain a sentence vector;
copying each sentence vector, and adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks respectively;
extracting important sentence vectors, and making a decision of copying or rewriting according to the vector marks;
editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and training the neural network adopted in the process of extracting important sentence vectors and editing and modifying the sentences to be rewritten to finish gradient updating of all neural network parameters.
2. The hybrid text summarization generation method of claim 1, wherein the segmenting the input text into sentences and words and recording an index of the sentence in which each word is located comprises:
for the input text, punctuation marks are used as sentence ending marks to divide sentences;
performing word segmentation on each sentence obtained by sentence segmentation;
and recording the position information of each word obtained by word segmentation, wherein the position information is used for indicating which sentence of the input text each word belongs to.
3. The method of generating a hybrid text summary according to claim 1, wherein the characterizing the words and sentences in sequence on the result of the sentence segmentation and the word segmentation to obtain a sentence vector comprises:
using a pre-training language model to express words obtained by word segmentation as vectors;
and averaging the word vectors of each sentence to obtain a sentence vector.
4. The hybrid text summarization generation method of claim 3 wherein the pre-trained language model employs a BERT model.
5. The hybrid text summarization generation method of claim 1 wherein the extracting important sentence vectors comprises:
using a pointer network to carry out context expression on the sentence vectors;
calculating the relation between each sentence vector by using an attention model, and calculating the corresponding weight of each sentence vector;
and transferring the state variables of the pointer network, and sequentially selecting a plurality of important sentence vectors according to the weight of each sentence vector.
6. The hybrid text summarization generation method of claim 1 wherein the editing and modifying the sentence that needs to be rewritten comprises:
acquiring corresponding sentence original text from the input text according to the index of the sentence in which each word is positioned;
and encoding and decoding by using a pointer generation network to realize the rewriting of the text.
7. The hybrid text summary generation method of claim 1, wherein the extraction of important sentence vectors and the editing and modification of the sentences to be rewritten employ a pointer network and a pointer-generator network, respectively; and the neural networks adopted in the process are trained using hierarchical reinforcement learning, comprising:
the operation of two steps of extraction and rewriting are completed in sequence;
evaluating the extraction and rewriting results by using an automatic evaluation index;
and constructing a target function by taking the evaluation as a return, and performing uniform gradient updating on parameters of the pointer network and the pointer generation network.
8. The hybrid text summarization generation method of claim 7, wherein the step of constructing an objective function using the evaluation as a reward to perform uniform gradient update on parameters of the pointer network and the pointer generation network comprises:
the objective function L(θ) is constructed as:

L(θ) = -E[(r(a_t) + λ·R_t(a_{t+1}) + β·r_w(y_t) - b_t)·log π(a_t | c_t)]

wherein a_t, c_t, r, R_t, y_t, and b_t are respectively the behavior function, the state function, the return function, the feedback function, the edited text, and the baseline; r(a_t), the return of the pointer network, represents the current impact of behavior a_t on summary quality; R_t(a_{t+1}), the feedback function of action a_t, represents the long-term influence of behavior a_t on summary quality; λ is the weighting coefficient of the feedback function; r_w(y_t) is the return of the pointer-generator network, and β is the weighting coefficient of the pointer-generator network return r_w(y_t); the behavior function indicates the next behavior of the network, namely which sentence is extracted; the state function represents the current state of the model; the return function evaluates the value of the current behavior a_t; the feedback function evaluates the long-term effect of the current behavior a_t on the subsequent behavior of the model; the edited text is used to form the sentences of the output summary; and the baseline is used to evaluate the value of the current state;

iterating the objective function L(θ) until convergence.
9. The hybrid text summary generation method of claim 8, wherein the baseline is generated using a synchronous advantage actor-critic (A2C) algorithm.
10. A hybrid text summarization system, comprising:
the sentence and word segmentation module is used for segmenting sentences and words of the input text and recording the index of the sentence where each word is located;
a sentence vector acquisition module, which is used for representing words and sentences in sequence according to the results of the sentence segmentation and the word segmentation to obtain a sentence vector;
the vector marking module is used for copying each sentence vector and respectively adding copy and rewrite vectors to the original sentence vector and the copied sentence vector as vector marks;
a decision module which extracts important sentence vectors and makes a decision of copying or rewriting according to the vector marks;
the text abstract generating module is used for editing and modifying the sentences needing to be rewritten to obtain a text abstract;
and the updating module trains the process to finish gradient updating of all parameters in the decision module and the text abstract generating module.
11. A terminal comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the computer program, when executed by the processor, is operable to perform the method of any of claims 1 to 9.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1-9.
CN202011429791.1A 2020-12-07 2020-12-07 Hybrid text abstract generation method, system, terminal and storage medium Active CN112464657B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011429791.1A CN112464657B (en) 2020-12-07 2020-12-07 Hybrid text abstract generation method, system, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011429791.1A CN112464657B (en) 2020-12-07 2020-12-07 Hybrid text abstract generation method, system, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN112464657A true CN112464657A (en) 2021-03-09
CN112464657B CN112464657B (en) 2022-07-08

Family

ID=74800445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011429791.1A Active CN112464657B (en) 2020-12-07 2020-12-07 Hybrid text abstract generation method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112464657B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113127632A (en) * 2021-05-17 2021-07-16 同济大学 Text summarization method and device based on heterogeneous graph, storage medium and terminal
CN113362858A (en) * 2021-07-27 2021-09-07 中国平安人寿保险股份有限公司 Voice emotion classification method, device, equipment and medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103617158A (en) * 2013-12-17 2014-03-05 苏州大学张家港工业技术研究院 Method for generating emotion abstract of dialogue text
CN109471933A (en) * 2018-10-11 2019-03-15 平安科技(深圳)有限公司 A kind of generation method of text snippet, storage medium and server
CN109657051A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 Text snippet generation method, device, computer equipment and storage medium
CN110348016A (en) * 2019-07-15 2019-10-18 昆明理工大学 Text snippet generation method based on sentence association attention mechanism
CN110705313A (en) * 2019-10-09 2020-01-17 沈阳航空航天大学 Text abstract generation method based on feature extraction and semantic enhancement
CN111177366A (en) * 2019-12-30 2020-05-19 北京航空航天大学 Method, device and system for automatically generating extraction type document abstract based on query mechanism
CN111666764A (en) * 2020-06-02 2020-09-15 南京优慧信安科技有限公司 XLNET-based automatic summarization method and device
CN111723194A (en) * 2019-03-18 2020-09-29 阿里巴巴集团控股有限公司 Abstract generation method, device and equipment
CN114139497A (en) * 2021-12-13 2022-03-04 国家电网有限公司大数据中心 Text abstract extraction method based on BERTSUM model



Also Published As

Publication number Publication date
CN112464657B (en) 2022-07-08

Similar Documents

Publication Publication Date Title
Huang et al. Gamepad: A learning environment for theorem proving
CN109597891B (en) Text emotion analysis method based on bidirectional long-and-short-term memory neural network
CN1983266B (en) File system storing transaction records in flash-like media
CN112464657B (en) Hybrid text abstract generation method, system, terminal and storage medium
CN106951512A (en) A kind of end-to-end session control method based on hybrid coding network
CN107506414A (en) A kind of code based on shot and long term memory network recommends method
CN112464658B (en) Text abstract generation method, system, terminal and medium based on sentence fusion
CN110019471A (en) Text is generated from structural data
CN111047482A (en) Knowledge tracking system and method based on hierarchical memory network
CN113127604B (en) Comment text-based fine-grained item recommendation method and system
Purnell et al. Old English vowels: Diachrony, privativity, and phonological representations
CN110110331A (en) Document creation method, device, medium and calculating equipment
CN115934147A (en) Automatic software restoration method and system, electronic equipment and storage medium
CN117390336A (en) Webpage process automation method, device, equipment and storage medium
CN116992942B (en) Natural language model optimization method, device, natural language model, equipment and medium
CN113806489A (en) Method, electronic device and computer program product for dataset creation
CN110276081A (en) Document creation method, device and storage medium
CN111737417B (en) Method and device for correcting natural language generated result
CN109614457B (en) Deep learning-based geographic information identification method and device
CN114358021B (en) Task type dialogue statement reply generation method based on deep learning and storage medium
CN116502648A (en) Machine reading understanding semantic reasoning method based on multi-hop reasoning
CN115270795A (en) Small sample learning-based named entity recognition technology in environmental assessment field
CN112069777B (en) Two-stage data-to-text generation method based on skeleton
Guillon et al. Two-Way Automata and One-Tape Machines: Read Only Versus Linear Time
CN110532391A (en) A kind of method and device of text part-of-speech tagging

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant