CN113642324B - Text abstract generation method and device, electronic equipment and storage medium - Google Patents

Text abstract generation method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN113642324B
CN113642324B (application CN202110963382.8A)
Authority
CN
China
Prior art keywords
text
abstract
sample
model
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110963382.8A
Other languages
Chinese (zh)
Other versions
CN113642324A (en)
Inventor
念天磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110963382.8A priority Critical patent/CN113642324B/en
Publication of CN113642324A publication Critical patent/CN113642324A/en
Application granted granted Critical
Publication of CN113642324B publication Critical patent/CN113642324B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a text abstract generation method and device, an electronic device, and a storage medium, and relates to the field of computer technology, in particular to the field of natural language processing. The specific implementation scheme is as follows: the target text is input into a text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, and the predicted word of the current round is selected from the target word set; and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text. The number of words in the target word set is dynamically adjusted with the probability distribution of the candidate words, and the predicted words are selected from the target word set, which improves the diversity of the abstract texts.

Description

Text abstract generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular, to the field of natural language processing technology.
Background
Text abstract generation is an important research field in natural language processing. It can convert a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news abstracts, and landing page abstracts.
Disclosure of Invention
The disclosure provides a text abstract generation method, apparatus, device, and storage medium.
According to an aspect of the present disclosure, there is provided a text abstract generation method, including: inputting a target text into a text abstract generation model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
and in response to the selected predicted word being an end identifier, splicing the predicted words of each round in turn to obtain the abstract text of the target text.
According to another aspect of the present disclosure, there is provided a text summary generation model training method including:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model.
According to another aspect of the present disclosure, there is provided a text digest generating apparatus including:
the input module is used for inputting the target text into the text abstract generation model;
the prediction module is used for selecting, for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set in descending order of the prediction probability of each candidate word, and selecting the predicted word of the current round from the target word set;
and the splicing module is used for splicing, in response to the selected predicted word being an end identifier, the predicted words of each round in turn to obtain the abstract text of the target text.
According to another aspect of the present disclosure, there is provided a text summary generation model training apparatus, including:
the first acquisition module is used for acquiring an initial model;
the second acquisition module is used for acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
the iterative training module is used for sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model and iteratively training the initial model;
and the judging module is used for judging whether the termination condition of model training is met, and if so, determining the current model as a text abstract generating model.
According to still another aspect of the present disclosure, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text digest generation method and/or a text digest generation model training method.
According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform a text digest generation method and/or a text digest generation model training method.
According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a text digest generation method and/or a text digest generation model training method.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart of a text summary generation method according to an embodiment of the disclosure;
FIG. 2 is a schematic flow chart of a training method for a text summary generation model according to an embodiment of the disclosure;
FIG. 3 is a block diagram of an apparatus for implementing a text excerpt generation method of an embodiment of the present disclosure;
FIG. 4 is a block diagram of an apparatus for implementing a text summarization model training method of an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing a text excerpt generation method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Text abstract generation is an important research field in natural language processing. It can convert a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news abstracts, and landing page abstracts.
In the related art, a longer text is input into a text abstract generation model, which computes the prediction result of each round of word prediction; a decoding module then selects a word with a relatively high probability from the prediction result as the predicted word, and the text abstract is obtained by splicing the predicted words of each round.
Decoding can be understood as selecting one word from a plurality of predicted words as the output word of the current round. The decoding algorithm of the decoding module directly affects the final output text digest; common decoding algorithms include greedy search, beam search, and top-k sampling.
The greedy search decoding algorithm selects the word with the highest conditional probability as the current output each time. The beam search decoding algorithm keeps several highest-probability paths during decoding and selects the single highest-probability path as the current output. The top-k sampling decoding algorithm takes the k words with the highest conditional probability, reforms them into a new probability distribution, and samples from that distribution as the current output.
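For illustration only, the following Python sketch (not part of the disclosure; the function names and the probability-vector representation are assumptions) shows roughly how greedy selection and top-k sampling operate on one round's prediction probabilities:

import numpy as np

def greedy_pick(probs):
    """Greedy search: always take the single most probable word index."""
    return int(np.argmax(probs))

def topk_sample(probs, k, rng=np.random.default_rng()):
    """Top-k sampling: keep the k most probable words, renormalize, sample."""
    top_idx = np.argsort(probs)[::-1][:k]           # indices of the k largest probabilities
    top_p = probs[top_idx] / probs[top_idx].sum()   # reform into a new probability distribution
    return int(rng.choice(top_idx, p=top_p))

probs = np.array([0.5, 0.3, 0.12, 0.05, 0.03])
print(greedy_pick(probs))        # always index 0
print(topk_sample(probs, k=3))   # index 0, 1, or 2, sampled by renormalized probability

Beam search is omitted from the sketch; it additionally keeps several partial sequences alive across rounds rather than a single word per round.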
With the above decoding techniques, the diversity of text abstracts generated by the first two decoding algorithms is insufficient. The top-k sampling decoding algorithm is relatively rigid: it is difficult to find a k value suitable for every scenario; a k value that is too large easily introduces low-probability words and makes the text less fluent, while a k value that is too small loses usable words. Moreover, the probability distribution at the decoding end changes with the input text, so it is difficult to determine a universally suitable k value.
For the text summary generation field, one typical application scenario is summary generation of landing pages. The landing page can be in various forms such as an advertisement landing page, an enterprise popularization landing page and the like.
Abstract extraction is generally performed on the text in a landing page, and the resulting abstract text is used as the title of the landing page's click-through link to attract users to click.
However, in the related art, after model training is completed, the number of abstract text candidates generated for a landing page is small and their phrasing style is uniform, which is not conducive to selecting diverse landing page titles.
To improve the diversity of abstract texts, the disclosure provides an improved text abstract generation method, device, electronic device, and storage medium.
In one embodiment of the present disclosure, a text summary generating method is provided, the method including:
inputting the target text into a text abstract generating model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
and when the selected predicted word is an end identifier, splicing the predicted words of each round in sequence to obtain the abstract text of the target text.
In this way, the number of words contained in the target word set is dynamically adjusted with the probability distribution of the candidate words, and in each round of word prediction one predicted word is selected from the several words contained in the target word set, so the diversity of the abstract text is ensured.
Moreover, the target word set is a small core subset composed of the higher-probability words; since predicted words are selected from this core subset, low-probability words are never selected, so the abstract text does not become incoherent. In addition, since words are selected in descending order of prediction probability, usable high-probability words are not lost.
The text abstract generating method, the device, the electronic equipment and the storage medium provided by the embodiment of the disclosure are respectively described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a text summary generating method according to an embodiment of the disclosure, as shown in fig. 1, the method may include the following steps:
s101: the target text is input into a text abstract generation model.
In the embodiment of the disclosure, the target text is text that needs summary generation, such as text in a landing page.
The text abstract generation model may be an autoregressive language model capable of predicting the next likely word based on the preceding text. The text abstract generation model comprises a prediction module and a decoding module: the prediction module outputs a candidate word set in each round of word prediction, and the decoding module selects one word from the candidate word set based on a decoding algorithm as the predicted word of the current round.
S102: for the candidate word set output by the text abstract generation model in each round of word prediction, select from the candidate word set, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set.
Here, a word may be a single character or a multi-character word.
In the embodiment of the disclosure, for a candidate word set output by a text abstract generation model in each round of word prediction, a target word set with the accumulated probability exceeding a preset threshold value is selected from the candidate word sets according to the descending order of the prediction probability of each candidate word.
As an example, suppose the preset threshold is 0.9. In a certain round of word prediction, the candidate word set is {A, B, C, D, E, ...} with prediction probabilities {0.5, 0.3, 0.12, 0.03, 0.012, ...}. Since 0.5+0.3 < 0.9 and 0.5+0.3+0.12 > 0.9, the target word set whose cumulative probability exceeds 0.9 is {A, B, C}, so the target word set in this round contains 3 words.
As another example, with the same preset threshold of 0.9, suppose the candidate word set in some round is {A, B, C, D, E, ...} with prediction probabilities {0.4, 0.3, 0.12, 0.09, 0.05, ...}. Since 0.4+0.3+0.12 < 0.9 and 0.4+0.3+0.12+0.09 > 0.9, the target word set selected in descending order of prediction probability is {A, B, C, D}, so the target word set in this round contains 4 words.
Therefore, the number of words in the target word set can be dynamically adjusted along with the probability distribution of the candidate words, and in each round of word prediction process, one predicted word is selected from a plurality of words contained in the target word set, so that the diversity of abstract texts can be ensured.
When the predicted word of the current round is selected from the target word set, it can be selected randomly, or it can be sampled according to the prediction probabilities.
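A minimal sketch of this selection step, assuming the candidate set of one round is given as a NumPy probability vector (the function name, the default threshold, and the sampling choice are illustrative assumptions, not part of the disclosure):

import numpy as np

def nucleus_pick(probs, threshold=0.9, rng=np.random.default_rng()):
    """Select the predicted word of the current round from a dynamically sized target word set."""
    order = np.argsort(probs)[::-1]                    # candidate indices in descending prediction probability
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, threshold)) + 1  # smallest prefix whose cumulative probability exceeds the threshold
    target_idx = order[:cutoff]                        # the target word set of this round
    target_p = probs[target_idx] / probs[target_idx].sum()
    return int(rng.choice(target_idx, p=target_p))     # sample according to the prediction probabilities

# First example from the text: threshold 0.9, probabilities {0.5, 0.3, 0.12, 0.03, 0.012}
probs = np.array([0.5, 0.3, 0.12, 0.03, 0.012])
# 0.5+0.3 < 0.9 and 0.5+0.3+0.12 > 0.9, so the target word set is the first three candidates
print(nucleus_pick(probs))

Unlike a fixed k, the size of target_idx changes from round to round with the shape of the probability distribution.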
S103: in response to the selected predicted word being an end identifier, splice the predicted words of each round in turn to obtain the abstract text of the target text.
In an embodiment of the present disclosure, the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
That is, in each round of word prediction, the previously predicted words serve as the preceding context, and the next word is predicted in combination with the target text.
When the predicted word selected in some round of word prediction is an end identifier, the end identifier indicates that the abstract text is complete, and the concatenation of the predicted words of all rounds is the abstract text of the target text.
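Putting S101 through S103 together, a hedged sketch of the overall generation loop might look as follows; the model interface (predict_candidates) and the end identifier token are assumptions, and nucleus_pick refers to the sketch above:

END_ID = "<eos>"  # assumed end identifier

def generate_abstract(model, target_text, threshold=0.9, max_rounds=64):
    predicted = []                                      # predicted words of each round, in order
    for _ in range(max_rounds):
        # each round conditions on the target text and the words selected in previous rounds
        words, probs = model.predict_candidates(target_text, predicted)
        word = words[nucleus_pick(probs, threshold)]    # decoding step from the sketch above
        if word == END_ID:                              # end identifier: the abstract is finished
            break
        predicted.append(word)
    return "".join(predicted)                           # splice the predicted words of each round in turn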
In this way, the number of words contained in the target word set is dynamically adjusted with the probability distribution of the candidate words, and in each round of word prediction one predicted word is selected from the several words contained in the target word set, so the diversity of the abstract text is ensured.
Moreover, the target word set is a small core subset composed of the higher-probability words; since predicted words are selected from this core subset, low-probability words are never selected, so the abstract text does not become incoherent. In addition, since words are selected in descending order of prediction probability, usable high-probability words are not lost.
The text abstract generation method overcomes the poor diversity of traditional decoding strategies, can be applied to an existing model without retraining, and is suitable for all abstract text generation tasks.
If the text abstract generation method provided by the embodiment of the disclosure is applied to abstract generation for landing pages, the target text is the text in the landing page, and the abstract text generated from it has better diversity; it can serve as the title of the landing page's click-through link and better attract users to click.
In one embodiment of the present disclosure, to further increase the diversity of the generated text abstracts, the text abstract generation model may be trained on multiple groups of sample text sets, each containing one sample target text and a plurality of sample abstract texts of different text styles.
Specifically, referring to fig. 2, fig. 2 is a schematic flow chart of a training method for a text summarization model according to an embodiment of the present disclosure, where the training method may include the following steps:
s201: an initial model is obtained.
In the embodiment of the present disclosure, an RNN (recurrent neural network) model, an encoder-decoder model, or the like may be selected as the initial model.
S202: acquiring a plurality of groups of sample text sets; wherein each sample text set contains one sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text.
In the embodiment of the disclosure, the sample abstract texts of multiple different text styles corresponding to a sample target text can be acquired as follows: acquire the sample target text and an initial abstract text corresponding to it; perform text style rewriting on the initial abstract text to obtain multiple rewritten abstract texts; and determine the initial abstract text and the rewritten abstract texts as the sample abstract texts of multiple different text styles corresponding to the sample target text.
The sample target text may be text in a landing page, and the initial summary text may be a title of the landing page.
Text style can also be understood as language organization style, or phrasing style, such as a succinct style, a rich style, a colloquial style, and so on.
To obtain abstract texts with different text styles as training samples for the model, the initial abstract text can be rewritten into different text styles based on a text rewriting algorithm or model; manual rewriting can also be used, taking care during rewriting to avoid semantic duplication among the results.
As an example, the sample target text is an introduction to the product "XX" in a landing page, and the initial abstract text is the link title of the landing page, for instance "Professional production of various types of XX". To enrich the phrasing styles of the landing page link title, the initial abstract text can be rewritten into different styles, such as "Professional XX solutions", "Professionally produced XX, factory-direct, inquiries welcome", or "High-quality XX at a more economical price". It can be seen that the rewritten abstract texts cover multiple text styles.
In the embodiment of the disclosure, the initial abstract text and the rewritten abstract text are used as sample abstract text of sample target text, that is, one sample target text corresponds to a plurality of sample abstract texts with different text styles.
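As a hedged illustration of how one sample text set could be assembled (the function name and the dictionary layout are assumptions; the style-rewriting callables stand in for whatever rewriting algorithm, model, or manual process is used):

def build_sample_set(sample_target_text, initial_abstract, rewrite_fns):
    """rewrite_fns: assumed style-rewriting callables, one per desired text style."""
    sample_abstracts = [initial_abstract]
    for rewrite in rewrite_fns:
        rewritten = rewrite(initial_abstract)       # text style rewriting
        if rewritten not in sample_abstracts:       # drop verbatim duplicates (semantic de-duplication would need a stronger check)
            sample_abstracts.append(rewritten)
    # one sample target text corresponds to several sample abstract texts of different styles
    return {"target_text": sample_target_text, "abstract_texts": sample_abstracts}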
S203: and sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model.
In the embodiment of the disclosure, in each round of iterative training, a sample target text and a sample abstract text corresponding to the sample target text are used as input to train an initial model.
Specifically, in each iteration training, a sample target text is input into an initial model, and model parameters are adjusted based on the difference between the output abstract text and the sample abstract text.
The process of training a model on a single sample target text and a single sample abstract text can be found in the related art.
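A framework-agnostic sketch of this iterative training, under the assumption that the initial model exposes a train_step method returning the loss of one (target text, abstract text) pair (the method name, iteration budget, and loss threshold are illustrative assumptions):

def train_summary_model(model, sample_sets, max_iterations=100000, loss_threshold=0.01):
    iteration = 0
    while iteration < max_iterations:                        # termination: iteration budget reached ...
        for sample in sample_sets:
            for abstract in sample["abstract_texts"]:        # each sample abstract text in turn forms one training sample
                loss = model.train_step(sample["target_text"], abstract)
                iteration += 1
                if loss < loss_threshold or iteration >= max_iterations:
                    return model                             # ... or loss below a preset threshold
    return model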
S204: determine whether a termination condition of model training is satisfied; if so, determine the current model as the text abstract generation model.
The termination condition for model training may be that the number of iterations reaches a preset number of iterations, or that the loss function value of the model is less than a preset threshold.
Because one sample target text corresponds to a plurality of sample abstract texts with different text styles, the text abstract generation model after training can learn different text styles after repeated iterative training, and the diversity of the output text abstract is further improved.
That is, after training is completed, when the target text is input into the text abstract generation model, the prediction module in the model can output, in each round of word prediction, a plurality of candidate words with a relatively flat probability distribution.
It should further be noted that in the present disclosure the decoding algorithm is optimized on one hand and the training data of the model is optimized on the other; the two aspects are organically combined and complement each other.
Specifically, the initial abstract text is rewritten in text style, so that a plurality of sample abstract texts with different text styles are obtained and used for training a model. The trained model can learn different text styles, so that a prediction module in the model can output a plurality of candidate words which are distributed flatly in each round of word prediction process, and the plurality of candidate words can belong to different text styles.
In each round of decoding process, a target word set with the accumulated probability exceeding a preset threshold value is determined, wherein the target word set is a small core subset composed of words with higher probability and different text styles, and then a predicted word is selected from the small core subset. It can be seen that if the target text is input into the digest generation model a plurality of times, a plurality of digest texts of different text styles can be generated.
As an example, if the training data of the model is not optimized, that is, if sample data of a single text style is used, the model learns only that single style and its output is relatively uniform: the candidate words output by the prediction module tend to follow a peaked distribution, for example with probabilities {0.7, 0.1, 0.08, ...}. In this case only the word with probability 0.7 is selected; the generated abstract text is coherent, but lacks diversity.
In the present disclosure, the text abstract generation model is trained with sample abstract texts of various text styles, so the model learns different text styles and the candidate words output by its prediction module are rich, i.e., their distribution is relatively flat, for example with probabilities {0.35, 0.3, 0.28, ...}. The decoding algorithm provided by the disclosure is then used to select the predicted word from these candidates. For example, with a preset probability threshold of 0.9, since 0.35+0.3 < 0.9 and 0.35+0.3+0.28 > 0.9, the first three candidate words form the target word set, and one of them is selected as the predicted word. This ensures the diversity of the generated text, and the generated abstract text remains coherent no matter which predicted word is selected.
Therefore, in the present disclosure, combining the optimizations of these two aspects achieves an effect where one plus one is greater than two, and the diversity of the generated abstract text is remarkably improved.
According to experimental measurement, for the same target text, the beam search decoding algorithm generates 1 abstract text candidate with 95% fluency; the top-k sampling decoding algorithm generates 6 candidates with 85% fluency; and the method provided by the disclosure generates 8 candidates with 93% fluency. Therefore, the text abstract generation method provided by the disclosure improves the diversity of abstract texts while maintaining high fluency.
Applying the text abstract generation method provided by the disclosure to the field of advertisement landing pages makes it possible to generate landing page link titles in different phrasing styles, which increases their appeal to users and improves the conversion rate of the advertisement landing pages.
Referring to fig. 3, fig. 3 is a block diagram of an apparatus for implementing a text summarization generating method according to an embodiment of the present disclosure, as shown in fig. 3, the apparatus may include:
an input module 301, configured to input a target text into a text abstract generation model;
the prediction module 302 is configured to, for the candidate word set output by the text abstract generation model in each round of word prediction, select from the candidate word set, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set;
and a splicing module 303, configured to sequentially splice the predicted words of each round in response to the selected predicted word being the end identifier, so as to obtain the abstract text of the target text.
In one embodiment of the present disclosure, the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
In one embodiment of the present disclosure, the apparatus shown in fig. 3 further includes a training module, configured to train the text abstract generation model in advance in the following manner:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model.
Referring to fig. 4, fig. 4 is a block diagram of an apparatus for implementing a text summarization model training method of an embodiment of the present disclosure, as shown in fig. 4, the apparatus may include:
a first obtaining module 401, configured to obtain an initial model;
a second obtaining module 402, configured to obtain a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
an iterative training module 403, configured to sequentially take each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively train the initial model;
and a judging module 404, configured to judge whether a termination condition of the model training is satisfied, and if so, determine the current model as a text abstract generating model.
In one embodiment of the present disclosure, the apparatus further includes a rewriting module, configured to obtain the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
The present disclosure provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text digest generation method and/or a text digest generation model training method.
The present disclosure provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a text summarization method and/or a text summarization model training method.
The present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements a text digest generation method and/or a text digest generation model training method.
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the apparatus 500 includes a computing unit 501 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as the text digest generation method. For example, in some embodiments, the text excerpt generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into RAM 503 and executed by computing unit 501, one or more steps of the text excerpt generation method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the text digest generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A text summary generation method, comprising:
inputting the target text into a text abstract generating model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
in response to the selected predicted word being an end identifier, splicing the predicted words of each round in turn to obtain the abstract text of the target text;
the text abstract generation model is trained in advance by adopting the following modes:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, if so, determining the current model as a text abstract generation model.
2. The method of claim 1, wherein,
the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
3. A text abstract generation model training method, comprising:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether a termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model, so that a target text can be input into the text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, the predicted word of the current round is selected from the target word set, and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text.
4. The method according to claim 3, further comprising obtaining the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
5. A text summary generation apparatus comprising:
the input module is used for inputting the target text into the text abstract generation model;
the prediction module is used for selecting, for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set in descending order of the prediction probability of each candidate word, and selecting the predicted word of the current round from the target word set;
the splicing module is used for splicing, in response to the selected predicted word being an end identifier, the predicted words of each round in turn to obtain the abstract text of the target text;
the training module is used for training the text abstract generation model in advance in the following way:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, if so, determining the current model as a text abstract generation model.
6. The apparatus of claim 5, wherein,
the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
7. A text summary generation model training apparatus, comprising:
the first acquisition module is used for acquiring an initial model;
the second acquisition module is used for acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
the iterative training module is used for sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model and iteratively training the initial model;
the judging module is used for judging whether a termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model, so that a target text can be input into the text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, the predicted word of the current round is selected from the target word set, and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text.
8. The apparatus of claim 7, further comprising: a rewriting module for:
obtaining the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-4.
CN202110963382.8A 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium Active CN113642324B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110963382.8A CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110963382.8A CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113642324A CN113642324A (en) 2021-11-12
CN113642324B true CN113642324B (en) 2024-02-09

Family

ID=78423280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110963382.8A Active CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113642324B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115600586B (en) * 2022-12-15 2023-04-11 阿里巴巴(中国)有限公司 Abstract text generation method, computing device and storage medium
CN116245088B (en) * 2022-12-27 2023-11-17 北京百度网讯科技有限公司 Decoding processing method, device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003056451A1 (en) * 2001-12-27 2003-07-10 Communications Research Laboratory,Independent Administrative Institution Text generating method and text generator
CN112883722A (en) * 2021-03-04 2021-06-01 中山大学 Distributed text summarization method based on cloud data center
CN113190675A (en) * 2021-05-12 2021-07-30 平安国际智慧城市科技股份有限公司 Text abstract generation method and device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11176330B2 (en) * 2019-07-22 2021-11-16 Advanced New Technologies Co., Ltd. Generating recommendation information
CN111339268B (en) * 2020-02-19 2023-08-15 北京百度网讯科技有限公司 Entity word recognition method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003056451A1 (en) * 2001-12-27 2003-07-10 Communications Research Laboratory,Independent Administrative Institution Text generating method and text generator
CN112883722A (en) * 2021-03-04 2021-06-01 中山大学 Distributed text summarization method based on cloud data center
CN113190675A (en) * 2021-05-12 2021-07-30 平安国际智慧城市科技股份有限公司 Text abstract generation method and device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Faster Compressed Top-k Document Retrieval; Wing-Kai Hon; IEEE; full text *
Text generation algorithm based on a variant probabilistic graph model; 刘廷镇; 张华; Computer Applications (Issue S1); full text *

Also Published As

Publication number Publication date
CN113642324A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN110019742B (en) Method and device for processing information
CN113642324B (en) Text abstract generation method and device, electronic equipment and storage medium
KR102573637B1 (en) Entity linking method and device, electronic equipment and storage medium
US11947920B2 (en) Man-machine dialogue method and system, computer device and medium
CN113836925B (en) Training method and device for pre-training language model, electronic equipment and storage medium
CN111709252B (en) Model improvement method and device based on pre-trained semantic model
US20230259712A1 (en) Sound effect adding method and apparatus, storage medium, and electronic device
CN111680517A (en) Method, apparatus, device and storage medium for training a model
CN112527383A (en) Method, apparatus, device, medium, and program for generating multitask model
CN113239157B (en) Method, device, equipment and storage medium for training conversation model
CN113590776A (en) Text processing method and device based on knowledge graph, electronic equipment and medium
CN112506359B (en) Method and device for providing candidate long sentences in input method and electronic equipment
CN112528641A (en) Method and device for establishing information extraction model, electronic equipment and readable storage medium
CN114399772B (en) Sample generation, model training and track recognition methods, devices, equipment and media
CN111538817B (en) Man-machine interaction method and device
CN113190746B (en) Recommendation model evaluation method and device and electronic equipment
CN112270169B (en) Method and device for predicting dialogue roles, electronic equipment and storage medium
CN117633356A (en) Resource processing method and device based on large language model, electronic equipment and medium
CN111522928A (en) Knowledge extraction method, device, equipment and medium
CN115130470B (en) Method, device, equipment and medium for generating text keywords
CN116049370A (en) Information query method and training method and device of information generation model
CN115203564A (en) Information flow recommendation method and device and computer program product
CN113254824B (en) Content determination method, device, medium, and program product
CN115905490A (en) Man-machine interaction dialogue method, device and equipment
CN113204616B (en) Training of text extraction model and text extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant