CN113642324B - Text abstract generation method and device, electronic equipment and storage medium - Google Patents

Text abstract generation method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN113642324B
CN113642324B (application CN202110963382.8A)
Authority
CN
China
Prior art keywords
text
abstract
sample
model
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110963382.8A
Other languages
Chinese (zh)
Other versions
CN113642324A (en)
Inventor
念天磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110963382.8A priority Critical patent/CN113642324B/en
Publication of CN113642324A publication Critical patent/CN113642324A/en
Application granted granted Critical
Publication of CN113642324B publication Critical patent/CN113642324B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a text abstract generation method and device, an electronic device, and a storage medium, and relates to the field of computer technology, in particular to the field of natural language processing. The specific implementation scheme is as follows: the target text is input into a text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, and the predicted word of the current round is selected from the target word set; and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text. The number of words in the target word set is dynamically adjusted with the probability distribution of the candidate words, and the predicted words are selected from the target word set, which improves the diversity of the abstract texts.

Description

Text abstract generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular, to the field of natural language processing technology.
Background
Text abstract generation is an important research field in natural language processing. It can convert a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news abstracts, and landing page abstracts.
Disclosure of Invention
The disclosure provides a text abstract generation method, apparatus, device, and storage medium.
According to an aspect of the present disclosure, there is provided a text abstract generation method, including: inputting a target text into a text abstract generation model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
and in response to the selected predicted word being an end identifier, splicing the predicted words of each round in turn to obtain the abstract text of the target text.
According to another aspect of the present disclosure, there is provided a text summary generation model training method including:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model.
According to another aspect of the present disclosure, there is provided a text digest generating apparatus including:
the input module is used for inputting the target text into the text abstract generation model;
the prediction module is used for selecting, for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set in descending order of the prediction probability of each candidate word, and selecting the predicted word of the current round from the target word set;
and the splicing module is used for splicing, in response to the selected predicted word being an end identifier, the predicted words of each round in turn to obtain the abstract text of the target text.
According to another aspect of the present disclosure, there is provided a text summary generation model training apparatus, including:
the first acquisition module is used for acquiring an initial model;
the second acquisition module is used for acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
the iterative training module is used for sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model and iteratively training the initial model;
and the judging module is used for judging whether the termination condition of model training is met, and if so, determining the current model as a text abstract generating model.
According to still another aspect of the present disclosure, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text digest generation method and/or a text digest generation model training method.
According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform a text digest generation method and/or a text digest generation model training method.
According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a text digest generation method and/or a text digest generation model training method.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart of a text summary generation method according to an embodiment of the disclosure;
FIG. 2 is a schematic flow chart of a training method for a text summary generation model according to an embodiment of the disclosure;
FIG. 3 is a block diagram of an apparatus for implementing a text excerpt generation method of an embodiment of the present disclosure;
FIG. 4 is a block diagram of an apparatus for implementing a text summarization model training method of an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing a text excerpt generation method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Text abstract generation is an important research field in natural language processing. It can convert a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news abstracts, and landing page abstracts.
In the related art, a longer text is input into a text abstract generation model, which computes the prediction result of each round of word prediction; a decoding module then selects a word with a relatively high probability from the prediction result as the predicted word, and the text abstract is obtained by splicing the predicted words of each round.
Decoding can be understood as selecting one word from a plurality of predicted words as the output word of the current round. The decoding algorithm of the decoding module directly affects the final output text digest; common decoding algorithms include greedy search, beam search, and top-k sampling.
The greedy search decoding algorithm selects the word with the highest conditional probability as the current output each time. The beam search decoding algorithm keeps several highest-probability paths during decoding and selects the single highest-probability path as the current output. The top-k sampling decoding algorithm takes the k words with the highest conditional probability, reforms them into a new probability distribution, and samples from that distribution as the current output.
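For illustration only, the following Python sketch (not part of the disclosure; the function names and the probability-vector representation are assumptions) shows roughly how greedy selection and top-k sampling operate on one round's prediction probabilities:

import numpy as np

def greedy_pick(probs):
    """Greedy search: always take the single most probable word index."""
    return int(np.argmax(probs))

def topk_sample(probs, k, rng=np.random.default_rng()):
    """Top-k sampling: keep the k most probable words, renormalize, sample."""
    top_idx = np.argsort(probs)[::-1][:k]           # indices of the k largest probabilities
    top_p = probs[top_idx] / probs[top_idx].sum()   # reform into a new probability distribution
    return int(rng.choice(top_idx, p=top_p))

probs = np.array([0.5, 0.3, 0.12, 0.05, 0.03])
print(greedy_pick(probs))        # always index 0
print(topk_sample(probs, k=3))   # index 0, 1, or 2, sampled by renormalized probability

Beam search is omitted from the sketch; it additionally keeps several partial sequences alive across rounds rather than a single word per round.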
With the above decoding techniques, the diversity of text abstracts generated by the first two decoding algorithms is insufficient. The top-k sampling decoding algorithm is relatively rigid: it is difficult to find a k value suitable for every scenario; a k value that is too large easily introduces low-probability words and makes the text less fluent, while a k value that is too small loses usable words. Moreover, the probability distribution at the decoding end changes with the input text, so it is difficult to determine a universally suitable k value.
For the text summary generation field, one typical application scenario is summary generation of landing pages. The landing page can be in various forms such as an advertisement landing page, an enterprise popularization landing page and the like.
Abstract extraction is generally performed on the text in a landing page, and the resulting abstract text is used as the title of the landing page's click-through link to attract users to click.
However, in the related art, after model training is completed, the number of abstract text candidates generated for a landing page is small and their phrasing style is uniform, which is not conducive to selecting diverse landing page titles.
To improve the diversity of abstract texts, the disclosure provides an improved text abstract generation method, device, electronic device, and storage medium.
In one embodiment of the present disclosure, a text summary generating method is provided, the method including:
inputting the target text into a text abstract generating model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
and when the selected predicted word is an end identifier, splicing the predicted words of each round in sequence to obtain the abstract text of the target text.
In this way, the number of words contained in the target word set is dynamically adjusted with the probability distribution of the candidate words, and in each round of word prediction one predicted word is selected from the several words contained in the target word set, so the diversity of the abstract text is ensured.
Moreover, the target word set is a small core subset composed of the higher-probability words; since predicted words are selected from this core subset, low-probability words are never selected, so the abstract text does not become incoherent. In addition, since words are selected in descending order of prediction probability, usable high-probability words are not lost.
The text abstract generating method, the device, the electronic equipment and the storage medium provided by the embodiment of the disclosure are respectively described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a text summary generating method according to an embodiment of the disclosure, as shown in fig. 1, the method may include the following steps:
s101: the target text is input into a text abstract generation model.
In the embodiment of the disclosure, the target text is text that needs summary generation, such as text in a landing page.
The text abstract generation model may be an autoregressive language model capable of predicting the next likely word based on the preceding text. The text abstract generation model comprises a prediction module and a decoding module: the prediction module outputs a candidate word set in each round of word prediction, and the decoding module selects one word from the candidate word set based on a decoding algorithm as the predicted word of the current round.
S102: for the candidate word set output by the text abstract generation model in each round of word prediction, select from the candidate word set, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set.
Here, a word may be a single character or a multi-character word.
In the embodiment of the disclosure, for a candidate word set output by a text abstract generation model in each round of word prediction, a target word set with the accumulated probability exceeding a preset threshold value is selected from the candidate word sets according to the descending order of the prediction probability of each candidate word.
As an example, suppose the preset threshold is 0.9. In a certain round of word prediction, the candidate word set is {A, B, C, D, E, ...} with prediction probabilities {0.5, 0.3, 0.12, 0.03, 0.012, ...}. Since 0.5+0.3 < 0.9 and 0.5+0.3+0.12 > 0.9, the target word set whose cumulative probability exceeds 0.9 is {A, B, C}, so the target word set in this round contains 3 words.
As another example, with the same preset threshold of 0.9, suppose the candidate word set in some round is {A, B, C, D, E, ...} with prediction probabilities {0.4, 0.3, 0.12, 0.09, 0.05, ...}. Since 0.4+0.3+0.12 < 0.9 and 0.4+0.3+0.12+0.09 > 0.9, the target word set selected in descending order of prediction probability is {A, B, C, D}, so the target word set in this round contains 4 words.
Therefore, the number of words in the target word set can be dynamically adjusted along with the probability distribution of the candidate words, and in each round of word prediction process, one predicted word is selected from a plurality of words contained in the target word set, so that the diversity of abstract texts can be ensured.
When the predicted word of the current round is selected from the target word set, it can be selected randomly, or it can be sampled according to the prediction probabilities.
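A minimal sketch of this selection step, assuming the candidate set of one round is given as a NumPy probability vector (the function name, the default threshold, and the sampling choice are illustrative assumptions, not part of the disclosure):

import numpy as np

def nucleus_pick(probs, threshold=0.9, rng=np.random.default_rng()):
    """Select the predicted word of the current round from a dynamically sized target word set."""
    order = np.argsort(probs)[::-1]                    # candidate indices in descending prediction probability
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, threshold)) + 1  # smallest prefix whose cumulative probability exceeds the threshold
    target_idx = order[:cutoff]                        # the target word set of this round
    target_p = probs[target_idx] / probs[target_idx].sum()
    return int(rng.choice(target_idx, p=target_p))     # sample according to the prediction probabilities

# First example from the text: threshold 0.9, probabilities {0.5, 0.3, 0.12, 0.03, 0.012}
probs = np.array([0.5, 0.3, 0.12, 0.03, 0.012])
# 0.5+0.3 < 0.9 and 0.5+0.3+0.12 > 0.9, so the target word set is the first three candidates
print(nucleus_pick(probs))

Unlike a fixed k, the size of target_idx changes from round to round with the shape of the probability distribution.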
S103: in response to the selected predicted word being an end identifier, splice the predicted words of each round in turn to obtain the abstract text of the target text.
In an embodiment of the present disclosure, the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
That is, in each round of word prediction, the previously predicted words serve as the preceding context, and the next word is predicted in combination with the target text.
When the predicted word selected in some round of word prediction is an end identifier, the end identifier indicates that the abstract text is complete, and the concatenation of the predicted words of all rounds is the abstract text of the target text.
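Putting S101 through S103 together, a hedged sketch of the overall generation loop might look as follows; the model interface (predict_candidates) and the end identifier token are assumptions, and nucleus_pick refers to the sketch above:

END_ID = "<eos>"  # assumed end identifier

def generate_abstract(model, target_text, threshold=0.9, max_rounds=64):
    predicted = []                                      # predicted words of each round, in order
    for _ in range(max_rounds):
        # each round conditions on the target text and the words selected in previous rounds
        words, probs = model.predict_candidates(target_text, predicted)
        word = words[nucleus_pick(probs, threshold)]    # decoding step from the sketch above
        if word == END_ID:                              # end identifier: the abstract is finished
            break
        predicted.append(word)
    return "".join(predicted)                           # splice the predicted words of each round in turn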
In this way, the number of words contained in the target word set is dynamically adjusted with the probability distribution of the candidate words, and in each round of word prediction one predicted word is selected from the several words contained in the target word set, so the diversity of the abstract text is ensured.
Moreover, the target word set is a small core subset composed of the higher-probability words; since predicted words are selected from this core subset, low-probability words are never selected, so the abstract text does not become incoherent. In addition, since words are selected in descending order of prediction probability, usable high-probability words are not lost.
The text abstract generation method overcomes the poor diversity of traditional decoding strategies, can be applied to an existing model without retraining, and is suitable for all abstract text generation tasks.
If the text abstract generation method provided by the embodiment of the disclosure is applied to abstract generation for landing pages, the target text is the text in the landing page, and the abstract text generated from it has better diversity; it can serve as the title of the landing page's click-through link and better attract users to click.
In one embodiment of the present disclosure, to further increase the diversity of the generated text abstracts, the text abstract generation model may be trained on multiple groups of sample text sets, each containing one sample target text and a plurality of sample abstract texts of different text styles.
Specifically, referring to fig. 2, fig. 2 is a schematic flow chart of a training method for a text summarization model according to an embodiment of the present disclosure, where the training method may include the following steps:
s201: an initial model is obtained.
In the embodiment of the present disclosure, an RNN (recurrent neural network) model, an encoder-decoder model, or the like may be selected as the initial model.
S202: acquiring a plurality of groups of sample text sets; wherein each sample text set contains one sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text.
In the embodiment of the disclosure, the sample abstract texts of multiple different text styles corresponding to a sample target text can be acquired as follows: acquire the sample target text and an initial abstract text corresponding to it; perform text style rewriting on the initial abstract text to obtain multiple rewritten abstract texts; and determine the initial abstract text and the rewritten abstract texts as the sample abstract texts of multiple different text styles corresponding to the sample target text.
The sample target text may be text in a landing page, and the initial summary text may be a title of the landing page.
Text style can also be understood as language organization style, or phrasing style, such as a succinct style, a rich style, a colloquial style, and so on.
To obtain abstract texts with different text styles as training samples for the model, the initial abstract text can be rewritten into different text styles based on a text rewriting algorithm or model; manual rewriting can also be used, taking care during rewriting to avoid semantic duplication among the results.
As an example, the sample target text is an introduction to the product "XX" in a landing page, and the initial abstract text is the link title of the landing page, for instance "Professional production of various types of XX". To enrich the phrasing styles of the landing page link title, the initial abstract text can be rewritten into different styles, such as "Professional XX solutions", "Professionally produced XX, factory-direct, inquiries welcome", or "High-quality XX at a more economical price". It can be seen that the rewritten abstract texts cover multiple text styles.
In the embodiment of the disclosure, the initial abstract text and the rewritten abstract text are used as sample abstract text of sample target text, that is, one sample target text corresponds to a plurality of sample abstract texts with different text styles.
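As a hedged illustration of how one sample text set could be assembled (the function name and the dictionary layout are assumptions; the style-rewriting callables stand in for whatever rewriting algorithm, model, or manual process is used):

def build_sample_set(sample_target_text, initial_abstract, rewrite_fns):
    """rewrite_fns: assumed style-rewriting callables, one per desired text style."""
    sample_abstracts = [initial_abstract]
    for rewrite in rewrite_fns:
        rewritten = rewrite(initial_abstract)       # text style rewriting
        if rewritten not in sample_abstracts:       # drop verbatim duplicates (semantic de-duplication would need a stronger check)
            sample_abstracts.append(rewritten)
    # one sample target text corresponds to several sample abstract texts of different styles
    return {"target_text": sample_target_text, "abstract_texts": sample_abstracts}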
S203: and sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model.
In the embodiment of the disclosure, in each round of iterative training, a sample target text and a sample abstract text corresponding to the sample target text are used as input to train an initial model.
Specifically, in each iteration training, a sample target text is input into an initial model, and model parameters are adjusted based on the difference between the output abstract text and the sample abstract text.
The process of training a model on a single sample target text and a single sample abstract text can be found in the related art.
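A framework-agnostic sketch of this iterative training, under the assumption that the initial model exposes a train_step method returning the loss of one (target text, abstract text) pair (the method name, iteration budget, and loss threshold are illustrative assumptions):

def train_summary_model(model, sample_sets, max_iterations=100000, loss_threshold=0.01):
    iteration = 0
    while iteration < max_iterations:                        # termination: iteration budget reached ...
        for sample in sample_sets:
            for abstract in sample["abstract_texts"]:        # each sample abstract text in turn forms one training sample
                loss = model.train_step(sample["target_text"], abstract)
                iteration += 1
                if loss < loss_threshold or iteration >= max_iterations:
                    return model                             # ... or loss below a preset threshold
    return model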
S204: determine whether a termination condition of model training is satisfied; if so, determine the current model as the text abstract generation model.
The termination condition for model training may be that the number of iterations reaches a preset number of iterations, or that the loss function value of the model is less than a preset threshold.
Because one sample target text corresponds to a plurality of sample abstract texts with different text styles, the text abstract generation model after training can learn different text styles after repeated iterative training, and the diversity of the output text abstract is further improved.
That is, after training is completed, when the target text is input into the text abstract generation model, the prediction module in the model can output, in each round of word prediction, a plurality of candidate words with a relatively flat probability distribution.
It should further be noted that in the present disclosure the decoding algorithm is optimized on one hand and the training data of the model is optimized on the other; the two aspects are organically combined and complement each other.
Specifically, the initial abstract text is rewritten in text style, so that a plurality of sample abstract texts with different text styles are obtained and used for training a model. The trained model can learn different text styles, so that a prediction module in the model can output a plurality of candidate words which are distributed flatly in each round of word prediction process, and the plurality of candidate words can belong to different text styles.
In each round of decoding process, a target word set with the accumulated probability exceeding a preset threshold value is determined, wherein the target word set is a small core subset composed of words with higher probability and different text styles, and then a predicted word is selected from the small core subset. It can be seen that if the target text is input into the digest generation model a plurality of times, a plurality of digest texts of different text styles can be generated.
As an example, if the training data of the model is not optimized, that is, if sample data of a single text style is used, the model learns only that single style and its output is relatively uniform: the candidate words output by the prediction module tend to follow a peaked distribution, for example with probabilities {0.7, 0.1, 0.08, ...}. In this case only the word with probability 0.7 is selected; the generated abstract text is coherent, but lacks diversity.
In the present disclosure, the text abstract generation model is trained with sample abstract texts of various text styles, so the model learns different text styles and the candidate words output by its prediction module are rich, i.e., their distribution is relatively flat, for example with probabilities {0.35, 0.3, 0.28, ...}. The decoding algorithm provided by the disclosure is then used to select the predicted word from these candidates. For example, with a preset probability threshold of 0.9, since 0.35+0.3 < 0.9 and 0.35+0.3+0.28 > 0.9, the first three candidate words form the target word set, and one of them is selected as the predicted word. This ensures the diversity of the generated text, and the generated abstract text remains coherent no matter which predicted word is selected.
Therefore, in the present disclosure, combining the optimizations of these two aspects achieves an effect where one plus one is greater than two, and the diversity of the generated abstract text is remarkably improved.
According to experimental measurement, for the same target text, the beam search decoding algorithm generates 1 abstract text candidate with 95% fluency; the top-k sampling decoding algorithm generates 6 candidates with 85% fluency; and the method provided by the disclosure generates 8 candidates with 93% fluency. Therefore, the text abstract generation method provided by the disclosure improves the diversity of abstract texts while maintaining high fluency.
Applying the text abstract generation method provided by the disclosure to the field of advertisement landing pages makes it possible to generate landing page link titles in different phrasing styles, which increases their appeal to users and improves the conversion rate of the advertisement landing pages.
Referring to fig. 3, fig. 3 is a block diagram of an apparatus for implementing a text summarization generating method according to an embodiment of the present disclosure, as shown in fig. 3, the apparatus may include:
an input module 301, configured to input a target text into a text abstract generation model;
the prediction module 302 is configured to, for the candidate word set output by the text abstract generation model in each round of word prediction, select from the candidate word set, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set;
and a splicing module 303, configured to sequentially splice the predicted words of each round in response to the selected predicted word being the end identifier, so as to obtain the abstract text of the target text.
In one embodiment of the present disclosure, the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
In one embodiment of the present disclosure, the apparatus shown in fig. 3 further includes a training module, configured to train the text abstract generation model in advance in the following manner:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model.
Referring to fig. 4, fig. 4 is a block diagram of an apparatus for implementing a text summarization model training method of an embodiment of the present disclosure, as shown in fig. 4, the apparatus may include:
a first obtaining module 401, configured to obtain an initial model;
a second obtaining module 402, configured to obtain a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
an iterative training module 403, configured to sequentially take each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively train the initial model;
and a judging module 404, configured to judge whether a termination condition of the model training is satisfied, and if so, determine the current model as a text abstract generating model.
In one embodiment of the present disclosure, the apparatus further includes a rewriting module, configured to obtain the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
The present disclosure provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text digest generation method and/or a text digest generation model training method.
The present disclosure provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a text summarization method and/or a text summarization model training method.
The present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements a text digest generation method and/or a text digest generation model training method.
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the apparatus 500 includes a computing unit 501 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as the text digest generation method. For example, in some embodiments, the text excerpt generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into RAM 503 and executed by computing unit 501, one or more steps of the text excerpt generation method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the text digest generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A text summary generation method, comprising:
inputting the target text into a text abstract generating model;
for the candidate word set output by the text abstract generation model in each round of word prediction, selecting, in descending order of the prediction probability of each candidate word, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set, and selecting the predicted word of the current round from the target word set;
in response to the selected predicted word being an end identifier, splicing the predicted words of each round in turn to obtain the abstract text of the target text;
the text abstract generation model is trained in advance by adopting the following modes:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, if so, determining the current model as a text abstract generation model.
2. The method of claim 1, wherein,
the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
3. A text abstract generation model training method, comprising:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether a termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model, so that a target text can be input into the text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, the predicted word of the current round is selected from the target word set, and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text.
4. The method according to claim 3, further comprising obtaining the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
5. A text summary generation apparatus comprising:
the input module is used for inputting the target text into the text abstract generation model;
the prediction module is used for selecting, for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold from the candidate word set in descending order of the prediction probability of each candidate word, and selecting the predicted word of the current round from the target word set;
the splicing module is used for splicing, in response to the selected predicted word being an end identifier, the predicted words of each round in turn to obtain the abstract text of the target text;
the training module is used for training the text abstract generation model in advance in the following way:
acquiring an initial model;
acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model;
judging whether the termination condition of model training is satisfied, if so, determining the current model as a text abstract generation model.
6. The apparatus of claim 5, wherein,
the candidate word set output by the text abstract generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.
7. A text summary generation model training apparatus, comprising:
the first acquisition module is used for acquiring an initial model;
the second acquisition module is used for acquiring a plurality of groups of sample text sets; wherein each sample text set comprises a sample target text and a plurality of sample abstract texts of different text styles corresponding to the sample target text;
the iterative training module is used for sequentially taking each sample target text and one sample abstract text corresponding to the sample target text as training samples of the initial model and iteratively training the initial model;
the judging module is used for judging whether a termination condition of model training is satisfied, and if so, determining the current model as the text abstract generation model, so that a target text can be input into the text abstract generation model; for the candidate word set output by the text abstract generation model in each round of word prediction, a target word set whose cumulative probability exceeds a preset threshold is selected from the candidate word set in descending order of the prediction probability of each candidate word, the predicted word of the current round is selected from the target word set, and in response to the selected predicted word being an end identifier, the predicted words of each round are spliced in turn to obtain the abstract text of the target text.
8. The apparatus of claim 7, further comprising: a rewriting module for:
obtaining the sample abstract texts of a plurality of different text styles corresponding to the sample target text in the following manner:
acquiring the sample target text and an initial abstract text corresponding to the sample target text;
performing text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts;
and determining the initial abstract text and the rewritten abstract text as sample abstract texts of a plurality of different text styles corresponding to the sample target text.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-4.
CN202110963382.8A 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium Active CN113642324B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110963382.8A CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110963382.8A CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113642324A CN113642324A (en) 2021-11-12
CN113642324B true CN113642324B (en) 2024-02-09

Family

ID=78423280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110963382.8A Active CN113642324B (en) 2021-08-20 2021-08-20 Text abstract generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113642324B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115600586B (en) * 2022-12-15 2023-04-11 阿里巴巴(中国)有限公司 Abstract text generation method, computing device and storage medium
CN116245088B (en) * 2022-12-27 2023-11-17 北京百度网讯科技有限公司 Decoding processing method, device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003056451A1 (en) * 2001-12-27 2003-07-10 Communications Research Laboratory,Independent Administrative Institution Text generating method and text generator
CN112883722A (en) * 2021-03-04 2021-06-01 中山大学 Distributed text summarization method based on cloud data center
CN113190675A (en) * 2021-05-12 2021-07-30 平安国际智慧城市科技股份有限公司 Text abstract generation method and device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11176330B2 (en) * 2019-07-22 2021-11-16 Advanced New Technologies Co., Ltd. Generating recommendation information
CN111339268B (en) * 2020-02-19 2023-08-15 北京百度网讯科技有限公司 Entity word recognition method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003056451A1 (en) * 2001-12-27 2003-07-10 Communications Research Laboratory,Independent Administrative Institution Text generating method and text generator
CN112883722A (en) * 2021-03-04 2021-06-01 中山大学 Distributed text summarization method based on cloud data center
CN113190675A (en) * 2021-05-12 2021-07-30 平安国际智慧城市科技股份有限公司 Text abstract generation method and device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Faster Compressed Top-k Document Retrieval; Wing-Kai Hon; IEEE; full text *
Text generation algorithm based on a variant probabilistic graph model; 刘廷镇; 张华; Computer Applications (Issue S1); full text *

Also Published As

Publication number Publication date
CN113642324A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN110019742B (en) Method and device for processing information
CN113642324B (en) Text abstract generation method and device, electronic equipment and storage medium
KR102573637B1 (en) Entity linking method and device, electronic equipment and storage medium
US11947920B2 (en) Man-machine dialogue method and system, computer device and medium
CN113836925B (en) Training method and device for pre-training language model, electronic equipment and storage medium
CN111709252B (en) Model improvement method and device based on pre-trained semantic model
US20230259712A1 (en) Sound effect adding method and apparatus, storage medium, and electronic device
CN111680517A (en) Method, apparatus, device and storage medium for training a model
CN112527383A (en) Method, apparatus, device, medium, and program for generating multitask model
CN113239157B (en) Method, device, equipment and storage medium for training conversation model
CN113590776A (en) Text processing method and device based on knowledge graph, electronic equipment and medium
CN112506359B (en) Method and device for providing candidate long sentences in input method and electronic equipment
CN112528641A (en) Method and device for establishing information extraction model, electronic equipment and readable storage medium
CN114399772B (en) Sample generation, model training and track recognition methods, devices, equipment and media
CN111538817B (en) Man-machine interaction method and device
CN113190746B (en) Recommendation model evaluation method and device and electronic equipment
CN112270169B (en) Method and device for predicting dialogue roles, electronic equipment and storage medium
CN117633356A (en) Resource processing method and device based on large language model, electronic equipment and medium
CN111522928A (en) Knowledge extraction method, device, equipment and medium
CN115130470B (en) Method, device, equipment and medium for generating text keywords
CN116049370A (en) Information query method and training method and device of information generation model
CN115203564A (en) Information flow recommendation method and device and computer program product
CN113254824B (en) Content determination method, device, medium, and program product
CN115905490A (en) Man-machine interaction dialogue method, device and equipment
CN113204616B (en) Training of text extraction model and text extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant