CN113407707A - Method and device for generating text abstract - Google Patents

Method and device for generating text abstract

Info

Publication number
CN113407707A
Authority
CN
China
Prior art keywords
text
text data
abstract
hidden layer
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010182475.2A
Other languages
Chinese (zh)
Inventor
李浩然
徐松
袁鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202010182475.2A
Publication of CN113407707A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and a device for generating a text abstract, and relates to the technical field of computers. One embodiment of the method comprises: encoding text data of the abstract to be generated to obtain a hidden layer sequence; and decoding the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract. The method avoids applying repeated attention to the same element, thereby reducing repeated descriptions of the same element in the abstract and the redundancy of the generated abstract, so that the generated text abstract is more concise and accurate and can cover more information.

Description

Method and device for generating text abstract
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for generating a text abstract.
Background
Automatic text summarization is a technology that automatically generates a short abstract from a detailed text description, based on natural language generation technology. Generally, an automatic text summarization model includes an encoder and a decoder. Given the detailed description of a text as input, the encoder encodes the detailed description to generate a hidden layer sequence; the decoder then generates the target summary word by word by means of an attention mechanism, using the hidden layer sequence.
To keep the abstract concise, its redundancy needs to be controlled and the generation of repeated content reduced, so that a short text abstract describes more information. The common practice is to use a coverage mechanism to reduce the repeated attention that the decoder applies to the source vocabulary, thereby reducing the generation of repeated words.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
existing coverage mechanisms are word-based, i.e., they reduce the generation of repeated words. In fact, different words may express the same semantic meaning; for example, the words "silent" and "low noise" express the same meaning. When the decoder generates "silent", the word-based coverage mechanism prevents "silent" from being generated again, but does not prevent "low noise" from being generated, so the decoder may still generate "low noise". As a result, both "silent" and "low noise" appear in the generated summary, causing information redundancy.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for generating a text abstract, which can avoid applying repeated attention to the same element, thereby reducing repeated descriptions of the same element in the abstract and reducing the redundancy of the generated abstract, so that the generated text abstract is more concise and accurate and can cover more information.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method of generating a text excerpt.
A method of generating a text excerpt, comprising: coding text data of the abstract to be generated to obtain a hidden layer sequence; and decoding the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract.
Optionally, the element dictionary is constructed by: classifying the text data according to the content characteristics of the text data; and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
Optionally, the element dictionary is constructed by: classifying the text data according to the category of the article; and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
Optionally, decoding the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text summary comprises: decoding the hidden layer sequence according to a preset element dictionary to obtain an element set; according to the element set and an element-based coverage mechanism, the attention history of each element in the element set is recorded, and the attention of the repeated elements is punished in a loss function to generate a text abstract.
Optionally, decoding the hidden layer sequence according to a preset element dictionary to obtain an element set includes: and decoding the hidden layer sequence by taking elements in a preset element dictionary as basic units to generate an element set element by element.
Optionally, the element-based coverage mechanism comprises the following formulas:

e_{t,i} = u_a^T tanh(U_a h_i + W_a s_t + V_a β̂^t_{a_j}), where x_i ∈ a_j;

α_{t,i} = softmax(e_{t,i}); c_t = Σ_i α_{t,i} h_i;

β^t_{a_j} = Σ_{x_k ∈ a_j} α_{t,k}; β̂^t_{a_j} = Σ_{t'=0}^{t-1} β^{t'}_{a_j};

wherein e_{t,i} is an intermediate variable; h_i is the hidden state of the ith word in the text data; s_t is the hidden layer state of the decoder at time t; u_a, U_a, W_a and V_a are model parameter matrices; x_k ∈ a_j denotes a word x_k belonging to element a_j; β^t_{a_j} denotes the attention paid to element a_j at time t; α_{t,i} denotes the attention paid to the ith word in the text data at time t; and β̂^t_{a_j} denotes the attention accumulated on element a_j before time t.
Optionally, the manner of penalizing the attention of the repeated elements is as follows:
L_β = Σ_t Σ_i min(α_{t,i}, β̂^t_{a_j}), where x_i ∈ a_j;

wherein L_β is the loss function penalizing the decoder's repeated attention; α_{t,i} denotes the attention paid to the ith word in the text data at time t; and β̂^t_{a_j} denotes the attention accumulated on element a_j before time t.
According to another aspect of the embodiments of the present invention, an apparatus for generating a text abstract is provided.
An apparatus for generating a text excerpt, comprising: the encoding module is used for encoding the text data of the abstract to be generated so as to obtain a hidden layer sequence; and the decoding module is used for decoding the hidden layer sequence according to a preset element dictionary and a coverage mechanism of elements so as to generate a text abstract.
According to another aspect of the embodiment of the invention, an electronic device for generating a text abstract is provided.
An electronic device that generates a text excerpt, comprising: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating the text abstract provided by the embodiment of the invention.
According to yet another aspect of embodiments of the present invention, a computer-readable medium is provided.
A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements a method of generating a text excerpt provided by an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: the text data of the summary to be generated is encoded to obtain a hidden layer sequence, and the hidden layer sequence is decoded according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract. Based on the element dictionary, the elements of the text data and their corresponding element words can be identified; based on the element-based coverage mechanism, multiple element words of the same element can be described as a cluster when generating the text abstract. This better avoids applying repeated attention to the same element, reduces repeated descriptions of the same element in the abstract, and reduces the redundancy of the generated abstract, so that the generated text abstract is more concise and accurate and can cover more information.
Further effects of the above non-conventional alternatives will be described below in connection with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a diagram illustrating the main steps of a method for generating a text abstract according to an embodiment of the present invention;
FIG. 2 is a flow diagram illustrating the generation of a text excerpt according to one embodiment of the present invention;
FIG. 3 is a schematic diagram of the main modules of an apparatus for generating a text summary according to an embodiment of the present invention;
FIG. 4 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 5 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In order to solve the problems in the prior art, the invention provides a method for reducing the redundancy rate of automatic summarization of commodities based on a coverage mechanism of elements. Based on the element dictionary, elements of the commodity can be identified; based on the coverage mechanism of the commodity elements, the repeated attention of the decoder to the vocabularies of which the source ends belong to the same commodity element can be reduced, and further the repeated description of the commodity abstract to the same commodity element is reduced.
Fig. 1 is a schematic diagram of main steps of a method for generating a text abstract according to an embodiment of the present invention. As shown in fig. 1, the method for generating a text abstract according to an embodiment of the present invention mainly includes the following steps S101 to S102.
Step S101: coding text data of the abstract to be generated to obtain a hidden layer sequence;
step S102: the hidden layer sequence is decoded according to an element dictionary and an element-based coverage mechanism to generate a summary of the text data.
According to an embodiment of the present invention, the element dictionary may be constructed, for example, in the following manner:
classifying the text data according to the content characteristics of the text data;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
The text data is of many kinds, and the text data can be classified according to its content characteristics, for example: the text data can be divided into poems, proses, novels, words and the like according to the subject matters of the text content; as another example, text data may be classified into news, finance, science and technology, sports, entertainment, and the like according to the content.
According to another embodiment of the present invention, the element dictionary may be constructed, for example, by:
classifying the text data according to the category of the article;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
In an embodiment of the present invention, taking as an example the generation of an abstract from the detailed description text of a commodity, the text data can be classified according to commodity category.
After the text data is classified, each type of text data can be labeled. Generally, labeling is performed by experts: given a batch of detailed commodity description texts, the expert responsible for labeling marks, sentence by sentence, the commodity elements and element words contained in the text. The labeled elements and their element words are then learned, and an element dictionary is summarized and generated. For example, for the text data "this mobile phone has good endurance", the element is "battery" and the element word is "endurance"; finally, an "element - element word" dictionary is formed. Further, a model for generating the element dictionary may be obtained by machine learning from previous expert labeling results and a semantic analysis tool, and the element dictionary may then be generated using the model.
In the embodiment of the present invention, the element dictionary is a dictionary of "element-element words" describing correspondence between elements and element words, and each element may have a plurality of corresponding element words. For example, for a commodity "mobile phone," its elements include "battery", "screen", and "memory", etc.; the element words corresponding to the element "battery" include "power consumption", "electric quantity" and "endurance", etc. In embodiments of the present invention, there may be an average of about 10 elements per commodity; and there may be 30-100 element words per element. And constructing an element dictionary of the commodity according to the elements and the element words corresponding to the commodity.
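As an illustration of the "element - element word" structure described above, the dictionary can be held in an ordinary mapping, with a reverse word-to-element index so the decoder can look up which element a source word belongs to. The sketch below reuses the mobile-phone example from the text; the "screen" and "memory" element words are made-up placeholders, not taken from the patent:

```python
# Hypothetical "element - element word" dictionary for the commodity "mobile phone".
# Only the "battery" entry comes from the text; the others are illustrative.
element_dict = {
    "battery": ["power consumption", "electric quantity", "endurance"],
    "screen": ["resolution", "display", "brightness"],
    "memory": ["storage", "RAM", "capacity"],
}

# Invert it: map each element word back to its element,
# which is the lookup the element-based coverage mechanism needs.
word_to_element = {
    word: element
    for element, words in element_dict.items()
    for word in words
}

print(word_to_element["endurance"])  # -> battery
```

In practice each commodity may have about 10 elements with 30-100 element words each, as the text notes, but the structure is the same.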
After the element dictionary is obtained, the text data of the summary to be generated is encoded to obtain a hidden layer sequence. For encoding, a piece of detailed commodity description text data {x_1, x_2, …, x_n} is input, where each x_i (i ∈ {1, 2, …, n}) is a word obtained by segmenting the detailed description text with a word segmentation tool (such as the Stanford Chinese word segmentation tool). A bidirectional LSTM (Long Short-Term Memory) encoder f_enc then encodes the detailed description text data to generate a hidden layer sequence {h_1, h_2, …, h_n}, using the formula:

h_i = f_enc(x_i, h_{i-1})
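The encoding step can be sketched as follows. This is a toy stand-in, not the patent's implementation: a plain tanh recurrence replaces the LSTM cell, the forward and backward passes are concatenated to mimic the bidirectional encoder, and all dimensions and weights are arbitrary:

```python
import numpy as np

def rnn_encode(x_embeds, W, U):
    """Toy recurrence h_i = tanh(W x_i + U h_{i-1}), a simplified
    stand-in for the f_enc recurrence in the text (not a real LSTM cell)."""
    h = np.zeros(U.shape[0])
    out = []
    for x in x_embeds:
        h = np.tanh(W @ x + U @ h)
        out.append(h)
    return out

def birnn_encode(x_embeds, Wf, Uf, Wb, Ub):
    """Bidirectional variant: run the sequence forward and backward,
    then concatenate the two hidden states for each word."""
    fwd = rnn_encode(x_embeds, Wf, Uf)
    bwd = rnn_encode(x_embeds[::-1], Wb, Ub)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
xs = rng.normal(size=(4, 5))            # n = 4 words, embedding dim 5 (toy)
Wf, Wb = rng.normal(size=(2, 3, 5)) * 0.1
Uf, Ub = rng.normal(size=(2, 3, 3)) * 0.1
hs = birnn_encode(xs, Wf, Uf, Wb, Ub)   # hidden layer sequence {h_1, ..., h_n}
print(len(hs), hs[0].shape)             # 4 hidden states, each of dim 3 + 3 = 6
```

Each h_i here plays the role of the hidden state of the ith word used later by the attention mechanism.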
after the hidden layer sequence is obtained, the hidden layer sequence can be decoded to generate a text summary. In the embodiment of the invention, the decoding is carried out according to the element dictionary and the element-based coverage mechanism, and compared with the technical scheme of decoding according to the word and the word-based coverage mechanism in the prior art, the method and the device can better avoid applying repeated attention to the same element, further reduce repeated description of the same element in the abstract and reduce the redundancy of the generated abstract.
According to the embodiment of the present invention, decoding the hidden layer sequence according to the preset element dictionary and the element-based coverage mechanism to generate the text summary may be performed as follows:
decoding the hidden layer sequence according to a preset element dictionary to obtain an element set;
according to the element set and the element-based coverage mechanism, the attention history of each element in the element set is recorded, and the repeated elements are punished in a loss function to generate a text abstract.
In an embodiment of the invention, a unidirectional LSTM decoder f_dec uses the hidden layer sequence to generate the summary word by word through the attention mechanism, according to the formula:

s_t = f_dec(s_{t-1}, y_{t-1}, c_t);

wherein {s_1, s_2, …, s_m} is the decoder hidden layer sequence, {y_1, y_2, …, y_m} is the generated summary, and c_t is the context vector at time t, generated by the attention mechanism.
In order to solve the technical problem proposed by the present invention, the present invention proposes an element-based coverage mechanism, which records the attention history of the decoder on the elements and penalizes the repeated attention of the elements of the decoder in a loss function, thereby reducing the generation of the repeated elements. When decoding the hidden layer sequence according to the preset element dictionary to obtain the element set, the method specifically may be: and decoding the hidden layer sequence by taking the elements in the preset element dictionary as basic units to generate an element set one by one. Therefore, a plurality of element words of the same element can be clustered to generate the text abstract.
The element-based coverage mechanism of the invention is formulated as follows:

e_{t,i} = u_a^T tanh(U_a h_i + W_a s_t + V_a β̂^t_{a_j}), where x_i ∈ a_j;

α_{t,i} = softmax(e_{t,i});

c_t = Σ_i α_{t,i} h_i;

β^t_{a_j} = Σ_{x_k ∈ a_j} α_{t,k};

β̂^t_{a_j} = Σ_{t'=0}^{t-1} β^{t'}_{a_j};

wherein x_k ∈ a_j denotes a word x_k belonging to element a_j; β^t_{a_j} denotes the attention paid to element a_j at time t; α_{t,i} denotes the attention paid to the ith word in the text data at time t; β̂^t_{a_j} denotes the attention accumulated on element a_j before time t; e_{t,i} is an intermediate variable; h_i is the hidden state of the ith word in the input text data; s_t is the hidden layer state of the decoder at time t; and u_a, U_a, W_a and V_a are model parameter matrices.
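One decoder attention step under this mechanism can be sketched in plain NumPy. This is an illustrative reading of the formulas, not the patent's code: V_a is treated as a vector scaled by the scalar accumulated coverage β̂^t_{a_j}, words outside any element get zero coverage, and all shapes are toy sizes:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def element_coverage_attention(h, s_t, beta_hat, word_elem, params):
    """One attention step with the element-based coverage term.
    h: (n, d_h) encoder hidden states; s_t: decoder hidden state at time t;
    beta_hat: dict element -> attention accumulated before time t;
    word_elem: list mapping word index i -> its element a_j (or None).
    Parameter names u_a, U_a, W_a, V_a mirror the text; shapes are assumptions."""
    u_a, U_a, W_a, V_a = params
    n = h.shape[0]
    e = np.empty(n)
    for i in range(n):
        cov = beta_hat.get(word_elem[i], 0.0)  # accumulated coverage of x_i's element
        e[i] = u_a @ np.tanh(U_a @ h[i] + W_a @ s_t + V_a * cov)
    alpha = softmax(e)          # alpha_{t,i}
    c_t = alpha @ h             # context vector c_t = sum_i alpha_{t,i} h_i
    # beta^t_{a_j} = sum of alpha over the words belonging to element a_j
    beta_t = {}
    for i, a in enumerate(word_elem):
        if a is not None:
            beta_t[a] = beta_t.get(a, 0.0) + alpha[i]
    # fold into the accumulator for the next decoding step
    new_beta_hat = dict(beta_hat)
    for a, b in beta_t.items():
        new_beta_hat[a] = new_beta_hat.get(a, 0.0) + b
    return alpha, c_t, beta_t, new_beta_hat

rng = np.random.default_rng(1)
h = rng.normal(size=(4, 6))
s = rng.normal(size=(3,))
params = (rng.normal(size=5),
          rng.normal(size=(5, 6)),
          rng.normal(size=(5, 3)),
          rng.normal(size=5))
word_elem = ["battery", "battery", "screen", None]
alpha, c_t, beta_t, beta_hat = element_coverage_attention(h, s, {}, word_elem, params)
print(round(float(alpha.sum()), 6))  # attention weights sum to 1
```

Because β̂ is keyed by element rather than by word, attention accumulated on "endurance" also raises the coverage seen by "power consumption", which is the point of the element-based mechanism.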
Finally, using the decoder hidden state and the context vector, the decoder generates for each word w in the detailed description text data the probability that w is selected as a summary word, according to the formula:

P(w) = softmax(W_b s_t + V_b c_t);

wherein W_b and V_b are model parameter matrices.
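A minimal sketch of this output projection, with toy dimensions and random stand-ins for the learned parameters W_b and V_b:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

# P(w) = softmax(W_b s_t + V_b c_t): project the decoder state and the
# context vector onto the vocabulary (all sizes are illustrative).
rng = np.random.default_rng(2)
vocab_size = 8
s_t = rng.normal(size=3)    # decoder hidden state
c_t = rng.normal(size=6)    # context vector from the attention step
W_b = rng.normal(size=(vocab_size, 3))
V_b = rng.normal(size=(vocab_size, 6))
P = softmax(W_b @ s_t + V_b @ c_t)
print(round(float(P.sum()), 6))  # a valid probability distribution over the vocabulary
```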
The model is trained by maximum likelihood, with the loss function:

L = -Σ_t log P(y_t);

Based on the element coverage, the invention penalizes the decoder's repeated element attention in the loss function, thereby reducing the generation of repeated elements. The final loss function is:

L_β = Σ_t Σ_i min(α_{t,i}, β̂^t_{a_j}), where x_i ∈ a_j;

L_final = L + L_β;

wherein L_β is the loss function penalizing the decoder's repeated attention, and L_final is the final loss function of the model.
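The penalty term L_β can be sketched as follows. This is an illustrative reading of the formulas above, not the patent's implementation: at each decoding step, attention on a word is penalized by min(α_{t,i}, β̂^t_{a_j}) against its element's accumulated attention, and the per-step element totals β^t are then folded into the accumulator; the function name and data layout are assumptions:

```python
import numpy as np

def coverage_penalty(alphas, word_elem):
    """L_beta = sum_t sum_i min(alpha_{t,i}, beta_hat^t_{a_j}) with x_i in a_j.
    alphas: (T, n) attention over n source words at each of T decoding steps;
    word_elem: list mapping word index i -> its element (or None)."""
    T, n = alphas.shape
    beta_hat = {}   # attention accumulated per element before time t
    loss = 0.0
    for t in range(T):
        # penalize attention that revisits an already-attended element
        for i in range(n):
            a = word_elem[i]
            if a is not None:
                loss += min(alphas[t, i], beta_hat.get(a, 0.0))
        # beta^t_{a_j} = sum of alpha over the element's words; accumulate it
        for i in range(n):
            a = word_elem[i]
            if a is not None:
                beta_hat[a] = beta_hat.get(a, 0.0) + alphas[t, i]
    return loss

# Two steps that both put all attention on "battery" words: no penalty at
# t = 0 (nothing accumulated yet), full penalty at t = 1.
alphas = np.array([[0.5, 0.5, 0.0],
                   [0.5, 0.5, 0.0]])
word_elem = ["battery", "battery", None]
print(coverage_penalty(alphas, word_elem))  # -> 1.0
```

Note that repeating attention on a *different word of the same element* is penalized just as hard, which a word-based coverage loss would miss.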
Compared with the existing word-based coverage mechanism, the element-based coverage mechanism can perform cluster description on a plurality of element words of the same element to generate the text abstract, thereby avoiding repeated description on the same element, enabling the generated text abstract to be more concise and accurate, and covering more information.
Fig. 2 is a flow chart illustrating the generation of a text abstract according to an embodiment of the present invention. As shown in fig. 2, in an embodiment of the present invention, taking the generation of the corresponding abstract according to the detailed description text of different products as an example, the process of generating the text abstract mainly includes the following steps:
step S201: aiming at different commodity types, constructing an element dictionary of the commodity so as to record the corresponding relation between elements and element words;
step S202: encoding the detailed description text of the commodity by using a bidirectional LSTM encoder to obtain a hidden layer sequence;
step S203: and integrating the attention history of the elements of the commodity into a decoder attention calculation process, decoding the hidden layer sequence by using a unidirectional LSTM decoder, punishing the attention of repeated elements and generating a text abstract.
Fig. 3 is a schematic diagram of main blocks of an apparatus for generating a text summary according to an embodiment of the present invention. As shown in fig. 3, an apparatus 300 for generating a text abstract according to an embodiment of the present invention mainly includes an encoding module 301 and a decoding module 302.
The encoding module 301 is configured to encode text data of a to-be-generated abstract to obtain a hidden layer sequence;
a decoding module 302, configured to decode the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a summary of the text data.
According to an embodiment of the present invention, the element dictionary may be constructed by:
classifying the text data according to the content characteristics of the text data;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
According to another embodiment of the present invention, the element dictionary may be constructed by:
classifying the text data according to the category of the article;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
According to yet another embodiment of the present invention, the decoding module 302 may be further configured to:
decoding the hidden layer sequence according to a preset element dictionary to obtain an element set;
according to the element set and an element-based coverage mechanism, the attention history of each element in the element set is recorded, and the attention of the repeated elements is punished in a loss function to generate a text abstract.
According to another embodiment of the present invention, the decoding module 302, when decoding the hidden layer sequence according to a preset element dictionary to obtain an element set, may further be configured to:
and decoding the hidden layer sequence by taking elements in a preset element dictionary as basic units to generate an element set element by element.
According to an embodiment of the present invention, the element-based coverage mechanism comprises the following formulas:

e_{t,i} = u_a^T tanh(U_a h_i + W_a s_t + V_a β̂^t_{a_j}), where x_i ∈ a_j;

α_{t,i} = softmax(e_{t,i}); c_t = Σ_i α_{t,i} h_i;

β^t_{a_j} = Σ_{x_k ∈ a_j} α_{t,k}; β̂^t_{a_j} = Σ_{t'=0}^{t-1} β^{t'}_{a_j};

wherein e_{t,i} is an intermediate variable; h_i is the hidden state of the ith word in the text data; s_t is the hidden layer state of the decoder at time t; u_a, U_a, W_a and V_a are model parameter matrices; x_k ∈ a_j denotes a word x_k belonging to element a_j; β^t_{a_j} denotes the attention paid to element a_j at time t; α_{t,i} denotes the attention paid to the ith word in the text data at time t; and β̂^t_{a_j} denotes the attention accumulated on element a_j before time t.
According to a further embodiment of the invention, the manner of penalizing the attention of a repeated element is as follows:
L_β = Σ_t Σ_i min(α_{t,i}, β̂^t_{a_j}), where x_i ∈ a_j;

wherein L_β is the loss function penalizing the decoder's repeated attention; α_{t,i} denotes the attention paid to the ith word in the text data at time t; and β̂^t_{a_j} denotes the attention accumulated on element a_j before time t.
According to the technical scheme of the embodiment of the invention, the text data of the abstract to be generated is encoded to obtain a hidden layer sequence, and the hidden layer sequence is decoded according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract. Based on the element dictionary, the elements of the text data and their corresponding element words can be identified; based on the element-based coverage mechanism, multiple element words of the same element can be described as a cluster when generating the text abstract. This better avoids applying repeated attention to the same element, reduces repeated descriptions of the same element in the abstract, and reduces the redundancy of the generated abstract, so that the generated text abstract is more concise and accurate and can cover more information.
Fig. 4 illustrates an exemplary system architecture 400 of a method of generating a text excerpt or an apparatus for generating a text excerpt to which embodiments of the present invention may be applied.
As shown in fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 serves as a medium for providing communication links between the terminal devices 401, 402, 403 and the server 405. Network 404 may include various types of connections, such as wire, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal devices 401, 402, 403 to interact with a server 405 over a network 404 to receive or send messages or the like. The terminal devices 401, 402, 403 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 401, 402, 403 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 405 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 401, 402, 403. The backend management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (for example, target push information, product information — just an example) to the terminal device.
It should be noted that the method for generating the text abstract provided by the embodiment of the present invention is generally executed by the server 405, and accordingly, the apparatus for generating the text abstract is generally disposed in the server 405.
It should be understood that the number of terminal devices, networks, and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 5, a block diagram of a computer system 500 suitable for use with a terminal device or server implementing an embodiment of the invention is shown. The terminal device or the server shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU)501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. The driver 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is mounted into the storage section 508 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 501.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present invention may be implemented by software, or may be implemented by hardware. The described units or modules may also be provided in a processor, and may be described as: a processor includes an encoding module and a decoding module. The names of these units or modules do not in some cases form a limitation to the units or modules themselves, and for example, an encoding module may also be described as a "module for encoding text data to be summarized to obtain a hidden layer sequence".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: encode text data for which an abstract is to be generated to obtain a hidden layer sequence; and decode the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract.
According to the technical scheme of the embodiments of the present invention, a hidden layer sequence is obtained by encoding the text data for which an abstract is to be generated, and the hidden layer sequence is decoded according to the element dictionary and an element-based coverage mechanism to generate a text abstract. The elements of the text data and their corresponding element words are identified based on the element dictionary, and the element-based coverage mechanism allows multiple element words of the same element to be clustered and described together when generating the abstract. This better avoids applying repeated attention to the same element, reduces repeated description of the same element in the abstract, and lowers the redundancy of the generated abstract, so that the generated text abstract is more concise and accurate and can cover more information.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for generating a text summary, comprising:
encoding text data for which an abstract is to be generated to obtain a hidden layer sequence;
and decoding the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract.
2. The method according to claim 1, wherein the element dictionary is constructed by:
classifying the text data according to the content characteristics of the text data;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
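Claims 2 and 3 describe classifying the text data and then attaching a per-class element dictionary. The following is a minimal, hypothetical sketch of that construction; the categories, keywords, elements, and element words are illustrative placeholders, and the hand-written mapping stands in for what the patent obtains "by labeling and learning each type of text data".

```python
# Hypothetical sketch: classify texts by content features (here, simple
# keyword matching), then attach that class's element -> element-word mapping.
# All names and data below are illustrative, not from the patent.

def classify(text, category_keywords):
    """Assign a text to the first category whose keyword it contains."""
    for category, keywords in category_keywords.items():
        if any(kw in text for kw in keywords):
            return category
    return "other"

def build_element_dictionary(texts, category_keywords, labeled_elements):
    """Group texts by category and attach that category's element mapping."""
    element_dict = {}
    for text in texts:
        category = classify(text, category_keywords)
        # In practice this mapping would come from labeling and learning;
        # here it is a hand-written stand-in.
        element_dict[category] = labeled_elements.get(category, {})
    return element_dict

category_keywords = {"apparel": ["cotton", "sleeve"], "electronics": ["battery"]}
labeled_elements = {
    "apparel": {"fabric": ["cotton", "linen"], "style": ["sleeve", "collar"]},
    "electronics": {"power": ["battery", "charger"]},
}
texts = ["soft cotton shirt with long sleeve", "phone with large battery"]
element_dict = build_element_dictionary(texts, category_keywords, labeled_elements)
```

The dictionary maps each element (e.g. "fabric") to its element words, which is the grouping the decoder later uses to cluster attention.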
3. The method according to claim 1, wherein the element dictionary is constructed by:
classifying the text data according to the category of the article;
and constructing an element dictionary corresponding to the text data by labeling and learning each type of text data.
4. The method of claim 1, wherein decoding the hidden layer sequence to generate a text summary according to a pre-defined element dictionary and an element-based coverage mechanism comprises:
decoding the hidden layer sequence according to a preset element dictionary to obtain an element set;
recording, according to the element set and an element-based coverage mechanism, the attention history of each element in the element set, and penalizing attention to repeated elements in a loss function, to generate a text abstract.
5. The method of claim 4, wherein decoding the hidden layer sequence according to a pre-defined element dictionary to obtain a set of elements comprises:
and decoding the hidden layer sequence by taking elements in a preset element dictionary as basic units to generate an element set element by element.
6. The method of claim 1 or 4, wherein the element-based coverage mechanism comprises the following formulas:

$e_{t,i} = u_a^\top \tanh(W_a s_t + U_a h_i + V_a \hat{\beta}_t^{a_j})$;

$\alpha_{t,i} = \mathrm{softmax}(e_{t,i})$;

$c_t = \sum_i \alpha_{t,i} h_i$;

$\beta_t^{a_j} = \sum_{x_k \in a_j} \alpha_{t,k}$;

$\hat{\beta}_t^{a_j} = \sum_{t'=0}^{t-1} \beta_{t'}^{a_j}$;

wherein $e_{t,i}$ is an intermediate variable; $h_i$ is the hidden state of the $i$-th word in the text data; $s_t$ is the hidden layer state of the decoder at time $t$; $u_a$, $U_a$, $W_a$ and $V_a$ are model parameter matrices; $x_k \in a_j$ denotes a word $x_k$ belonging to element $a_j$; $\beta_t^{a_j}$ denotes the attention to element $a_j$ at time $t$; $\alpha_{t,i}$ denotes the attention to the $i$-th word in the text data at time $t$; and $\hat{\beta}_t^{a_j}$ denotes the attention to element $a_j$ accumulated before time $t$.
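The attention formulas of claim 6 can be sketched numerically as follows. This is a toy, non-authoritative implementation: the parameter matrices are random stand-ins for learned weights, the element grouping and accumulated attentions are made up, and the exact placement of the coverage term $V_a \hat{\beta}_t^{a_j}$ inside the additive score is an assumption following the standard coverage-attention form.

```python
import numpy as np

# Toy sketch of element-based coverage attention at one decoding step t.
# Random weights stand in for the learned parameters W_a, U_a, V_a, u_a.
rng = np.random.default_rng(0)
d = 4            # hidden size
n = 5            # number of source words
h = rng.normal(size=(n, d))          # h_i: encoder hidden states
s_t = rng.normal(size=d)             # s_t: decoder hidden state at step t
W_a = rng.normal(size=(d, d))
U_a = rng.normal(size=(d, d))
V_a = rng.normal(size=d)             # weights the scalar coverage feature
u_a = rng.normal(size=d)

elements = {"a0": [0, 1], "a1": [2, 3, 4]}    # word indices per element (assumed)
beta_hat = {"a0": 0.3, "a1": 0.1}             # accumulated attention before t (assumed)

# Per-word coverage feature: the accumulated attention of the word's element.
cov = np.empty(n)
for a_j, idxs in elements.items():
    cov[idxs] = beta_hat[a_j]

# e_{t,i} = u_a^T tanh(W_a s_t + U_a h_i + V_a * cov_i)
e = np.array([u_a @ np.tanh(W_a @ s_t + U_a @ h[i] + V_a * cov[i])
              for i in range(n)])
alpha = np.exp(e - e.max())
alpha /= alpha.sum()                  # alpha_{t,i} = softmax(e_{t,i})
c_t = alpha @ h                       # context vector c_t = sum_i alpha_{t,i} h_i

# beta_t^{a_j}: attention to element a_j, summed over its member words
beta = {a_j: float(alpha[idxs].sum()) for a_j, idxs in elements.items()}
```

Because the words are partitioned into elements, the element attentions `beta` sum to 1 just as the word attentions do, which is what lets the mechanism track coverage per element rather than per word.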
7. The method of claim 4, wherein the attention to repeated elements is penalized as follows:

$L_\beta = \sum_t \sum_i \min(\alpha_{t,i}, \hat{\beta}_t^{a_j})$;

wherein $L_\beta$ is the loss function penalizing the decoder's repeated attention; $\alpha_{t,i}$ denotes the attention to the $i$-th word in the text data at time $t$; and $\hat{\beta}_t^{a_j}$ denotes the attention to element $a_j$ accumulated before time $t$.
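The penalty of claim 7 can be sketched for a single decoding step as below. This follows the standard coverage-loss pattern of taking the minimum of current attention and accumulated coverage; the element grouping and the numbers are illustrative assumptions, not values from the patent.

```python
# Hedged sketch of the repeated-attention penalty at one decoding step:
# each word's attention is compared against the accumulated attention of
# the element it belongs to, and the minimum is summed.

def coverage_loss(alpha, beta_hat, word_to_element):
    """L_beta at step t: sum_i min(alpha_{t,i}, beta_hat_t^{a_j(i)})."""
    return sum(min(a, beta_hat[word_to_element[i]])
               for i, a in enumerate(alpha))

alpha = [0.5, 0.2, 0.3]                       # attention over 3 source words
word_to_element = {0: "a0", 1: "a0", 2: "a1"} # illustrative element grouping
beta_hat = {"a0": 0.6, "a1": 0.1}             # accumulated element attention
loss = coverage_loss(alpha, beta_hat, word_to_element)
# min(0.5, 0.6) + min(0.2, 0.6) + min(0.3, 0.1) = 0.8
```

The `min` keeps the penalty small whenever either the current attention or the element's accumulated attention is small, so the loss only grows when the decoder re-attends to an element it has already covered.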
8. An apparatus for generating a text excerpt, comprising:
an encoding module, configured to encode text data for which an abstract is to be generated to obtain a hidden layer sequence; and
a decoding module, configured to decode the hidden layer sequence according to a preset element dictionary and an element-based coverage mechanism to generate a text abstract.
9. An electronic device for generating a text summary, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202010182475.2A 2020-03-16 2020-03-16 Method and device for generating text abstract Pending CN113407707A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010182475.2A CN113407707A (en) 2020-03-16 2020-03-16 Method and device for generating text abstract


Publications (1)

Publication Number Publication Date
CN113407707A 2021-09-17

Family

ID=77676627


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108427771A (en) * 2018-04-09 2018-08-21 腾讯科技(深圳)有限公司 Summary texts generation method, device and computer equipment
US20180357225A1 (en) * 2017-06-13 2018-12-13 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for generating chatting data based on artificial intelligence, computer device and computer-readable storage medium
CN109145105A (en) * 2018-07-26 2019-01-04 福州大学 A kind of text snippet model generation algorithm of fuse information selection and semantic association
CN109376234A (en) * 2018-10-10 2019-02-22 北京京东金融科技控股有限公司 A kind of method and apparatus of trained summarization generation model
CN109508400A (en) * 2018-10-09 2019-03-22 中国科学院自动化研究所 Picture and text abstraction generating method
CN109657051A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 Text snippet generation method, device, computer equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI, H.R. et al.: "GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization", Lecture Notes in Artificial Intelligence, 31 December 2016 (2016-12-31) *
XU, Ruyang; ZENG, Biqing; HAN, Xuli; ZHOU, Wu: "Reinforced Automatic Summarization Model with Convolutional Self-Attention Encoding and Filtering", Journal of Chinese Computer Systems, no. 02, 15 February 2020 (2020-02-15) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination