CN111078865B - Text title generation method and device - Google Patents

Text title generation method and device

Info

Publication number
CN111078865B
CN111078865B CN201911361394.2A CN201911361394A CN111078865B CN 111078865 B CN111078865 B CN 111078865B CN 201911361394 A CN201911361394 A CN 201911361394A CN 111078865 B CN111078865 B CN 111078865B
Authority
CN
China
Prior art keywords
word
vector
sequence
title
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911361394.2A
Other languages
Chinese (zh)
Other versions
CN111078865A (en
Inventor
陈亮宇
刘家辰
肖欣延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911361394.2A priority Critical patent/CN111078865B/en
Publication of CN111078865A publication Critical patent/CN111078865A/en
Application granted granted Critical
Publication of CN111078865B publication Critical patent/CN111078865B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users

Abstract

The application discloses a text title generation method and device, relating to the field of data processing in computer technology. The method comprises the following steps: extracting a word vector of each word in a word sequence contained in a target text, and encoding the word vectors according to a first preset model to generate a word encoding vector of each word; calculating a first attention probability corresponding to the word encoding vector of each word according to the first preset model; calculating the word encoding vector of each word according to a second preset model to generate a second attention probability corresponding to the word encoding vector of each word; calculating the word encoding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence; and decoding the context vector according to the first preset model to generate a decoded word sequence, and obtaining a text title corresponding to the target text from the decoded word sequence. The text title is thus generated accurately, and the generated title is guaranteed to be highly associated with the content of the target text.

Description

Text title generation method and device
Technical Field
The present application relates to the field of data processing in computer technologies, and in particular, to a method and an apparatus for generating a text title.
Background
The title is one of the most important parts of an image-and-text article, and writing it is one of the most attention-demanding, difficult and time-consuming steps in an author's creation process. In the related art, the title of a text is generated by a statistical model: the keywords contained in the text are counted, and a preset database is queried with the keywords to obtain a corresponding title.
However, because this method obtains the title by looking up the correspondence for the keywords, different articles that share the same keywords are given the same title. The generated titles are therefore monotonous and inaccurate, and their relevance to the text content is weak.
Disclosure of Invention
A first object of the present application is to provide a text title generating method.
A second object of the present application is to provide a text title generating apparatus.
A third object of the present application is to provide an electronic device.
A fourth object of the present application is to propose a non-transitory computer readable storage medium storing computer instructions.
To achieve the above object, an embodiment of a first aspect of the present application provides a text title generation method, including: extracting a word vector of each word in a word sequence contained in a target text, and coding the word vector according to a first preset model to generate a word coding vector of each word; calculating a first attention probability corresponding to the word code vector of each word according to the first preset model; calculating the word code vector of each word according to a second preset model, and generating a second attention probability corresponding to the word code vector of each word; calculating the word coding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence; decoding the context vector according to the first preset model to generate a decoding word sequence, and acquiring a text title corresponding to the target text according to the decoding word sequence.
To achieve the above object, an embodiment of a second aspect of the present application provides a text title generating apparatus, including: the first generation module is used for extracting a word vector of each word in a word sequence contained in the target text and generating a word coding vector of each word by coding the word vector according to a first preset model; the calculation module is used for calculating a first attention probability corresponding to the word coding vector of each word according to the first preset model; the second generation module is used for calculating the word coding vector of each word according to a second preset model and generating a second attention probability corresponding to the word coding vector of each word; a first obtaining module, configured to calculate a word encoding vector of each word according to the first attention probability and the second attention probability, and obtain at least one context vector corresponding to the word sequence; and the second acquisition module is used for decoding the context vector according to the first preset model to generate a decoding word sequence and acquiring a text title corresponding to the target text according to the decoding word sequence.
To achieve the above object, a third aspect of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the text title generation method described in the above embodiments.
To achieve the above object, a fourth aspect of the present application provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the text title generation method described in the above embodiments.
One embodiment in the above application has the following advantages or benefits:
extracting a word vector of each word in a word sequence contained in a target text, coding the word vector according to a first preset model to generate a word coding vector of each word, calculating a first attention probability corresponding to the word coding vector of each word according to the first preset model, calculating the word coding vector of each word according to a second preset model to generate a second attention probability corresponding to the word coding vector of each word, further calculating the word coding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence, finally decoding the context vector according to the first preset model to generate a decoded word sequence, and obtaining a text title corresponding to the target text according to the decoded word sequence. Therefore, the text title can be accurately acquired, and the generated title is highly associated with the content of the target text.
Other effects of the above alternatives will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flow diagram of a text headline generation method according to one embodiment of the present application;
FIG. 2 is a schematic diagram illustrating an application flow of a first predetermined model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a flow chart of an application of a first preset model and a second preset model according to an embodiment of the present application;
FIG. 4-1 is a flow diagram illustrating a context vector calculation according to an embodiment of the present application;
FIG. 4-2 is a flow diagram illustrating context vector calculation according to another embodiment of the present application;
fig. 5 is a schematic structural diagram of a text title generating device according to an embodiment of the present application;
fig. 6 is a block diagram of an electronic device for implementing a text title generation method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of those embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A text title generation method and apparatus of an embodiment of the present application are described below with reference to the accompanying drawings.
In order to solve the problems that generated text titles are monotonous and weakly related to the text content, the application provides an optimized way of generating text titles: content is extracted from the text, and the words that could form a title are evaluated accurately based on the words actually provided in the text content. This guarantees the relevance of the generated title to the text content, and, because the corresponding title is extracted from the text content itself, the generated titles are no longer monotonous.
Specifically, fig. 1 is a flowchart of a text title generating method according to an embodiment of the present application, and as shown in fig. 1, the method includes:
step 101, extracting a word vector of each word in a word sequence contained in the target text, and encoding the word vector according to a first preset model to generate a word encoding vector of each word.
Specifically, the target text may be understood as an article for which a title is to be generated. A word sequence contained in the target text is extracted: for example, the target text is segmented to generate a plurality of segmented words, and the word sequence is obtained from the front-to-back order of these segmented words. Further, the word vector of each segmented word is extracted, and the word vectors are encoded according to a first preset model to generate a word encoding vector of each word.
The first preset model may be a Sequence-to-Sequence model (seq2seq for short) or a model built with a Pointer Generator.
Referring to fig. 2, the conventional seq2seq model combines the hidden state of the encoder and the hidden state of the decoder into an intermediate vector C through the attention mechanism, then uses the decoder to decode and predict, finally obtains through the softmax layer the probability distribution of each segmented word in the vocabulary possibly being a word in the title, and selects the word with the highest probability as the prediction result for the current title position.
The Pointer-Generator model adds a Pointer Networks component on top of the above process. The idea behind applying Pointer Networks is very intuitive: use them to copy words from the source target text. Briefly, at each prediction step, the probability distribution of vocabulary words becoming words of the title is obtained through the conventional seq2seq prediction (i.e., the result of the softmax layer), and the probability distribution of words contained in the word sequence of the input target text becoming words of the title is obtained through the Pointer Networks. Combining the two yields a probability distribution over both the segmented words of the input target text and the prediction vocabulary (in the histogram of the final result, the word "2-0" is not in the prediction vocabulary and comes from the input text), so the model can copy some segmented words directly from the input target text into the output title. This ensures that the generated title is strongly correlated with the text content of the target text.
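For illustration, the following Python sketch shows the usual pointer-generator combination step under assumed names and shapes (vocab_probs, attention_probs, p_gen and src_token_ids are illustrative, not symbols taken from the patent); it mixes the vocabulary distribution with the copy distribution so that source words can be copied into the title:

    import numpy as np

    def combine_distributions(vocab_probs, attention_probs, p_gen, src_token_ids):
        # vocab_probs:     softmax output over the (extended) vocabulary
        # attention_probs: attention weights over the source word sequence
        # p_gen:           scalar in [0, 1], probability of generating from the vocabulary
        # src_token_ids:   vocabulary id of each word in the source word sequence
        final_probs = p_gen * np.asarray(vocab_probs, dtype=np.float64)
        # Scatter-add the copy distribution: each source position contributes
        # (1 - p_gen) * its attention weight to the word at that position, so the
        # model can copy segmented words directly from the input target text.
        np.add.at(final_probs, np.asarray(src_token_ids),
                  (1.0 - p_gen) * np.asarray(attention_probs, dtype=np.float64))
        return final_probs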
In actual operation, in order to ensure that the generated word encoding vector of each word reflects the correlation between the words of the word sequence, the encoding vector of each word may be generated from the adjacent segmented words.
In an embodiment of the present application, it is judged whether each word is the first word in the word sequence. If the word is the first word, a word encoding vector corresponding to the first word is generated in a default manner; if the word is not the first word, the word encoding vector of the word preceding the current word is obtained, and the encoding vector of the preceding word is spliced with the word vector of the current word to generate the word encoding vector of the current word.
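A minimal sketch of this sequential encoding, assuming a generic encode_step function and a default initial encoding (both hypothetical names), could look as follows:

    import numpy as np

    def encode_word_sequence(word_vectors, encode_step, init_encoding):
        # word_vectors:  word vector of each segmented word, in sequence order
        # encode_step:   hypothetical function mapping a spliced vector to a word encoding vector
        # init_encoding: default encoding used for the first word of the sequence
        encodings = []
        for i, word_vec in enumerate(word_vectors):
            if i == 0:
                prev_encoding = init_encoding   # first word: default generation manner
            else:
                prev_encoding = encodings[-1]   # not the first word: previous word's encoding
            # Splice the previous word's encoding vector with the current word vector.
            encodings.append(encode_step(np.concatenate([prev_encoding, word_vec])))
        return encodings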
Step 102, calculating a first attention probability corresponding to the word code vector of each word according to a first preset model.
Specifically, the first preset model may include an attention layer, and the attention layer may calculate, from the word encoding vector of each word, the first attention probability corresponding to that word, where the first attention probability may be the probability that the corresponding segmented word becomes a word in the title.
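The exact form of the attention layer is not specified above; a common choice is additive attention, sketched here as an assumption with hypothetical parameters W_h, W_s and v:

    import numpy as np

    def softmax(scores):
        scores = np.asarray(scores, dtype=np.float64)
        exp = np.exp(scores - scores.max())
        return exp / exp.sum()

    def first_attention_probs(word_encodings, decoder_state, W_h, W_s, v):
        # Additive attention over the word encoding vectors; W_h, W_s and v are
        # assumed learnable parameters of the attention layer, and decoder_state
        # is the current hidden state of the decoding end.
        scores = [v @ np.tanh(W_h @ h + W_s @ decoder_state) for h in word_encodings]
        return softmax(scores)   # one probability per segmented word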
And 103, calculating the word code vector of each word according to a second preset model, and generating a second attention probability corresponding to the word code vector of each word.
It is to be understood that, in order to further optimize the generated title, the present application introduces a second preset model. As shown in fig. 3, the second preset model also operates on the word encoding vector of each word and generates a second attention probability corresponding to the word encoding vector of each word, where the second attention probability represents the probability that a word in the target text becomes a word in the title. The second attention probability acts together with the first attention probability to provide a strong reference for the final title determination.
As a possible implementation, the second preset model may be a deep learning model obtained by training on sample data that is not exactly the same as that of the first preset model.
As another possible implementation, the second preset model may be a fully-connected network model, whose formula is shown in formula (1):
p = U(tanh(Wx + b))    (1)
where x is the word encoding vector of each word, p is the second attention probability, and W, U and b are model learning parameters.
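Read literally, formula (1) can be sketched per word as below; this is an illustrative reading, and any further normalization of p is left out because it is not stated above:

    import numpy as np

    def second_attention_probs(word_encodings, W, U, b):
        # Formula (1) applied to every word encoding vector x: p = U(tanh(Wx + b)).
        # W, U and b are model learning parameters; any further squashing of p
        # (softmax or sigmoid) is not specified above and is omitted here.
        return np.array([U @ np.tanh(W @ x + b) for x in word_encodings])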
It should be emphasized that the second preset model is trained jointly with the first preset model. In the training stage, a loss value of the combined output of the first preset model and the second preset model is calculated (for example, according to the similarity between the selection probabilities of each segmented word, namely the first attention probability and the second attention probability, and the finally selected segmented words), and in this loss-calculation stage the second preset model is jointly optimized.
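The joint optimization is only outlined above; one plausible reading, sketched here purely as an assumption, is a negative log-likelihood of the finally selected words under the combined output distribution, back-propagated through both preset models:

    import numpy as np

    def joint_loss(final_probs_per_step, target_ids):
        # final_probs_per_step: combined distribution (first and second preset model)
        #                       at each decoding step
        # target_ids:           ids of the finally selected words of the reference title
        eps = 1e-12
        log_likelihoods = [np.log(final_probs_per_step[t][target_ids[t]] + eps)
                           for t in range(len(target_ids))]
        return -float(np.mean(log_likelihoods))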
And 104, calculating the word coding vector of each word according to the first attention probability and the second attention probability, and acquiring at least one context vector corresponding to the word sequence.
Specifically, after the first attention probability and the second attention probability are obtained, a word encoding vector of each word is calculated by combining the first attention probability and the second attention probability, and at least one context vector corresponding to the word sequence is obtained, wherein the context vector can be understood as a result output by the attention layer.
It should be noted that, in different application scenarios, the word encoding vector of each word is calculated differently according to the first attention probability and the second attention probability, which is exemplified as follows:
As a possible example, the difference between the first attention probability and the second attention probability of each word is calculated to obtain the target attention probability of each word. The word encoding vector of each word is then calculated according to the target attention probability to obtain a candidate context vector corresponding to the word encoding vector of each word, and at least one context vector corresponding to the word sequence is obtained from the candidate context vectors of all the words. For example, all the candidate context vectors may be used as the finally output context vectors; or, as shown in fig. 4-1, each candidate context vector is used as an input parameter for the candidate context vector of the next word and a total context vector is calculated jointly as the output context vector; or, as shown in fig. 4-2, each candidate context vector is used as an input parameter for the candidate context vector of the next word and the candidate context vector calculated at each step is output, these forming the at least one context vector.
As another possible implementation, the arithmetic mean of the first attention probability and the second attention probability of each word is calculated to obtain the target attention probability of each word, the target attention probability of each word is superimposed on the attention layer of the first preset model, and at least one context vector corresponding to the word sequence is obtained.
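Both combination strategies reduce to merging the two attention probabilities and taking a weighted sum of the word encoding vectors; the sketch below covers the difference-based example and the arithmetic-mean variant, with the renormalization step added as an assumption:

    import numpy as np

    def target_attention_probs(first_probs, second_probs, mode="difference"):
        first_probs = np.asarray(first_probs, dtype=np.float64)
        second_probs = np.asarray(second_probs, dtype=np.float64)
        if mode == "difference":
            merged = first_probs - second_probs          # first possible example
        else:
            merged = (first_probs + second_probs) / 2.0  # arithmetic-mean variant
        merged = np.clip(merged, 0.0, None)
        return merged / (merged.sum() + 1e-12)           # renormalization is an assumption

    def context_vector(word_encodings, target_probs):
        # Weighted sum of the word encoding vectors, i.e. the attention read-out.
        return sum(p * h for p, h in zip(target_probs, word_encodings))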
And 105, decoding the context vector according to the first preset model to generate a decoding word sequence, and acquiring a text title corresponding to the target text according to the decoding word sequence.
Specifically, it may be understood that the first preset model includes a decoding end, the context vector is decoded according to the first preset model to generate a decoded word sequence, and a text title corresponding to the target text is obtained according to the decoded word sequence.
In an embodiment of the present application, in order to ensure the integrity of the generated title, after the title is obtained it is judged whether the decoded word sequence satisfies a preset title condition. If it does not, the decoded word sequence is modified according to the title condition, and the modified decoded word sequence is taken as the text title.
The preset title condition differs in different application scenarios:
as a possible implementation manner, whether the end of the decoded word sequence is an end punctuation is judged, if not, an end clause in the decoded word sequence is extracted, the clause can be broken through recognition of a segmentation symbol, or the clause can be broken through a semantic recognition mode, the end clause is matched with a target text, whether the target text contains a target clause which is successfully matched is judged, if the end clause is an original sentence in the target text, the end clause is considered as a constituent part of a title, and if the target text does not contain the target clause, the decoded word sequence is determined not to meet a preset title condition, and the end clause in the decoded word sequence is deleted. In this embodiment, to avoid the erroneous deletion of a clause in the title, for example, if the title itself is a clause without punctuation, before determining whether the end of the decoded word sequence is an end punctuation, it may be determined whether the number of words included in the decoded word sequence is greater than a preset threshold, and only if the number of words included in the decoded word sequence is greater than the preset threshold, it is further determined whether the end of the decoded word sequence is an end punctuation, otherwise, the decoded word sequence is not processed.
As another possible implementation, it is judged whether the decoded word sequence contains repeated words, such as word sequences of the form AA, ABAB or ABCABC, where each letter in these examples represents a word. If such a repetition is found, it is judged whether the repeated word-sequence segment appears in the article; if it does not, the repeated segment is deleted directly or deduplicated.
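A simple way to detect and deduplicate such repeated segments, with an assumed upper bound on the segment length, could look like this:

    def dedup_repeated_segment(decoded_words, target_text, max_segment_len=4):
        # decoded_words: the decoded word sequence as a list of segmented words
        for seg_len in range(1, max_segment_len + 1):
            for start in range(len(decoded_words) - 2 * seg_len + 1):
                seg = decoded_words[start:start + seg_len]
                nxt = decoded_words[start + seg_len:start + 2 * seg_len]
                if seg == nxt and "".join(seg + nxt) not in target_text:
                    # The repetition does not appear in the article: keep one copy only.
                    return decoded_words[:start + seg_len] + decoded_words[start + 2 * seg_len:]
        return decoded_words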
Therefore, by introducing the second preset model, the text title generating method of this embodiment takes into account, based on the word vector encoding of the target text, the attention probability of the words in the target text that may form a title. This ensures that the title content can, as far as possible, be found directly in the target text, which guarantees both the fluency of the generated title and its relevance to the target text.
To sum up, the text title generating method according to the embodiment of the present application extracts a word vector of each word in a word sequence included in a target text, encodes the word vector according to a first preset model to generate a word encoding vector of each word, calculates a first attention probability corresponding to the word encoding vector of each word according to the first preset model, calculates a word encoding vector of each word according to a second preset model to generate a second attention probability corresponding to the word encoding vector of each word, calculates the word encoding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence, finally decodes the context vector according to the first preset model to generate a decoded word sequence, and obtains a text title corresponding to the target text according to the decoded word sequence. Therefore, the text title can be accurately acquired, and the generated title is high in content association with the target text.
In order to implement the foregoing embodiments, the present application further provides a text title generating apparatus, fig. 5 is a schematic structural diagram of the text title generating apparatus according to an embodiment of the present application, and as shown in fig. 5, the text title generating apparatus includes: a first generation module 10, a calculation module 20, a second generation module 30, a first acquisition module 40 and a second acquisition module 50, wherein,
the first generation module 10 is configured to extract a word vector of each word in a word sequence included in the target text, and encode the word vector according to a first preset model to generate a word encoding vector of each word;
a calculating module 20, configured to calculate a first attention probability corresponding to the word coding vector of each word according to a first preset model;
the second generating module 30 is configured to calculate a word code vector of each word according to a second preset model, and generate a second attention probability corresponding to the word code vector of each word;
a first obtaining module 40, configured to calculate a word encoding vector of each word according to the first attention probability and the second attention probability, and obtain at least one context vector corresponding to the word sequence;
and a second obtaining module 50, configured to decode the context vector according to the first preset model to generate a decoded word sequence, and obtain a text title corresponding to the target text according to the decoded word sequence.
In one embodiment of the present application, the first generating module 10 is specifically configured to:
judging whether each word is the first word in the word sequence;
if the word is not the first word, acquiring a word coding vector of a previous word of the current word;
and splicing the word code vector of the previous word and the word vector of the current word to generate the word code vector of the current word.
In an embodiment of the present application, the first obtaining module 40 is specifically configured to:
calculating the difference between the first attention probability and the second attention probability of each word, and acquiring the target attention probability of each word;
calculating the word code vector of each word according to the target attention probability, and acquiring a candidate context vector corresponding to the word code vector of each word;
and acquiring at least one context vector corresponding to the word sequence according to the candidate context vectors of all the words.
In an embodiment of the present application, the second obtaining module 50 is specifically configured to:
judging whether the decoded word sequence meets a preset title condition or not;
if the title condition is not met, modifying the decoding word sequence according to the title condition;
and acquiring the modified decoding word sequence as a text title.
It should be noted that the foregoing explanation of the text title generation method is also applicable to the text title generating apparatus of the embodiment of the present application; the implementation principle is similar and is not repeated herein.
To sum up, the text title generating apparatus according to the embodiment of the present application extracts a word vector of each word in a word sequence included in a target text, encodes the word vector according to a first preset model to generate a word encoding vector of each word, calculates a first attention probability corresponding to the word encoding vector of each word according to the first preset model, calculates a word encoding vector of each word according to a second preset model to generate a second attention probability corresponding to the word encoding vector of each word, calculates the word encoding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence, and finally decodes the context vector according to the first preset model to generate a decoded word sequence and obtains a text title corresponding to the target text according to the decoded word sequence. Therefore, the text title can be accurately acquired, and the generated title is highly associated with the content of the target text.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device for the text title generation method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 6, the electronic apparatus includes: one or more processors 601, memory 602, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor 601 is taken as an example.
The memory 602 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory 602, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of text title generation in the embodiments of the present application. The processor 601 executes various functional applications of the server and data processing, i.e., a method of text title generation in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 602.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 602 optionally includes memory located remotely from the processor 601, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device performing the method of text title generation may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603, and the output device 604 may be connected by a bus or other means, and are exemplified by being connected by a bus in fig. 6.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 604 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
In order to achieve the above embodiments, the present application also proposes a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the text title generating method in the above embodiments.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders; the present application is not limited in this respect, as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments are not intended to limit the scope of the present disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A method for generating a text title, comprising:
extracting a word vector of each word in a word sequence contained in a target text, and coding the word vector according to a first preset model to generate a word coding vector of each word;
calculating a first attention probability corresponding to the word code vector of each word according to the first preset model;
calculating the word code vector of each word according to a second preset model, and generating a second attention probability corresponding to the word code vector of each word;
calculating the word coding vector of each word according to the first attention probability and the second attention probability to obtain at least one context vector corresponding to the word sequence;
decoding the context vector according to the first preset model to generate a decoding word sequence, and acquiring a text title corresponding to the target text according to the decoding word sequence.
2. The method of claim 1, wherein said encoding said word vector according to a first predetermined model to generate a word encoding vector for said each word comprises:
judging whether each word is the first word in the word sequence;
if the word is not the first word, acquiring a word coding vector of a previous word of the current word;
and splicing the word code vector of the previous word and the word vector of the current word to generate the word code vector of the current word.
3. The method of claim 1, wherein said obtaining at least one context vector corresponding to the sequence of words based on the first attention probability and the second attention probability comprises:
calculating the difference between the first attention probability and the second attention probability of each word, and acquiring the target attention probability of each word;
calculating the word coding vector of each word according to the target attention probability to obtain a candidate context vector corresponding to the word coding vector of each word;
and acquiring at least one context vector corresponding to the word sequence according to the candidate context vectors of all the words.
4. The method of claim 1, wherein said obtaining a text header corresponding to the target text from the sequence of decoded words comprises:
judging whether the decoded word sequence meets a preset title condition or not;
if the title condition is not met, modifying the decoding word sequence according to the title condition;
and acquiring the modified decoding word sequence as the text title.
5. The method of claim 4, wherein said determining whether said sequence of decoded words satisfies a predetermined heading condition comprises:
judging whether the end of the decoding word sequence is an end punctuation;
if the decoded word sequence is not the end punctuation, extracting the tail clause in the decoded word sequence;
matching the last clause with the target text, and judging whether the target text contains a successfully matched target clause;
if the target clause is not included, determining that the decoding word sequence does not meet the preset title condition;
if the title condition is not satisfied, modifying the decoded word sequence according to the title condition, including:
and deleting the tail clause in the decoding word sequence.
6. The method of claim 1, wherein the second predetermined model formula is:
p = U(tanh(Wx + b))
wherein x is the word encoding vector of each word, p is the second attention probability, and W, U, b are model learning parameters.
7. A text title generation apparatus, comprising:
the first generation module is used for extracting a word vector of each word in a word sequence contained in the target text and generating a word coding vector of each word by coding the word vector according to a first preset model;
the calculation module is used for calculating a first attention probability corresponding to the word coding vector of each word according to the first preset model;
the second generation module is used for calculating the word coding vector of each word according to a second preset model and generating a second attention probability corresponding to the word coding vector of each word;
a first obtaining module, configured to calculate a word encoding vector of each word according to the first attention probability and the second attention probability, and obtain at least one context vector corresponding to the word sequence;
and the second obtaining module is used for decoding the context vector according to the first preset model to generate a decoding word sequence and obtaining a text title corresponding to the target text according to the decoding word sequence.
8. The apparatus of claim 7, wherein the first generating module is specifically configured to:
judging whether each word is the first word in the word sequence;
if the word is not the first word, acquiring a word coding vector of a previous word of the current word;
and splicing the word code vector of the previous word and the word vector of the current word to generate the word code vector of the current word.
9. The apparatus of claim 7, wherein the first obtaining module is specifically configured to:
calculating the difference between the first attention probability and the second attention probability of each word, and acquiring the target attention probability of each word;
calculating the word coding vector of each word according to the target attention probability to obtain a candidate context vector corresponding to the word coding vector of each word;
and acquiring at least one context vector corresponding to the word sequence according to the candidate context vectors of all the words.
10. The apparatus of claim 7, wherein the second obtaining module is specifically configured to:
judging whether the decoded word sequence meets a preset title condition or not;
if the title condition is not met, modifying the decoding word sequence according to the title condition;
and acquiring the modified decoding word sequence as the text title.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the text title generation method of any one of claims 1-6.
12. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the text title generating method according to any one of claims 1 to 6.
CN201911361394.2A 2019-12-24 2019-12-24 Text title generation method and device Active CN111078865B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911361394.2A CN111078865B (en) 2019-12-24 2019-12-24 Text title generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911361394.2A CN111078865B (en) 2019-12-24 2019-12-24 Text title generation method and device

Publications (2)

Publication Number Publication Date
CN111078865A CN111078865A (en) 2020-04-28
CN111078865B true CN111078865B (en) 2023-02-21

Family

ID=70317869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911361394.2A Active CN111078865B (en) 2019-12-24 2019-12-24 Text title generation method and device

Country Status (1)

Country Link
CN (1) CN111078865B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723295B (en) * 2020-06-30 2023-10-17 腾讯科技(深圳)有限公司 Content distribution method, device and storage medium
CN112163405A (en) * 2020-09-08 2021-01-01 北京百度网讯科技有限公司 Question generation method and device
CN113392639B (en) * 2020-09-30 2023-09-26 腾讯科技(深圳)有限公司 Title generation method, device and server based on artificial intelligence
CN112560437B (en) * 2020-12-25 2024-02-06 北京百度网讯科技有限公司 Text smoothness determining method, target model training method and device
CN114118076A (en) * 2021-01-15 2022-03-01 北京沃东天骏信息技术有限公司 Text generation method and device, electronic equipment and computer readable medium
CN113642296A (en) * 2021-08-27 2021-11-12 杭州网易智企科技有限公司 Text generation method, medium, device and electronic equipment
CN114363664A (en) * 2021-12-31 2022-04-15 北京达佳互联信息技术有限公司 Method and device for generating video collection title
CN115130470B (en) * 2022-08-25 2022-11-22 材料科学姑苏实验室 Method, device, equipment and medium for generating text keywords
CN115693918B (en) * 2022-09-07 2023-08-18 浙江心友机电设备安装有限公司 Comprehensive intelligent electricity utilization system and method for building

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108984524A (en) * 2018-07-05 2018-12-11 北京理工大学 A kind of title generation method based on variation neural network topic model
US10402495B1 (en) * 2016-09-01 2019-09-03 Facebook, Inc. Abstractive sentence summarization
CN110209801A (en) * 2019-05-15 2019-09-06 华南理工大学 A kind of text snippet automatic generation method based on from attention network
CN110390103A (en) * 2019-07-23 2019-10-29 中国民航大学 Short text auto-abstracting method and system based on Dual-encoder
CN110532560A (en) * 2019-08-30 2019-12-03 海南车智易通信息技术有限公司 A kind of method and calculating equipment of generation text header
CN110532554A (en) * 2019-08-26 2019-12-03 南京信息职业技术学院 A kind of Chinese abstraction generating method, system and storage medium
EP3582114A1 (en) * 2018-06-15 2019-12-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Method of training a descriptive text generating model, and method and apparatus for generating descriptive text

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8346754B2 (en) * 2008-08-19 2013-01-01 Yahoo! Inc. Generating succinct titles for web URLs

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10402495B1 (en) * 2016-09-01 2019-09-03 Facebook, Inc. Abstractive sentence summarization
EP3582114A1 (en) * 2018-06-15 2019-12-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Method of training a descriptive text generating model, and method and apparatus for generating descriptive text
CN108984524A (en) * 2018-07-05 2018-12-11 北京理工大学 A kind of title generation method based on variation neural network topic model
CN110209801A (en) * 2019-05-15 2019-09-06 华南理工大学 A kind of text snippet automatic generation method based on from attention network
CN110390103A (en) * 2019-07-23 2019-10-29 中国民航大学 Short text auto-abstracting method and system based on Dual-encoder
CN110532554A (en) * 2019-08-26 2019-12-03 南京信息职业技术学院 A kind of Chinese abstraction generating method, system and storage medium
CN110532560A (en) * 2019-08-30 2019-12-03 海南车智易通信息技术有限公司 A kind of method and calculating equipment of generation text header

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A survey of title generation methods based on deep learning; Jiang Min; Information & Computer (Theoretical Edition), No. 18; full text *
A news headline generation model combined with an attention mechanism; Li Hui et al.; Journal of Shanxi University (Natural Science Edition), No. 4; full text *

Also Published As

Publication number Publication date
CN111078865A (en) 2020-04-28

Similar Documents

Publication Publication Date Title
CN111078865B (en) Text title generation method and device
CN111709248B (en) Training method and device for text generation model and electronic equipment
CN112560912B (en) Classification model training method and device, electronic equipment and storage medium
KR102532396B1 (en) Data set processing method, device, electronic equipment and storage medium
CN110674314B (en) Sentence recognition method and device
CN111274764B (en) Language generation method and device, computer equipment and storage medium
CN112489637B (en) Speech recognition method and device
CN110797005B (en) Prosody prediction method, apparatus, device, and medium
CN111177355B (en) Man-machine conversation interaction method and device based on search data and electronic equipment
CN112633017B (en) Translation model training method, translation processing method, translation model training device, translation processing equipment and storage medium
CN111079945B (en) End-to-end model training method and device
CN113553414B (en) Intelligent dialogue method, intelligent dialogue device, electronic equipment and storage medium
CN111666759B (en) Extraction method and device of text key information, electronic equipment and storage medium
CN111950291A (en) Semantic representation model generation method and device, electronic equipment and storage medium
CN111709234A (en) Training method and device of text processing model and electronic equipment
CN112507702B (en) Text information extraction method and device, electronic equipment and storage medium
CN111737954A (en) Text similarity determination method, device, equipment and medium
CN112163405A (en) Question generation method and device
CN112270198A (en) Role determination method and device, electronic equipment and storage medium
CN112560499A (en) Pre-training method and device of semantic representation model, electronic equipment and storage medium
CN110767212B (en) Voice processing method and device and electronic equipment
CN112528605A (en) Text style processing method and device, electronic equipment and storage medium
CN111539224A (en) Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111737966A (en) Document repetition degree detection method, device, equipment and readable storage medium
JP7121791B2 (en) Language generation method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant