CN113901174A - Text abstract extraction method and device - Google Patents

Text abstract extraction method and device

Info

Publication number
CN113901174A
Authority
CN
China
Prior art keywords
sentence
vector
text
neural network
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111186594.6A
Other languages
Chinese (zh)
Inventor
李清
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202111186594.6A
Publication of CN113901174A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text abstract extraction method and device, relating to the technical field of natural language processing. One embodiment of the method comprises: converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network to output a semantic representation vector of each sentence; converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence; inputting the semantic representation vector and the structure representation vector of each sentence into a semantic structure convolution layer to output a text content vector; and inputting the text content vector into a third recurrent neural network to output a target sentence as the text abstract. This embodiment can solve the technical problem that an extractive text abstract does not conform to the central meaning of the text.

Description

Text abstract extraction method and device
Technical Field
The invention relates to the technical field of natural language processing, in particular to a text abstract extracting method and device.
Background
The flood of information creates bottlenecks in information understanding, information analysis, and related tasks, so extracting information with core meaning from massive data has become a major challenge on today's internet.
At present, text abstract extraction is the main way to extract information from mass data, and it falls into two categories: extractive and generative (abstractive). Extractive summarization directly selects core sentences that represent the text from the text itself, so an abstract extracted in this way conforms better to human language conventions, is more readable, and reads more fluently.
Many algorithms are currently used for extractive summarization, such as k-means clustering, the TextRank algorithm, and neural network approaches. Sentences extracted by these algorithms often fail to match the central meaning of the text. This is especially true for Chinese information extraction: because Chinese has complex grammatical logic and text structures, Chinese abstract extraction is more difficult.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for extracting a text abstract, so as to solve the technical problem that an extractive text abstract does not conform to the central meaning of the text.
In order to achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a text summarization extraction method, including:
converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network to output a semantic representation vector of each sentence;
converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence;
inputting the semantic representation vector and the structural representation vector of each sentence into a semantic structure convolution layer to output a text content vector;
and inputting the text content vector into a third recurrent neural network to output a target sentence as a text abstract.
Optionally, converting each sentence in the text into a grammar structure vector includes:
for each sentence in the text, calculating the number of entities in the sentence, and generating a grammar vector of the sentence according to the number of entities in the sentence;
for each sentence in the text, calculating a distance between the sentence and a baseline sentence, thereby generating a structure vector of the sentence according to the distance between the sentence and the baseline sentence;
and inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence.
Optionally, calculating the distance between the sentence and the baseline sentence comprises:
sequentially numbering each sentence in the text according to a positive sequence, and counting the number of the sentences in the text to be N;
and taking the sentence with the number of N/2 as a baseline sentence, and calculating the difference value between the number of the sentence and the number of the baseline sentence.
Optionally, converting each sentence in the text into a semantic vector, including:
and converting each sentence in the text into a semantic vector by adopting a BERT model.
Optionally, inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence, including:
inputting the grammar structure vector of each sentence into a first proofreading layer to output a proofread grammar structure vector of each sentence; wherein the first proofreading layer proofreads the grammar structure vector of each sentence by using a disparity vector;
and inputting the proofread grammar structure vector of each sentence into the second recurrent neural network to output the structure representation vector of each sentence.
Optionally, proofreading the grammar structure vector of each sentence by using the disparity vector includes:
multiplying the grammar structure vector of each sentence by the disparity vector element by element (bit by bit).
Optionally, inputting the text content vector into a third recurrent neural network to output a target sentence as a text summary, including:
inputting the text content vector into a second proofreading layer to output a proofread text content vector; wherein the second proofreading layer proofreads the text content vector by using the disparity vector;
and inputting the proofread text content vector into a third recurrent neural network to output a target sentence as a text abstract.
Optionally, proofreading the text content vector by using the disparity vector includes:
multiplying the text content vector by the disparity vector element by element (bit by bit).
Optionally, before converting each sentence in the text into a semantic vector, the method further includes:
converting the abstract sentences of the text into semantic vectors by adopting a BERT model;
and calculating a difference degree vector between the semantic vector of the abstract sentence and the semantic vector of the target sentence output by the third recurrent neural network.
Optionally, the first recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the second recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the third recurrent neural network is a unidirectional long short-term memory neural network.
In addition, according to another aspect of the embodiments of the present invention, there is provided a text summarization extracting apparatus including:
the first representation module is used for converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network so as to output the semantic representation vector of each sentence;
the second representation module is used for converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network so as to output the structure representation vector of each sentence;
the text content module is used for inputting the semantic representation vector and the structural representation vector of each sentence into the semantic structure convolution layer so as to output a text content vector;
and the extraction module is used for inputting the text content vector into a third recurrent neural network so as to output a target sentence as a text abstract.
Optionally, the second characterization module is further configured to:
for each sentence in the text, calculating the number of entities in the sentence, and generating a grammar vector of the sentence according to the number of entities in the sentence;
for each sentence in the text, calculating a distance between the sentence and a baseline sentence, thereby generating a structure vector of the sentence according to the distance between the sentence and the baseline sentence;
and inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence.
Optionally, the second characterization module is further configured to:
sequentially numbering each sentence in the text according to a positive sequence, and counting the number of the sentences in the text to be N;
and taking the sentence with the number of N/2 as a baseline sentence, and calculating the difference value between the number of the sentence and the number of the baseline sentence.
Optionally, the first characterization module is further configured to:
and converting each sentence in the text into a semantic vector by adopting a BERT model.
Optionally, the second characterization module is further configured to:
inputting the grammar structure vector of each sentence into a first proofreading layer to output a proofread grammar structure vector of each sentence; wherein the first proofreading layer proofreads the grammar structure vector of each sentence by using the disparity vector;
and inputting the proofread grammar structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence.
Optionally, the second characterization module is further configured to:
and multiplying the grammar structure vector of each sentence by the disparity vector element by element (bit by bit).
Optionally, the extraction module is further configured to:
inputting the text content vector into a second proofreading layer to output a proofread text content vector; wherein the second proofreading layer proofreads the text content vector by using the disparity vector;
and inputting the proofread text content vector into a third recurrent neural network to output a target sentence as a text abstract.
Optionally, the extraction module is further configured to:
and multiplying the text content vector by the disparity vector element by element (bit by bit).
Optionally, the apparatus further comprises a disparity module, configured to:
before each sentence in the text is converted into a semantic vector, converting the abstract sentence of the text into the semantic vector by adopting a BERT model;
and calculating a difference degree vector between the semantic vector of the abstract sentence and the semantic vector of the target sentence output by the third recurrent neural network.
Optionally, the first recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the second recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the third recurrent neural network is a unidirectional long short-term memory neural network.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of the embodiments described above.
According to another aspect of the embodiments of the present invention, there is also provided a computer readable medium, on which a computer program is stored, which when executed by a processor implements the method of any of the above embodiments.
One embodiment of the above invention has the following advantages or benefits. Each sentence in the text is converted into a semantic vector and a syntactic structure vector; these are input into the first and second recurrent neural networks respectively to output each sentence's semantic representation vector and structure representation vector; a text content vector is then produced by the semantic structure convolution layer; and finally the text abstract is output by the third recurrent neural network. This technical means overcomes the prior-art problem that an extractive text abstract does not conform to the central meaning of the text.
The embodiment of the invention takes the grammar and text structure of Chinese sentences into account in the design of the text abstract extraction model, converting each sentence into both a semantic vector and a syntactic structure vector, so the target sentence that conforms to the central meaning of the text can be extracted accurately.
In addition, the embodiment calculates the disparity vector between the abstract sentence and the target sentence and feeds it back to the proofreading layers for parameter training of the entire text abstract extraction model, so the whole model adjusts the same parameters consistently. With the proofreading layers added, the extracted information is kept as close as possible to the most similar sentences: the model prefers sentences with higher similarity and selects the most similar one, so sentence errors are minimized at the proofreading layer.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a main flow of a text summarization extraction method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a text summarization extraction model of an extraction phase according to an embodiment of the invention;
FIG. 3 is a schematic structural diagram of a text summarization extraction model in a training phase according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a main flow of a text summarization method according to a referential embodiment of the present invention;
FIG. 5 is a diagram illustrating the main modules of a text summarization extracting apparatus according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Analysis of the current domestic situation shows that most text abstract extraction algorithms are studied on English data sets, ignoring the particularities of Chinese scenarios, so these models perform poorly when applied directly to Chinese. The embodiments of the present invention therefore design an algorithm specifically for Chinese. Mainly targeting the technical problem that an extractive text abstract does not conform to the central meaning of the text, the embodiments of the invention provide a text abstract extraction method.
Fig. 1 is a schematic diagram of a main flow of a text summarization extraction method according to an embodiment of the present invention. As an embodiment of the present invention, as shown in fig. 1, the text summarization extraction method may include:
Step 101: converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network to output a semantic representation vector of each sentence.
First, a sentence segmentation tool is used to split the text into sentences. Each sentence is then converted into a semantic vector, and the semantic vector of each sentence is input into a first recurrent neural network, which learns from the semantic vectors and outputs a semantic representation vector for each sentence.
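As a minimal sketch of the sentence-splitting step (the patent does not name a particular segmentation tool, so the punctuation rule below is an assumption), a Chinese text can be split on sentence-final punctuation:

```python
import re

def split_sentences(text: str) -> list[str]:
    # Split after Chinese/Western sentence-ending punctuation, keeping
    # the delimiter attached to its sentence; drop empty fragments.
    parts = re.split(r'(?<=[。！？!?])', text)
    return [p.strip() for p in parts if p.strip()]

sentences = split_sentences("今天天气好。我们出去玩！好的？")
# Each element is one sentence ready for vector conversion.
```

A production system would likely handle quotation marks and ellipses as well; this sketch only shows where segmentation fits in the pipeline.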
Optionally, as shown in fig. 2, converting each sentence in the text into a semantic vector includes: converting each sentence in the text into a semantic vector using a BERT model. After the text is split into sentences, the sentences can be converted into semantic vectors by the BERT model and then input into the first recurrent neural network. Optionally, the first recurrent neural network is a bidirectional recurrent neural network; learning the semantic vectors of the sentences bidirectionally helps improve the representation capability of the semantic vectors. Optionally, the first recurrent neural network is a bidirectional long short-term memory recurrent neural network (bidirectional LSTM), whose bidirectional learning helps accurately output the semantic representation vector of each sentence.
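To illustrate the shape of this step, the sketch below stands in for the bidirectional LSTM with a plain tanh recurrent cell run forward and backward and concatenated. The cell type is a simplification (the embodiment uses an LSTM) and the random weights are untrained placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def bidirectional_rnn(X, hidden=8):
    """Run a vanilla tanh RNN over the sentence vectors in both directions
    and concatenate the per-position hidden states, mimicking the shape of
    a bidirectional LSTM encoder. X has shape (num_sentences, embed_dim)."""
    d = X.shape[1]
    Wf, Uf = rng.normal(0, 0.1, (hidden, d)), rng.normal(0, 0.1, (hidden, hidden))
    Wb, Ub = rng.normal(0, 0.1, (hidden, d)), rng.normal(0, 0.1, (hidden, hidden))

    def run(seq, W, U):
        h, out = np.zeros(hidden), []
        for x in seq:
            h = np.tanh(W @ x + U @ h)   # simple recurrent update (no gates)
            out.append(h)
        return out

    fwd = run(X, Wf, Uf)
    bwd = run(X[::-1], Wb, Ub)[::-1]     # backward pass, re-reversed to align
    return np.array([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

sent_vecs = rng.normal(size=(5, 16))     # e.g. 5 sentences, 16-dim BERT-like vectors
reps = bidirectional_rnn(sent_vecs)      # one representation vector per sentence
```

Each output row plays the role of a sentence's semantic representation vector: it sees both the sentences before and after it.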
And 102, converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence.
Chinese expression has a unique and complex grammar and text structure, so the embodiment of the invention takes both factors into account in the design of the text abstract model. To capture the influence of grammar on a sentence, the sentence may be parsed using the pyltp model. From a linguistic perspective, sentences containing nouns and verbs are much more likely to express the central meaning of the text; from a text-structure perspective, sentences at the beginning or end of an article are also more likely to express the central meaning.
Optionally, converting each sentence in the text into a grammar structure vector includes: for each sentence in the text, calculating the number of entities in the sentence and generating a grammar vector of the sentence according to that number; for each sentence, calculating the distance between the sentence and the text's baseline sentence and generating a structure vector of the sentence according to that distance; and inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence. Specifically, each sentence is parsed, for example with the pyltp model, to identify entities such as verbs and nouns, and the number of entities is counted; the entity count is then converted into the sentence's grammar vector by vector matrix padding. Next, the distance between the sentence and the text's baseline sentence is calculated and likewise converted into the sentence's structure vector by vector matrix padding. Finally, the grammar vector and the structure vector of the sentence are input into the grammar structure convolution layer, which performs a convolution operation and outputs the grammar structure vector of the sentence.
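A sketch of the entity-count step, under assumptions: the POS tags are illustrative stand-ins for a pyltp parse, and the padding scheme (count stored in the first slot of a fixed-size vector) is one plausible reading of the patent's unspecified "vector matrix filling":

```python
import numpy as np

def grammar_vector(pos_tags, dim=4):
    """Count noun/verb entities in a sentence's (hypothetical) POS tags and
    pad the count into a fixed-size grammar vector. The tag names "n"/"v"
    and the padding layout are assumptions for illustration."""
    count = sum(1 for t in pos_tags if t in ("n", "v"))
    vec = np.zeros(dim)
    vec[0] = count
    return vec

# e.g. a sentence parsed as noun, verb, preposition, noun → 3 entities
g = grammar_vector(["n", "v", "p", "n"])
```

In the embodiment, sentences rich in nouns and verbs get larger grammar signals, matching the linguistic observation above.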
Optionally, calculating the distance between the sentence and the baseline sentence includes: numbering each sentence in the text sequentially in positive order and counting the number of sentences in the text as N; then taking the sentence numbered N/2 as the baseline sentence and calculating the difference between the sentence's number and the baseline sentence's number. Since the beginning and end are typically the focus of the text, this gives more weight to sentences at the beginning and end. Specifically, the text is split into sentences, each sentence is numbered in order, the number of sentences is counted as N, the sentence numbered N/2 is taken as the text's baseline sentence, and the difference between each sentence's number and the baseline sentence's number is calculated. In the embodiment of the present invention, the importance of a sentence in the overall text structure may be determined from this difference.
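The numbering scheme above can be sketched as follows (integer division for N/2 is an assumption; the patent only writes N/2):

```python
def baseline_offsets(num_sentences: int) -> list[int]:
    """Number sentences 1..N in order, take sentence N//2 as the baseline,
    and return each sentence's signed offset from the baseline. Sentences
    near the beginning or end get offsets of large magnitude, reflecting
    the extra weight given to those positions."""
    baseline = num_sentences // 2
    return [i - baseline for i in range(1, num_sentences + 1)]

offsets = baseline_offsets(5)   # baseline is sentence 2 of 5
```

Each offset would then be padded into the sentence's structure vector.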
Optionally, as shown in fig. 2, inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence includes: inputting the grammar structure vector of each sentence into a first proofreading layer to output a proofread grammar structure vector of each sentence, wherein the first proofreading layer proofreads the grammar structure vector of each sentence using a disparity vector; and inputting the proofread grammar structure vector of each sentence into the second recurrent neural network to output the structure representation vector of each sentence. After the grammar structure vector of each sentence is obtained, the first proofreading layer corrects it by fusing it with the disparity vector, yielding the proofread grammar structure vector of the sentence. After proofreading, the proofread grammar structure vector of each sentence is input into the second recurrent neural network, which learns from it and outputs the structure representation vector of each sentence.
Optionally, proofreading the grammar structure vector of each sentence using the disparity vector includes: multiplying the grammar structure vector of each sentence by the disparity vector element by element (bit by bit). For example, the following formula can be used for the proofreading:
article_structure' = verification ⊙ article_structure
where verification is the disparity vector, article_structure is the grammatical structure vector of a sentence, article_structure' is the proofread grammatical structure vector, and ⊙ denotes element-wise (bit-by-bit) multiplication.
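In code, this proofreading step reduces to an element-wise product, as in this minimal sketch:

```python
import numpy as np

def proofread(vec, verification):
    """Proofreading layer: element-wise (bit-by-bit) multiplication of a
    vector with the disparity (verification) vector. NumPy's * operator
    on equal-length arrays is exactly this product."""
    return vec * verification

# e.g. a 3-dim grammar structure vector scaled by a disparity vector
v = proofread(np.array([1.0, 2.0, 3.0]), np.array([0.5, 1.0, 2.0]))
```

The same operation serves both proofreading layers; only the input vector differs.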
Optionally, the second recurrent neural network is a bidirectional recurrent neural network; learning the grammar structure vectors of the sentences bidirectionally helps improve the representation capability of the grammar structure vectors. Optionally, the second recurrent neural network is a bidirectional long short-term memory recurrent neural network (bidirectional LSTM), whose bidirectional learning helps accurately output the structure representation vector of each sentence. The embodiment of the present invention uses a bidirectional LSTM because Chinese text structure is determined by context-based correspondence, so the overall structure of the text needs to be considered.
Step 103, inputting the semantic representation vector and the structural representation vector of each sentence into a semantic structure convolution layer to output a text content vector.
Through steps 101 and 102, a semantic representation vector and a structure representation vector of each sentence, based on the special characteristics of Chinese, are obtained, and a convolution operation on them yields the corresponding text content vector representation. Specifically, as shown in fig. 2, after the semantic representation vector and the structure representation vector of each sentence are obtained, they are input together into the semantic structure convolution layer, which performs a convolution operation and outputs the text content vector, that is, the content vector representation of the entire text.
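A minimal sketch of the semantic structure convolution layer; the kernel size (a 1×1 convolution over each sentence position), the pooling, and the dimensions are assumptions, since the patent does not specify them:

```python
import numpy as np

rng = np.random.default_rng(1)

def semantic_structure_conv(sem, struct, out_dim=8):
    """Concatenate each sentence's semantic and structure representation
    vectors, apply an (untrained, placeholder) convolution kernel at every
    sentence position, and mean-pool into a single text content vector."""
    X = np.concatenate([sem, struct], axis=1)     # (num_sentences, d_sem + d_struct)
    W = rng.normal(0, 0.1, (out_dim, X.shape[1])) # 1x1 conv kernel across positions
    conv = np.tanh(X @ W.T)                       # (num_sentences, out_dim)
    return conv.mean(axis=0)                      # pooled text content vector

sem = rng.normal(size=(5, 16))     # semantic representation vectors, step 101
struct = rng.normal(size=(5, 4))   # structure representation vectors, step 102
content = semantic_structure_conv(sem, struct)
```

The key point the sketch shows is the fusion: one vector per text, built from both semantic and structural channels.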
Step 104: inputting the text content vector into a third recurrent neural network to output a target sentence as the text abstract.
After the text content vector is output by the semantic structure convolution layer, it is input into a third recurrent neural network and decoded, so that the vector of the text abstract is output. Optionally, the third recurrent neural network is a unidirectional long short-term memory neural network (unidirectional LSTM); in this embodiment, it decodes the text content vector and outputs the semantic vector of the target sentence, that is, the semantic vector of the text abstract.
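The decoder outputs the target sentence's semantic vector; how that vector is mapped back to a concrete sentence of the text is left implicit in the patent, so the nearest-neighbour lookup below is an assumed final extraction step:

```python
import numpy as np

def nearest_sentence(decoded_vec, sentence_vecs, sentences):
    """Assumed mapping from the decoder's output semantic vector back to a
    concrete sentence: pick the sentence whose semantic vector is closest
    in Euclidean distance (consistent with the patent's use of Euclidean
    distance in the disparity calculation layer)."""
    dists = np.linalg.norm(sentence_vecs - decoded_vec, axis=1)
    return sentences[int(np.argmin(dists))]

sents = ["s1", "s2", "s3"]
vecs = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
best = nearest_sentence(np.array([4.5, 5.2]), vecs, sents)
```

This keeps the output extractive: the abstract is always a real sentence from the text, as the Background requires.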
Optionally, as shown in fig. 2, inputting the text content vector into a third recurrent neural network to output a target sentence as the text abstract includes: inputting the text content vector into a second proofreading layer to output a proofread text content vector, wherein the second proofreading layer proofreads the text content vector using the disparity vector; and inputting the proofread text content vector into the third recurrent neural network to output the target sentence as the text abstract. After the text content vector is obtained, the second proofreading layer corrects it by fusing it with the disparity vector, yielding the proofread text content vector. After proofreading, the proofread text content vector is input into the third recurrent neural network, which decodes it, outputs the semantic vector of the target sentence, and takes the target sentence as the text abstract.
Optionally, proofreading the text content vector using the difference degree vector includes multiplying the text content vector and the difference degree vector element-wise (bit by bit). For example, the proofreading can be expressed by the following formula:
content′ = content ⊙ verification
wherein verification is the difference degree vector, content is the text content vector, content′ is the proofread text content vector, and ⊙ denotes element-wise multiplication.
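A minimal sketch of this element-wise proofreading in code (the vector values are illustrative):

```python
import numpy as np

def proofread(content, verification):
    """Fuse the difference degree vector into the text content
    vector by element-wise (bit-by-bit) multiplication:
    content' = content * verification."""
    return content * verification

content = np.array([0.4, -0.2, 0.9])        # text content vector
verification = np.array([1.1, 0.8, 1.0])    # difference degree vector
content_prime = proofread(content, verification)
```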
At present, most algorithms that extract text abstracts with neural networks do not consider generality, and the output sentences cannot effectively feed back into the network parameters, so the extraction effect of the model is poor. The embodiment of the invention therefore calculates a difference degree vector from the difference between the labeled abstract sentence and the target sentence output by the third recurrent neural network. Optionally, before step 101, the method further includes: converting the abstract sentence of the text into a semantic vector using a BERT model; and calculating a difference degree vector between the semantic vector of the abstract sentence and the semantic vector of the target sentence output by the third recurrent neural network. First, the abstract sentences of the text can be labeled manually, as shown in fig. 3. The abstract sentences are then converted into semantic vectors by the BERT model and input into the difference degree calculation layer, together with the semantic vectors of the target sentences output by the third recurrent neural network. The difference degree calculation layer computes the difference degree vector between the two using Euclidean distance: the larger the difference, the less the extracted information meets the requirements; the smaller the difference, the higher the quality of the extracted information. In the training stage of the model, the difference degree calculation layer mainly calculates the difference between the abstract sentence and the target abstract and feeds it back to the first proofreading layer and the second proofreading layer, so that the parameters of the whole text extraction model are adjusted and the extraction effect of the model improves.
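The difference degree computation can be sketched as follows. The patent says it uses Euclidean distance but calls the result a vector; as an assumption, this sketch returns both the per-dimension gap and its Euclidean norm as the scalar degree:

```python
import numpy as np

def disparity(summary_vec, target_vec):
    """Gap between the labeled abstract sentence vector and the
    decoded target sentence vector: the per-dimension difference,
    plus its Euclidean norm as a scalar difference degree
    (larger = extraction meets requirements less well)."""
    diff = summary_vec - target_vec
    return diff, float(np.linalg.norm(diff))

summary_vec = np.array([1.0, 0.0, 2.0])  # labeled abstract sentence (illustrative)
target_vec = np.array([1.0, 1.0, 0.0])   # decoded target sentence (illustrative)
vec, degree = disparity(summary_vec, target_vec)
```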
Because the proofreading layers proofread both the semantics and the text structure of the sentences, the sentence with the greatest similarity can be selected from among similar sentences, which reduces error and allows the model to extract the optimal information. The embodiment of the invention considers not only the particular grammatical characteristics of Chinese in the text abstract extraction model, but also the structure of the whole text, and proofreads against the labeled abstract sentences so that the model parameters are trained more comprehensively, which helps extract target sentences that carry the central meaning of the text.
The method provided by the embodiment of the invention can be applied to large-scale product summarization: sentences that best characterize a product are extracted as its brief description, so that introductions need not be written manually, saving the corresponding labor.
From the various embodiments described above, it can be seen that the embodiments of the present invention solve the technical problem in the prior art that extractive text abstracts do not match the central meaning of the text. The technical means are: converting each sentence in the text into a semantic vector and a grammatical structure vector, inputting these into the first and second recurrent neural networks respectively to output the semantic representation vector and the structural representation vector of each sentence, then outputting a text content vector through the semantic structure convolution layer, and finally outputting the text abstract through the third recurrent neural network. The embodiment of the invention takes both the grammar of Chinese sentences and the text structure into account in the design of the text abstract extraction model, converting each sentence into a semantic vector and a grammatical structure vector, so that target sentences matching the central meaning of the text are extracted accurately. In addition, the embodiment of the invention calculates the difference degree vector between the abstract sentence and the target sentence and feeds it back to the proofreading layers, training the parameters of the whole text abstract extraction model so that the whole model adjusts the same parameters consistently. With the proofreading layers added, the extracted information is as close as possible to the most similar sentences: the model prefers sentences with higher similarity and selects the most similar one, thereby minimizing sentence error.
Fig. 4 is a schematic diagram of the main flow of a text abstract extraction method according to a reference embodiment of the present invention. As another embodiment of the present invention, as shown in fig. 4, the text abstract extraction method may include:
step 401, each sentence in the text is converted into a semantic vector by using a BERT model.
The sentence segmentation tool can be used for segmenting the text, and then a BERT model is used for converting each sentence in the text into a semantic vector.
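As a sketch of this step, the text can be split on Chinese end-of-sentence punctuation and each sentence encoded into a fixed-size semantic vector. The regex splitter and the `bert_encode` stub below are illustrative assumptions, not the tooling named in the patent; a real implementation would return, e.g., the [CLS] embedding from a Chinese BERT model:

```python
import re
import numpy as np

def split_sentences(text):
    """Split Chinese text after end-of-sentence punctuation (。！？),
    keeping the punctuation attached to its sentence."""
    parts = re.split(r'(?<=[。！？])', text)
    return [p for p in parts if p.strip()]

def bert_encode(sentence, dim=768):
    """Placeholder for a BERT sentence encoder (hypothetical stub):
    returns a deterministic-per-process random vector of BERT size."""
    rng = np.random.default_rng(abs(hash(sentence)) % (2 ** 32))
    return rng.standard_normal(dim)

sentences = split_sentences("今天天气很好。我们去公园。")
semantic_vectors = [bert_encode(s) for s in sentences]
```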
Step 402, inputting the semantic vector of each sentence into a first recurrent neural network to output the semantic representation vector of each sentence.
And then, inputting the semantic vector of each sentence into a first recurrent neural network, and learning the semantic vector of each sentence by the first recurrent neural network so as to output the semantic representation vector of each sentence.
Step 403, for each sentence in the text, calculating the number of entities in the sentence, thereby generating a grammar vector of the sentence according to the number of entities in the sentence.
From a linguistic point of view, a sentence containing nouns and verbs is much more likely to express the central meaning of the text. Therefore, the sentences can be syntactically analyzed with a pyltp model, the number of entities in each sentence calculated, and the entity count of each sentence then converted into the grammar vector of the sentence by vector matrix padding.
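A minimal sketch of this conversion. The part-of-speech analysis is stubbed out with a toy lexicon standing in for the pyltp tagging named above, and the vector size and padding scheme are assumptions:

```python
import numpy as np

# Toy stand-in for pyltp analysis: a small lexicon marking which
# tokens count as entities (nouns/verbs). Purely illustrative.
ENTITY_WORDS = {"天气", "公园", "去"}

def count_entities(words):
    """Count how many tokens in the sentence are entities."""
    return sum(1 for w in words if w in ENTITY_WORDS)

def grammar_vector(entity_count, dim=8):
    """Pad the entity count into a fixed-size vector: the first
    slot holds the count, the remaining slots are zero."""
    vec = np.zeros(dim)
    vec[0] = entity_count
    return vec

words = ["我们", "去", "公园"]
v = grammar_vector(count_entities(words))
```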
Step 404, for each sentence in the text, calculating a difference between the sentence and a baseline sentence, thereby generating a structure vector of the sentence according to the difference between the sentence and the baseline sentence.
From the perspective of text structure, sentences at the beginning or end of an article are generally more likely to express the central meaning of the text. Therefore, each sentence in the text can be numbered sequentially in reading order, the number of sentences counted as N, and the sentence numbered N/2 taken as the baseline sentence. The difference between each sentence's number and the baseline sentence's number is calculated, and this difference is then converted into the structure vector of the sentence by vector matrix padding.
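This numbering and padding can be sketched as follows (the vector size and the choice of padding only the first slot are assumptions):

```python
import numpy as np

def structure_vectors(num_sentences, dim=8):
    """Number sentences 1..N in reading order, take sentence N//2 as
    the baseline, and pad each signed position difference into a
    fixed-size structure vector."""
    baseline = num_sentences // 2
    vectors = []
    for i in range(1, num_sentences + 1):
        diff = i - baseline      # signed distance to the baseline sentence
        vec = np.zeros(dim)
        vec[0] = diff            # pad the difference into the vector
        vectors.append(vec)
    return vectors

vecs = structure_vectors(5)      # baseline sentence is number 2
```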
Step 405, inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence.
Subsequently, the grammar vector and the structure vector of the sentence are input together into a grammar structure convolution layer, and convolution operation is performed on the grammar structure convolution layer, so that the grammar structure vector of the sentence is output.
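One way to realize such a convolutional fusion is to stack the two vectors as input channels and slide a small kernel over them. The kernel size and weights here are illustrative; the patent does not specify them:

```python
import numpy as np

def conv1d_fuse(grammar_vec, structure_vec, kernel, bias=0.0):
    """Stack the grammar vector and structure vector as two input
    channels and apply a (2, k) kernel in a valid 1-D convolution,
    producing the fused grammar structure vector."""
    stacked = np.stack([grammar_vec, structure_vec])   # shape (2, dim)
    k = kernel.shape[1]
    dim = stacked.shape[1]
    out = np.empty(dim - k + 1)
    for i in range(dim - k + 1):
        out[i] = np.sum(stacked[:, i:i + k] * kernel) + bias
    return out

g = np.array([2.0, 0.0, 0.0, 0.0])       # grammar vector (illustrative)
s = np.array([-1.0, 0.0, 0.0, 0.0])      # structure vector (illustrative)
kernel = np.ones((2, 2)) * 0.5           # illustrative learned weights
fused = conv1d_fuse(g, s, kernel)        # grammar structure vector
```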
Step 406, inputting the grammar structure vector of each sentence into a first proof layer to output a proof grammar structure vector of each sentence; and the first proofreading layer proofreads the grammar structure vector of each sentence by adopting the difference degree vector.
After the grammar structure vector of each sentence is obtained, it is proofread by the first proofreading layer, so that the difference degree vector is fused with the grammar structure vector of the sentence to obtain the proofread grammar structure vector of the sentence.
Step 407, inputting the collation grammar structure vector of each sentence into a second recurrent neural network to output the structure representation vector of each sentence.
After the proofreading, the proofread grammar structure vector of each sentence is input into the second recurrent neural network, which learns it to output the structure representation vector of each sentence.
Step 408, inputting the semantic representation vector and the structural representation vector of each sentence into the semantic structure convolution layer to output a text content vector.
After the semantic representation vector and the structural representation vector of each sentence based on Chinese special analysis are obtained, convolution operation is carried out on the semantic representation vector and the structural representation vector, and then corresponding text content vector representation can be obtained.
Step 409, inputting the text content vector into a second proofreading layer to output a proofread text content vector; wherein the second proofreading layer proofreads the text content vector using the difference degree vector.
After the text content vector is obtained, it is proofread by the second proofreading layer, so that the difference degree vector is fused with the text content vector to obtain the proofread text content vector.
Step 410, inputting the proofread text content vector into a third recurrent neural network to output a target sentence as the text abstract.
After the proofreading, the proofread text content vector is input into the third recurrent neural network, which decodes it to output the semantic vector of the target sentence, and the target sentence serves as the text abstract.
It should be noted that, in the training phase of the text abstract extraction model, as shown in fig. 3, the embodiment of the present invention calculates the difference degree vector from the difference between the labeled abstract sentence and the target sentence output by the third recurrent neural network. First, the abstract sentences of the text can be labeled manually, as shown in fig. 3. The abstract sentences are then converted into semantic vectors by a BERT model and input into the difference degree calculation layer, together with the semantic vectors of the target sentences output by the third recurrent neural network. The difference degree calculation layer computes the difference degree vector between the two using Euclidean distance: the larger the difference, the less the extracted information meets the requirements; the smaller the difference, the higher the quality of the extracted information.
In the training stage of the model, the difference degree calculation layer is mainly used for calculating the difference between the abstract sentence and the target abstract and feeding the difference back to the first calibration layer and the second calibration layer, so that the parameters of the whole text extraction model are adjusted, and the extraction effect of the model is better.
Considering the characteristics of Chinese, the embodiment of the invention designs a text abstract extraction model combining semantics and grammatical structure, and trains and adjusts the first proofreading layer and the second proofreading layer uniformly and synchronously, so that the whole model adjusts the same parameters consistently and the network parameters are trained more comprehensively. With this model, the target sentence representing the central meaning can be extracted accurately from the text, so that important information in the text can be acquired quickly.
In addition, since the detailed implementation of the text abstract extraction method has already been described above, the repeated content is not described again here.
Fig. 5 is a schematic diagram of the main modules of a text summarization extracting apparatus according to an embodiment of the present invention, and as shown in fig. 5, the text summarization extracting apparatus 500 includes a first representation module 501, a second representation module 502, a text content module 503 and an extracting module 504; the first representation module 501 is configured to convert each sentence in the text into a semantic vector, and input the semantic vector of each sentence into the first recurrent neural network to output the semantic representation vector of each sentence; the second representation module 502 is configured to convert each sentence in the text into a syntactic structure vector, and input the syntactic structure vector of each sentence into a second recurrent neural network to output a structural representation vector of each sentence; the text content module 503 is configured to input the semantic representation vector and the structural representation vector of each sentence into the semantic structure convolution layer to output a text content vector; the extraction module 504 is configured to input the text content vector into a third recurrent neural network to output a target sentence as a text abstract.
Optionally, the second characterization module 502 is further configured to:
for each sentence in the text, calculating the number of entities in the sentence, and generating a grammar vector of the sentence according to the number of entities in the sentence;
for each sentence in the text, calculating a distance between the sentence and a baseline sentence, thereby generating a structure vector of the sentence according to the distance between the sentence and the baseline sentence;
and inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence.
Optionally, the second characterization module 502 is further configured to:
sequentially numbering each sentence in the text according to a positive sequence, and counting the number of the sentences in the text to be N;
and taking the sentence with the number of N/2 as a baseline sentence, and calculating the difference value between the number of the sentence and the number of the baseline sentence.
Optionally, the first characterization module 501 is further configured to:
and converting each sentence in the text into a semantic vector by adopting a BERT model.
Optionally, the second characterization module 502 is further configured to:
inputting the grammar structure vector of each sentence into a first proof layer to output a proof grammar structure vector of each sentence; the first proofreading layer proofreads the grammatical structure vector of each sentence by adopting a difference degree vector;
and inputting the proofreading grammar structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence.
Optionally, the second characterization module 502 is further configured to:
and multiplying the grammar structure vector of each sentence by the difference vector bit by bit.
Optionally, the extraction module 504 is further configured to:
inputting the text content vector into a second calibration layer to output a calibrated text content vector; wherein the second correction layer corrects the text content vector by using the disparity vector;
and inputting the proofreading text content vector into a third recurrent neural network to output a target sentence as a text abstract.
Optionally, the extraction module 504 is further configured to:
and multiplying the text content vector and the difference degree vector in a bit-by-bit mode.
Optionally, the system further comprises a disparity module, configured to:
before each sentence in the text is converted into a semantic vector, converting the abstract sentence of the text into the semantic vector by adopting a BERT model;
and calculating a difference degree vector between the semantic vector of the abstract sentence and the semantic vector of the target sentence output by the third recurrent neural network.
Optionally, the first recurrent neural network is a bidirectional long short-term memory (LSTM) network; and/or the second recurrent neural network is a bidirectional LSTM network; and/or the third recurrent neural network is a unidirectional LSTM network.
It should be noted that, in the implementation of the text abstract extracting apparatus of the present invention, the text abstract extracting method has been described in detail above, and therefore, the repeated content herein will not be described again.
Fig. 6 illustrates an exemplary system architecture 600 to which a text summarization method or a text summarization apparatus of an embodiment of the present invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. Network 604 may include various types of connections, such as wire, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The background management server can analyze and process the received data such as the article information query request and feed back the processing result to the terminal equipment.
It should be noted that the text abstract extracting method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the text abstract extracting apparatus is generally disposed in the server 605. The text abstract extracting method provided by the embodiment of the invention can also be executed by the terminal devices 601, 602, 603, and correspondingly, the text abstract extracting device can be arranged in the terminal devices 601, 602, 603.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a first characterization module, a second characterization module, a textual content module, and an extraction module, where the names of the modules do not in some cases constitute a limitation on the modules themselves.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, implement the method of: converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network to output a semantic representation vector of each sentence; converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence; inputting the semantic representation vector and the structural representation vector of each sentence into a semantic structure convolution layer to output a text content vector; and inputting the text content vector into a third recurrent neural network to output a target sentence as a text abstract.
According to the technical scheme of the embodiment of the invention, the following technical means are adopted: converting each sentence in the text into a semantic vector and a grammatical structure vector, inputting these into the first and second recurrent neural networks respectively to output the semantic representation vector and the structural representation vector of each sentence, then outputting a text content vector through the semantic structure convolution layer, and finally outputting the text abstract through the third recurrent neural network. This solves the technical problem in the prior art that extractive text abstracts do not match the central meaning of the text. The embodiment of the invention takes both the grammar of Chinese sentences and the text structure into account in the design of the text abstract extraction model, converting each sentence into a semantic vector and a grammatical structure vector, so that target sentences matching the central meaning of the text are extracted accurately. In addition, the embodiment of the invention calculates the difference degree vector between the abstract sentence and the target sentence and feeds it back to the proofreading layers, training the parameters of the whole text abstract extraction model so that the whole model adjusts the same parameters consistently. With the proofreading layers added, the extracted information is as close as possible to the most similar sentences: the model prefers sentences with higher similarity and selects the most similar one, thereby minimizing sentence error.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A text abstract extraction method is characterized by comprising the following steps:
converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network to output a semantic representation vector of each sentence;
converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence;
inputting the semantic representation vector and the structural representation vector of each sentence into a semantic structure convolution layer to output a text content vector;
and inputting the text content vector into a third recurrent neural network to output a target sentence as a text abstract.
2. The method of claim 1, wherein converting each sentence in the text into a grammatical structure vector comprises:
for each sentence in the text, calculating the number of entities in the sentence, and generating a grammar vector of the sentence according to the number of entities in the sentence;
for each sentence in the text, calculating a distance between the sentence and a baseline sentence, thereby generating a structure vector of the sentence according to the distance between the sentence and the baseline sentence;
and inputting the grammar vector and the structure vector of the sentence into a grammar structure convolution layer to output the grammar structure vector of the sentence.
3. The method of claim 2, wherein computing the distance between the sentence and a baseline sentence comprises:
sequentially numbering each sentence in the text according to a positive sequence, and counting the number of the sentences in the text to be N;
and taking the sentence with the number of N/2 as a baseline sentence, and calculating the difference value between the number of the sentence and the number of the baseline sentence.
4. The method of claim 1, wherein converting each sentence in the text into a semantic vector comprises:
and converting each sentence in the text into a semantic vector using a BERT model.
5. The method of claim 1, wherein inputting the grammatical structure vector of each sentence into a second recurrent neural network to output a structure representation vector of each sentence comprises:
inputting the grammatical structure vector of each sentence into a first proofreading layer to output a proofread grammatical structure vector of each sentence; wherein the first proofreading layer proofreads the grammatical structure vector of each sentence using a disparity vector;
and inputting the proofread grammatical structure vector of each sentence into the second recurrent neural network to output the structure representation vector of each sentence.
6. The method of claim 5, wherein proofreading the grammatical structure vector of each sentence using the disparity vector comprises:
multiplying the grammatical structure vector of each sentence by the disparity vector element-wise.
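The proofreading operation of claims 6 and 8 is an element-wise (Hadamard) product, rendered in the original as "bit by bit". A minimal sketch, with the function name chosen for illustration:

```python
import numpy as np

def proofread(vec, disparity):
    # Claims 6 and 8: the proofreading layer multiplies its input vector
    # by the disparity vector element-wise; shapes must match.
    vec = np.asarray(vec, dtype=float)
    disparity = np.asarray(disparity, dtype=float)
    assert vec.shape == disparity.shape, "vectors must have the same shape"
    return vec * disparity
```

The disparity vector thus acts as a learned per-dimension gate: dimensions where the model's output diverged from the reference summary are amplified or suppressed.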
7. The method of claim 1, wherein inputting the text content vector into a third recurrent neural network to output a target sentence as a text abstract comprises:
inputting the text content vector into a second proofreading layer to output a proofread text content vector; wherein the second proofreading layer proofreads the text content vector using a disparity vector;
and inputting the proofread text content vector into the third recurrent neural network to output the target sentence as the text abstract.
8. The method of claim 7, wherein proofreading the text content vector using a disparity vector comprises:
multiplying the text content vector by the disparity vector element-wise.
9. The method according to any one of claims 5-8, further comprising, before converting each sentence in the text into a semantic vector:
converting the abstract sentences of the text into semantic vectors using a BERT model;
and calculating the disparity vector between the semantic vector of the abstract sentence and the semantic vector of the target sentence output by the third recurrent neural network.
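Claim 9 defines the disparity vector from two semantic vectors but does not fix the operator; an element-wise difference is one plausible reading and is used here purely as an assumption.

```python
import numpy as np

def disparity_vector(summary_vec, target_vec):
    # Claim 9: disparity between the semantic vector of the reference
    # abstract sentence and that of the target sentence emitted by the
    # third recurrent network. Element-wise subtraction is assumed;
    # the claim itself leaves the operation unspecified.
    return np.asarray(summary_vec, dtype=float) - np.asarray(target_vec, dtype=float)
```

In training, this vector would be recomputed each step and fed back into the two proofreading layers of claims 5 and 7, closing the loop between the model's output and the reference summary.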
10. The method of claim 1, wherein the first recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the second recurrent neural network is a bidirectional long short-term memory recurrent neural network; and/or the third recurrent neural network is a unidirectional long short-term memory recurrent neural network.
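The bidirectional/unidirectional distinction in claim 10 can be sketched without LSTM machinery: run a toy recurrent pass in each direction and concatenate per step, so every sentence representation carries both left and right context. The running-mean "memory" below is a stand-in for a real LSTM cell.

```python
import numpy as np

def simple_rnn(vectors, reverse=False):
    # Toy recurrent pass (running mean as memory) standing in for an LSTM.
    seq = vectors[::-1] if reverse else list(vectors)
    out, state = [], np.zeros_like(seq[0])
    for i, v in enumerate(seq, 1):
        state = state + (v - state) / i
        out.append(state.copy())
    return out[::-1] if reverse else out

def bidirectional_encode(vectors):
    # Claim 10: the first two networks are bidirectional, so each step
    # concatenates a forward state (left context) with a backward state
    # (right context); the third, decoding network stays unidirectional.
    fwd = simple_rnn(vectors)
    bwd = simple_rnn(vectors, reverse=True)
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```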
11. A text summarization extraction apparatus, comprising:
the first representation module is used for converting each sentence in the text into a semantic vector, and inputting the semantic vector of each sentence into a first recurrent neural network so as to output the semantic representation vector of each sentence;
the second representation module is used for converting each sentence in the text into a grammatical structure vector, and inputting the grammatical structure vector of each sentence into a second recurrent neural network so as to output the structure representation vector of each sentence;
the text content module is used for inputting the semantic representation vector and the structural representation vector of each sentence into the semantic structure convolution layer so as to output a text content vector;
and the extraction module is used for inputting the text content vector into a third recurrent neural network so as to output a target sentence as a text abstract.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
the one or more programs, when executed by the one or more processors, implement the method of any of claims 1-10.
13. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-10.
CN202111186594.6A 2021-10-12 2021-10-12 Text abstract extraction method and device Pending CN113901174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111186594.6A CN113901174A (en) 2021-10-12 2021-10-12 Text abstract extraction method and device

Publications (1)

Publication Number Publication Date
CN113901174A 2022-01-07

Family

ID=79191489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111186594.6A Pending CN113901174A (en) 2021-10-12 2021-10-12 Text abstract extraction method and device

Country Status (1)

Country Link
CN (1) CN113901174A (en)

Similar Documents

Publication Publication Date Title
CN107491534B (en) Information processing method and device
CN109376234B (en) Method and device for training abstract generation model
CN110705301B (en) Entity relationship extraction method and device, storage medium and electronic equipment
US20220198327A1 (en) Method, apparatus, device and storage medium for training dialogue understanding model
CN107861954B (en) Information output method and device based on artificial intelligence
CN109684634B (en) Emotion analysis method, device, equipment and storage medium
US10878247B2 (en) Method and apparatus for generating information
CN109241286B (en) Method and device for generating text
CN114861889B (en) Deep learning model training method, target object detection method and device
US20200401764A1 (en) Systems and methods for generating abstractive text summarization
CN113051894B (en) Text error correction method and device
CN110738056B (en) Method and device for generating information
US20210150154A1 (en) Chapter-level text translation method and device
CN116434752A (en) Speech recognition error correction method and device
KR20210125449A (en) Method for industry text increment, apparatus thereof, and computer program stored in medium
CN110852057A (en) Method and device for calculating text similarity
US10910014B2 (en) Method and apparatus for generating video
CN110472241B (en) Method for generating redundancy-removed information sentence vector and related equipment
US12093659B2 (en) Text generation with customizable style
CN111460224A (en) Comment data quality labeling method, device, equipment and storage medium
CN110895655A (en) Method and device for extracting text core phrase
CN113901174A (en) Text abstract extraction method and device
CN115146070A (en) Key value generation method, knowledge graph generation method, device, equipment and medium
CN111737572B (en) Search statement generation method and device and electronic equipment
CN111339790B (en) Text translation method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination