CN113609284A - Method and device for automatically generating text abstract fused with multivariate semantics - Google Patents

Method and device for automatically generating text abstract fused with multivariate semantics

Info

Publication number
CN113609284A
Authority
CN
China
Prior art keywords
text
multivariate
semantic
hidden layer
abstract
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110882867.4A
Other languages
Chinese (zh)
Inventor
何欣
陈永超
胡霄林
于俊洋
王光辉
翟瑞
宋亚林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University filed Critical Henan University
Priority to CN202110882867.4A
Publication of CN113609284A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention belongs to the technical field of text data processing, and particularly relates to a method and a device for automatically generating a text abstract fused with multivariate semantics. The method comprises the following steps: first, multivariate semantic features are fused into the source text before it is input to the encoder, so that the source text contains more semantic information; the source text with the fused multivariate semantic features is then input to a bidirectional long short-term memory network in the encoder to obtain the hidden-layer states corresponding to the word vectors of the fused text; next, the decoder, using a unidirectional long short-term memory network combined with an improved attention mechanism, predicts the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment; finally, the model is trained with a loss function, and the trained model automatically generates the abstract of a text. By fusing multivariate semantic features before the source text is input to the encoder, the invention fully mines the deep hidden features of the source text and improves the quality of the generated abstract.

Description

Method and device for automatically generating text abstract fused with multivariate semantics
Technical Field
The invention belongs to the technical field of text data processing, and particularly relates to a method and a device for automatically generating a text abstract fused with multivariate semantics.
Background
Automatic text summarization can effectively reduce reading costs and alleviate the information overload people face today. Summarization methods fall into two main categories: extractive summarization and abstractive (generative) summarization.
Extractive summarization judges the importance of each sentence in the original text, extracts the most important sentences, and recombines them to form the abstract. Early extractive methods used statistical knowledge, such as word frequency, relative sentence length, and the similarity between sentences and titles, as the basis for judging importance. Sentence importance was first measured by high-frequency words: the more high-frequency words a sentence contains, the more important it is considered; the term frequency-inverse document frequency (TF-IDF) algorithm was later proposed to improve the traditional word-frequency algorithm and thereby raise abstract quality. At present, given sufficient computing power, machine learning methods can be applied: a data set is labeled by supervised and semi-supervised methods, and after reasonable modeling, the trained model labels unlabeled sentences to predict whether they can serve as abstract sentences. Although extractive methods are easy to implement, they operate at the surface level of the document, ignore the grammar and contextual relations between adjacent words, and do not truly understand the original text; the sentences assembled into the abstract therefore lack coherence, and the approach is quite limited.
Abstractive summarization, the more advanced and more complex current approach, analyzes the grammar of the original text and, on the basis of understanding it, expresses its content in more concise sentences. With the growing performance of hardware and the increasing amount of training data in recent years, deep learning has developed rapidly. After the sequence-to-sequence model was proposed, it was applied to several fields of natural language processing, provided a good research direction for automatic text summarization, and made great progress. The sequence-to-sequence model encodes the source text into a fixed-size context vector through an encoder, and the decoder then generates the next predicted word based on the word generated at the previous moment and the hidden-layer state at that moment. Later work applied the attention mechanism to the encoder, improving the quality of the generated abstract, and replacing the decoder with a recurrent neural network also achieved good progress. On this basis, reinforcement learning was introduced to alleviate error propagation, reduce repeated words and sentences, and improve the readability of the generated abstract. In addition, abstractive summarization can exploit inherent characteristics of the source text to improve the model: blending statistical information such as TF-IDF, POS, and NER into the word vectors makes the generated abstract closer to a manually written summary.
With the development of deep learning and natural language processing, sequence-to-sequence-based abstractive summarization methods continue to improve. However, most current improvements are made at the encoder and decoder level, and the fusion of multivariate semantics is largely missing.
Disclosure of Invention
In order to obtain more effective information from the source text during model training, and thereby further improve the quality of the abstract generated by the trained model, the invention provides a method and a device for automatically generating a text abstract fused with multivariate semantics.
In order to solve the technical problems, the invention adopts the following technical scheme:
The invention provides a method for automatically generating a text abstract fusing multivariate semantics, which comprises the following steps:
Step 1: based on a sequence-to-sequence model and combined with the multivariate semantic characteristics of natural language processing, fuse multivariate semantic features into the source text before it is input to the encoder, so that the source text contains more semantic information;
Step 2: input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder, and obtain the hidden-layer states corresponding to the word vectors of the fused text;
Step 3: the decoder, using a unidirectional long short-term memory network combined with an improved attention mechanism, predicts the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment;
Step 4: train the model with the loss function, and automatically generate the abstract of a text through the trained model.
Further, the fusion of multivariate semantic features in step 1 comprises two rounds of semantic information extraction and two rounds of vector splicing; the specific process is as follows:
the number of convolution kernels in each of the two convolution layers of a convolutional neural network is set equal to the word-vector size k; the size of each kernel in the first convolution layer is set to 3, and the size of each kernel in the second convolution layer is set to 5;
the source text is input to the first convolution layer, which outputs k semantic vectors; these k semantic vectors are spliced for the first time;
the spliced semantic vectors are input, as a new feature matrix, to the second convolution layer, which again outputs k semantic vectors; the new k semantic vectors are spliced a second time, and the spliced semantic vectors are finally input to the encoder.
Further, the hidden-layer state in step 2 is expressed as:

h_i = [\overrightarrow{h}_i ; \overleftarrow{h}_i]

where h_i is formed by splicing the forward hidden-layer state \overrightarrow{h}_i and the backward hidden-layer state \overleftarrow{h}_i, which are generated by:

\overrightarrow{h}_i = LSTM(x_i, \overrightarrow{h}_{i-1})

\overleftarrow{h}_i = LSTM(x_i, \overleftarrow{h}_{i+1})

where x_i denotes the i-th input word vector, i ∈ [1, m], and m is the number of word vectors of the input source text.
Further, step 3 specifically comprises the following steps:
Step 3.1: compute the decoder hidden-layer state s_t at time t through a unidirectional long short-term memory network;
Step 3.2: generate the context vector C_t used for decoding at time t through the improved attention mechanism and the hidden-layer states of the encoder;
Step 3.3: predict the vocabulary through the context vector C_t and the decoder hidden-layer state s_t at time t.
Further, in step 3.1 the decoder hidden-layer state s_t at time t is calculated as:

s_t = LSTM(s_{t-1}, y_{t-1})

where s_{t-1} is the hidden-layer state at the previous moment; during model training, y_{t-1} is the word vector of the reference-abstract word in the training set, while during prediction y_{t-1} is the word vector predicted at the previous moment. The last encoding output h_m of the encoder hidden layer initializes the decoder hidden-layer state s_0 at the initial moment, and the end-of-text vector of the source text is assigned to the decoder's initial input y_0; t ∈ [1, n], where n is the set length of the generated abstract.
Further, the context vector C_t at time t in step 3.2 is generated from the encoder hidden-layer states h_i and the decoder hidden-layer state s_t at time t, and is calculated as:

C_t = \sum_{i=1}^{m} \alpha_i^t h_i

The unsaturated activation function LeakyReLU is introduced into the attention mechanism to optimize the model, where LeakyReLU is defined as:

LeakyReLU(x) = max(\theta x, x)

where \theta is a parameter of the function, \theta ∈ (-∞, 1);

e_i^t = v^T LeakyReLU(W_h h_i + W_s s_t + b_{attn})

\alpha_i^t = \frac{\exp(e_i^t)}{\sum_{k=1}^{m} \exp(e_k^t)}

where v, W_h, W_s, and b_{attn} are all learnable parameters, \exp(·) denotes the exponential function, e_i^t represents the similarity between the decoder hidden-layer state s_t at time t and the encoder hidden-layer state h_i, and \alpha^t represents the probability distribution over the source vocabulary; i ∈ [1, m], t ∈ [1, n].
Further, the formula for vocabulary prediction at time t in step 3.3 is:

P_vocab = softmax(V'(V[s_t; C_t] + b) + b')

where V', V, b, and b' are learnable parameters, P_vocab is the probability distribution over all words in the dictionary, and softmax(·) denotes the softmax function. The final distribution of the predicted word w is:

P(w) = P_vocab(w).
Further, in step 4, the loss function for the target word w_t^* at time t is:

loss_t = -\log P(w_t^*)

and the loss of the entire sequence is:

loss = \frac{1}{n} \sum_{t=1}^{n} loss_t

The abstract is then generated automatically by the trained model.
The invention also provides a device for automatically generating a text abstract fusing multivariate semantics, which comprises:
a multivariate semantic feature fusion module, configured to fuse multivariate semantic features into the source text before it is input to the encoder, based on a sequence-to-sequence model combined with the multivariate semantic characteristics of natural language processing, so that the source text contains more semantic information;
an encoder hidden-layer state calculation module, configured to input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder and obtain the hidden-layer states corresponding to the word vectors of the fused text;
a word vector prediction module, configured to predict, by the decoder, the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment, using a unidirectional long short-term memory network combined with an improved attention mechanism;
and a model training module, configured to train the model with the loss function and automatically generate the abstract of a text through the trained model.
Compared with the prior art, the invention has the following advantages:
1. On the basis of a traditional sequence-to-sequence model combined with an attention mechanism, multivariate semantic features are fused before the source text is input to the encoder, so that the source text carries more semantic information before entering the encoder. The model thus fully mines the important content of the source text, which increases the readability and global relevance of the generated abstract and alleviates the problem of low global relevance in generated abstracts.
2. After encoding, the context vector used to predict the word at the next moment is generated by the attention mechanism. Conventional attention mechanisms mostly use saturating activation functions; the invention instead uses the non-saturating activation function LeakyReLU in the attention mechanism, which avoids gradient vanishing during model training and accelerates model convergence. The next word is predicted by combining the context vector at that moment with the decoder hidden-layer state. Training the model in this way improves the overall quality of the generated abstract.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a network architecture diagram of a sequence-to-sequence model with an attention mechanism after incorporating fused multivariate semantic features according to an embodiment of the present invention;
FIG. 2 is a flowchart of the method for automatically generating a text abstract fusing multivariate semantics according to an embodiment of the present invention;
FIG. 3 is a process diagram of fusing multivariate semantic features according to an embodiment of the present invention;
FIG. 4 is a flowchart of predicting the word vector generated at the next moment according to an embodiment of the present invention;
FIG. 5 is a block diagram of the device for automatically generating a text abstract fusing multivariate semantics according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer and more complete, the technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention, and based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative efforts belong to the scope of the present invention.
As shown in fig. 1 and fig. 2, the method for automatically generating a text abstract with a fused multivariate semantic meaning of this embodiment includes the following steps:
and step S11, based on the sequence-to-sequence model, combining the multi-element semantic characteristics of natural language processing, fusing the multi-element semantic characteristics before the source text is input into the encoder, so that the source text contains more semantic information, and the model can fully mine the important content of the source text.
As shown in fig. 3, the fusion of multivariate semantic features comprises two rounds of semantic information extraction and two rounds of vector splicing; the specific process is as follows:
The invention provides a multivariate semantic extraction method suitable for text summarization that uses two convolution layers of a convolutional neural network. The number of convolution kernels in the first convolution layer is set equal to the word-vector size k, and, considering that the range people typically read at a glance is three to five words, the size of each kernel in the first convolution layer is set to 3. The source text is input to the first convolution layer to obtain as many semantic vectors as the word-vector size k, and these semantic vectors are spliced for the first time. The spliced semantic vectors are input, as a new feature matrix, to the second convolution layer, whose number of kernels is also set equal to k and whose kernel size is set to 5. The k semantic vectors obtained again are spliced a second time into a feature matrix that contains more semantic information than the initial source-text vector matrix, and this feature matrix is finally input to the encoder.
Convolutional neural networks were first used in natural language processing for text classification, to capture features within sentences. The model of the present invention improves on this to extract local correlations within sentences, such as the internal correlations of phrase structures, and removes the pooling layer of the convolutional neural network (pooling would cause the text to lose many features), preventing information loss. Zero padding is applied at every feature-matrix boundary so that the size of the matrix after fusing multivariate semantic features is unchanged; in this way, the deep features of the text are better mined after the fusion, and the global relevance of the abstract is enhanced.
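For illustration, the following is a minimal sketch of the two-layer convolutional fusion described above, written in PyTorch; the module name, the word-vector size k = 128, and the sequence length are illustrative assumptions, not values fixed by the patent. Because each layer has k output channels, the k semantic vectors it produces are already stacked ("spliced") into a feature matrix of unchanged size:

```python
import torch
import torch.nn as nn

class MultivariateSemanticFusion(nn.Module):
    """Sketch of the two-round extraction: kernel sizes 3 and 5,
    k kernels each, zero ("same") padding, and no pooling layer."""
    def __init__(self, k: int):
        super().__init__()
        # k output channels = k semantic vectors per layer; stacking them
        # along the channel axis is the splicing into a feature matrix.
        self.conv1 = nn.Conv1d(k, k, kernel_size=3, padding=1)
        self.conv2 = nn.Conv1d(k, k, kernel_size=5, padding=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, k) word-vector matrix of the source text
        x = x.transpose(1, 2)      # (batch, k, seq_len) for Conv1d
        x = self.conv1(x)          # first semantic extraction + splice
        x = self.conv2(x)          # second extraction on the new feature matrix
        return x.transpose(1, 2)   # shape preserved: (batch, seq_len, k)

fused = MultivariateSemanticFusion(k=128)(torch.randn(2, 40, 128))
print(fused.shape)  # torch.Size([2, 40, 128])
```

Omitting the pooling layer and using "same" zero padding is what keeps the fused matrix the same size as the initial source-text vector matrix, as required above.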
Step S12: input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder, and obtain the hidden-layer states corresponding to the word vectors of the fused text.
In this embodiment, the hidden-layer state is expressed as:

h_i = [\overrightarrow{h}_i ; \overleftarrow{h}_i]

where h_i is formed by splicing the forward hidden-layer state \overrightarrow{h}_i and the backward hidden-layer state \overleftarrow{h}_i, which are generated by:

\overrightarrow{h}_i = LSTM(x_i, \overrightarrow{h}_{i-1})

\overleftarrow{h}_i = LSTM(x_i, \overleftarrow{h}_{i+1})

where x_i denotes the i-th input word vector, i ∈ [1, m], and m is the number of word vectors of the input source text.
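A minimal sketch of this encoder step, assuming PyTorch; the hidden size and input shapes are illustrative. The forward and backward halves of each output vector correspond to \overrightarrow{h}_i and \overleftarrow{h}_i in the formulas above:

```python
import torch
import torch.nn as nn

k, hidden = 128, 256
encoder = nn.LSTM(input_size=k, hidden_size=hidden,
                  bidirectional=True, batch_first=True)

x = torch.randn(2, 40, k)   # source text with fused multivariate features
H, _ = encoder(x)           # H: (2, 40, 2 * hidden)
# H[:, i, :hidden] is the forward state of step i and H[:, i, hidden:]
# the backward state; their concatenation along the last axis is h_i.
print(H.shape)              # torch.Size([2, 40, 512])
```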
In step S13, the decoder, using a unidirectional long short-term memory network combined with an improved attention mechanism, predicts the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment; this specifically comprises steps S131 to S133, as shown in fig. 4:
step S131, calculating the hidden layer state S of the decoder at the time t through the one-way long-short term memory networktThe calculation formula is as follows:
St=LSTM(St-1,yt-1)
wherein s ist-1For the previous moment to hide the layer state, when model training is performed, yt-1Is a word vector of a reference abstract vocabulary in a training set, and y is the word vector of the reference abstract vocabulary when the training set is used for predictiont-1Is a word vector predicted at the last moment; outputting the last encoding output result h of the encoder hidden layermInitializing hidden layer states s at the initial moment of the decoder0Assigning the ending vector of the source text to the initial input sequence y of the decoder0;t∈[1,n]And n is the set length for generating the summary.
Step S132: generate the context vector C_t used for decoding at time t through the improved attention mechanism and the hidden-layer states of the encoder.
Specifically, the context vector C_t at time t is generated from the encoder hidden-layer states h_i and the decoder hidden-layer state s_t at time t:

C_t = \sum_{i=1}^{m} \alpha_i^t h_i

The invention improves the attention mechanism by introducing the unsaturated activation function LeakyReLU to optimize the model, which avoids gradient vanishing during model training and accelerates model convergence. LeakyReLU is defined as:

LeakyReLU(x) = max(\theta x, x)

where \theta is a parameter of the function, \theta ∈ (-∞, 1);

e_i^t = v^T LeakyReLU(W_h h_i + W_s s_t + b_{attn})

\alpha_i^t = \frac{\exp(e_i^t)}{\sum_{k=1}^{m} \exp(e_k^t)}

where v, W_h, W_s, and b_{attn} are all learnable parameters, \exp(·) denotes the exponential function, and e_i^t represents the similarity between the decoder hidden-layer state s_t at time t and the encoder hidden-layer state h_i. The attention distribution \alpha^t can be viewed as a probability distribution over the source vocabulary that tells the decoder where to focus for the next word; the weighted sum of the encoder hidden-layer states with the attention weights is then generated and called the context vector C_t; i ∈ [1, m], t ∈ [1, n].
In natural language processing, sequence-to-sequence models are often used together with an attention mechanism; by determining the relevance between the next word and the source text before the decoder decodes, the attention mechanism improves the global relevance and quality of the generated abstract.
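The modified attention step can be sketched as follows, assuming PyTorch; the class name, dimensions, and the LeakyReLU slope θ = 0.01 are illustrative assumptions (the patent only requires θ ∈ (-∞, 1)):

```python
import torch
import torch.nn as nn

class LeakyReLUAttention(nn.Module):
    """e_i^t = v^T LeakyReLU(W_h h_i + W_s s_t + b_attn);
    alpha^t = softmax(e^t); C_t = sum_i alpha_i^t h_i."""
    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int, theta: float = 0.01):
        super().__init__()
        self.W_h = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_s = nn.Linear(dec_dim, attn_dim, bias=False)
        self.b_attn = nn.Parameter(torch.zeros(attn_dim))
        self.v = nn.Linear(attn_dim, 1, bias=False)
        self.act = nn.LeakyReLU(negative_slope=theta)  # max(theta*x, x)

    def forward(self, h: torch.Tensor, s_t: torch.Tensor):
        # h: (batch, m, enc_dim) encoder states; s_t: (batch, dec_dim)
        e = self.v(self.act(self.W_h(h) + self.W_s(s_t).unsqueeze(1) + self.b_attn))
        alpha = torch.softmax(e.squeeze(-1), dim=-1)       # (batch, m)
        C_t = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)  # weighted sum of h_i
        return C_t, alpha

C_t, alpha = LeakyReLUAttention(512, 256, 256)(torch.randn(2, 40, 512),
                                               torch.randn(2, 256))
print(C_t.shape, alpha.shape)  # torch.Size([2, 512]) torch.Size([2, 40])
```

PyTorch's LeakyReLU computes max(θx, x) for any slope θ < 1, which matches the definition above.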
Step S133: predict the vocabulary through the context vector C_t and the decoder hidden-layer state s_t at time t:

P_vocab = softmax(V'(V[s_t; C_t] + b) + b')

where V', V, b, and b' are learnable parameters, P_vocab is the probability distribution over all words in the dictionary, and softmax(·) denotes the softmax function. The final distribution of the predicted word w is:

P(w) = P_vocab(w).
and step S14, training the model by using the loss function, and automatically generating the abstract of the text through the trained model.
For the target word w_t^* at time t, the loss function is:

loss_t = -\log P(w_t^*)

and the loss of the entire sequence is:

loss = \frac{1}{n} \sum_{t=1}^{n} loss_t

The abstract is then generated automatically by the trained model.
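A minimal sketch of the loss computation, assuming PyTorch and illustrative shapes; in practice the probabilities come from the prediction step above rather than from random logits:

```python
import torch

n, vocab_size = 6, 50000
P = torch.softmax(torch.randn(2, n, vocab_size), dim=-1)  # P_vocab at each step
w_star = torch.randint(0, vocab_size, (2, n))             # reference word ids w_t*
loss_t = -torch.log(P.gather(-1, w_star.unsqueeze(-1)).squeeze(-1))  # -log P(w_t*)
loss = loss_t.mean(dim=-1)   # average over the n steps of the sequence
print(loss)
```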
Generating the abstract repeats the process of steps S131 to S133: each pass generates one word, and the process is repeated until all words have been generated; the generated words are finally combined to form the abstract.
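Putting the steps together, a minimal greedy decoding loop might look as follows, assuming PyTorch; the dot-product attend function is a simplified stand-in for the LeakyReLU attention sketched above, and the end-of-text id, dimensions, and summary length n are illustrative assumptions:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, dim, n = 50000, 128, 256, 30
embedding = nn.Embedding(vocab_size, emb_dim)
cell = nn.LSTMCell(emb_dim, dim)                 # unidirectional decoder step
project = nn.Linear(2 * dim, vocab_size)         # stand-in for V', V, b, b'

def attend(h, s):  # simplified stand-in for the LeakyReLU attention above
    alpha = torch.softmax(torch.bmm(h, s.unsqueeze(-1)).squeeze(-1), dim=-1)
    return torch.bmm(alpha.unsqueeze(1), h).squeeze(1)

h_enc = torch.randn(1, 40, dim)                  # encoder states h_1..h_m
s_t = h_enc[:, -1, :]                            # s_0 initialized from h_m
c_t = torch.zeros(1, dim)                        # LSTM cell state
y = torch.zeros(1, dtype=torch.long)             # end-of-text id as y_0 (assumed 0)
summary = []
for _ in range(n):                               # repeat S131 to S133
    s_t, c_t = cell(embedding(y), (s_t, c_t))    # S131: decoder state s_t
    C_t = attend(h_enc, s_t)                     # S132: context vector C_t
    P_vocab = torch.softmax(project(torch.cat([s_t, C_t], -1)), dim=-1)  # S133
    y = P_vocab.argmax(dim=-1)                   # greedy choice of the next word
    summary.append(y.item())
print(summary)                                   # word ids of the generated abstract
```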
Corresponding to the above method for automatically generating a text abstract fused with multivariate semantics, as shown in fig. 5, the present embodiment further provides an apparatus for automatically generating a text abstract fused with multivariate semantics, which includes a multivariate semantic feature fusion module 51, an encoder hidden layer state calculation module 52, a word vector prediction module 53, and a model training module 54.
The multivariate semantic feature fusion module 51 is configured to fuse multivariate semantic features before the source text is input to the encoder based on a sequence-to-sequence model in combination with multivariate semantic features of natural language processing, so that the source text contains more semantic information.
The encoder hidden-layer state calculation module 52 is configured to input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder, and obtain the hidden-layer states corresponding to the word vectors of the fused text.
The word vector prediction module 53 is configured to predict the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment, using a unidirectional long short-term memory network combined with an improved attention mechanism.
And the model training module 54 is used for training the model by using the loss function, and automatically generating the abstract of the text through the trained model.
It is to be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it is to be noted that: the above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A method for automatically generating a text abstract fusing multivariate semantics, characterized by comprising the following steps:
Step 1: based on a sequence-to-sequence model and combined with the multivariate semantic characteristics of natural language processing, fuse multivariate semantic features into the source text before it is input to the encoder, so that the source text contains more semantic information;
Step 2: input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder, and obtain the hidden-layer states corresponding to the word vectors of the fused text;
Step 3: the decoder, using a unidirectional long short-term memory network combined with an improved attention mechanism, predicts the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment;
Step 4: train the model with the loss function, and automatically generate the abstract of a text through the trained model.
2. The method for automatically generating a text abstract fusing multivariate semantics according to claim 1, characterized in that the fusion of multivariate semantic features in step 1 comprises two rounds of semantic information extraction and two rounds of vector splicing; the specific process is as follows:
the number of convolution kernels in each of the two convolution layers of a convolutional neural network is set equal to the word-vector size k; the size of each kernel in the first convolution layer is set to 3, and the size of each kernel in the second convolution layer is set to 5;
the source text is input to the first convolution layer, which outputs k semantic vectors; these k semantic vectors are spliced for the first time;
the spliced semantic vectors are input, as a new feature matrix, to the second convolution layer, which again outputs k semantic vectors; the new k semantic vectors are spliced a second time, and the spliced semantic vectors are finally input to the encoder.
3. The method for automatically generating a text abstract fusing multivariate semantics according to claim 1, characterized in that the hidden-layer state in step 2 is expressed as:

h_i = [\overrightarrow{h}_i ; \overleftarrow{h}_i]

where h_i is formed by splicing the forward hidden-layer state \overrightarrow{h}_i and the backward hidden-layer state \overleftarrow{h}_i, which are generated by:

\overrightarrow{h}_i = LSTM(x_i, \overrightarrow{h}_{i-1})

\overleftarrow{h}_i = LSTM(x_i, \overleftarrow{h}_{i+1})

where x_i denotes the i-th input word vector, i ∈ [1, m], and m is the number of word vectors of the input source text.
4. The method for automatically generating a text abstract fusing multivariate semantics according to claim 3, characterized in that step 3 specifically comprises the following steps:
Step 3.1: compute the decoder hidden-layer state s_t at time t through a unidirectional long short-term memory network;
Step 3.2: generate the context vector C_t used for decoding at time t through the improved attention mechanism and the hidden-layer states of the encoder;
Step 3.3: predict the vocabulary through the context vector C_t and the decoder hidden-layer state s_t at time t.
5. The method for automatically generating a text abstract fusing multivariate semantics according to claim 4, characterized in that the decoder hidden-layer state s_t at time t in step 3.1 is calculated as:

s_t = LSTM(s_{t-1}, y_{t-1})

where s_{t-1} is the hidden-layer state at the previous moment; during model training, y_{t-1} is the word vector of the reference-abstract word in the training set, while during prediction y_{t-1} is the word vector predicted at the previous moment. The last encoding output h_m of the encoder hidden layer initializes the decoder hidden-layer state s_0 at the initial moment, and the end-of-text vector of the source text is assigned to the decoder's initial input y_0; t ∈ [1, n], where n is the set length of the generated abstract.
6. The method for automatically generating a text abstract fusing multivariate semantics according to claim 5, characterized in that the context vector C_t at time t in step 3.2 is generated from the encoder hidden-layer states h_i and the decoder hidden-layer state s_t at time t, and is calculated as:

C_t = \sum_{i=1}^{m} \alpha_i^t h_i

The unsaturated activation function LeakyReLU is introduced into the attention mechanism to optimize the model, where LeakyReLU is defined as:

LeakyReLU(x) = max(\theta x, x)

where \theta is a parameter of the function, \theta ∈ (-∞, 1);

e_i^t = v^T LeakyReLU(W_h h_i + W_s s_t + b_{attn})

\alpha_i^t = \frac{\exp(e_i^t)}{\sum_{k=1}^{m} \exp(e_k^t)}

where v, W_h, W_s, and b_{attn} are all learnable parameters, \exp(·) denotes the exponential function, e_i^t represents the similarity between the decoder hidden-layer state s_t at time t and the encoder hidden-layer state h_i, and \alpha^t represents the probability distribution over the source vocabulary; i ∈ [1, m], t ∈ [1, n].
7. The method for automatically generating a text abstract fusing multivariate semantics according to claim 6, characterized in that the formula for vocabulary prediction at time t in step 3.3 is:

P_vocab = softmax(V'(V[s_t; C_t] + b) + b')

where V', V, b, and b' are learnable parameters, P_vocab is the probability distribution over all words in the dictionary, and softmax(·) denotes the softmax function. The final distribution of the predicted word w is:

P(w) = P_vocab(w).
8. The method for automatically generating a text abstract fusing multivariate semantics according to claim 7, characterized in that, in step 4, the loss function for the target word w_t^* at time t is:

loss_t = -\log P(w_t^*)

and the loss of the entire sequence is:

loss = \frac{1}{n} \sum_{t=1}^{n} loss_t

The abstract is generated automatically by the trained model.
9. A device for automatically generating a text abstract fusing multivariate semantics, characterized by comprising:
a multivariate semantic feature fusion module, configured to fuse multivariate semantic features into the source text before it is input to the encoder, based on a sequence-to-sequence model combined with the multivariate semantic characteristics of natural language processing, so that the source text contains more semantic information;
an encoder hidden-layer state calculation module, configured to input the source text with the fused multivariate semantic features into a bidirectional long short-term memory network in the encoder and obtain the hidden-layer states corresponding to the word vectors of the fused text;
a word vector prediction module, configured to predict, by the decoder, the word vector to be generated at the next moment from the context vector and the decoder hidden-layer state at the current moment, using a unidirectional long short-term memory network combined with an improved attention mechanism;
and a model training module, configured to train the model with the loss function and automatically generate the abstract of a text through the trained model.
CN202110882867.4A 2021-08-02 2021-08-02 Method and device for automatically generating text abstract fused with multivariate semantics Pending CN113609284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110882867.4A CN113609284A (en) 2021-08-02 2021-08-02 Method and device for automatically generating text abstract fused with multivariate semantics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110882867.4A CN113609284A (en) 2021-08-02 2021-08-02 Method and device for automatically generating text abstract fused with multivariate semantics

Publications (1)

Publication Number Publication Date
CN113609284A true CN113609284A (en) 2021-11-05

Family

ID=78339115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110882867.4A Pending CN113609284A (en) 2021-08-02 2021-08-02 Method and device for automatically generating text abstract fused with multivariate semantics

Country Status (1)

Country Link
CN (1) CN113609284A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114118024A (en) * 2021-12-06 2022-03-01 成都信息工程大学 Conditional text generation method and generation system
CN114118024B (en) * 2021-12-06 2022-06-21 成都信息工程大学 Conditional text generation method and generation system
CN114610871A (en) * 2022-05-12 2022-06-10 北京道达天际科技有限公司 Information system modeling analysis method based on artificial intelligence algorithm
CN114610871B (en) * 2022-05-12 2022-07-08 北京道达天际科技有限公司 Information system modeling analysis method based on artificial intelligence algorithm
CN115865459A (en) * 2022-11-25 2023-03-28 南京信息工程大学 Network flow abnormity detection method and system based on secondary feature extraction
CN115994541A (en) * 2023-03-22 2023-04-21 金蝶软件(中国)有限公司 Interface semantic data generation method, device, computer equipment and storage medium
CN115994541B (en) * 2023-03-22 2023-07-07 金蝶软件(中国)有限公司 Interface semantic data generation method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
Tan et al. Neural machine translation: A review of methods, resources, and tools
CN110119765B (en) Keyword extraction method based on Seq2Seq framework
US11113479B2 (en) Utilizing a gated self-attention memory network model for predicting a candidate answer match to a query
CN110210032B (en) Text processing method and device
CN113609284A (en) Method and device for automatically generating text abstract fused with multivariate semantics
CN110326002B (en) Sequence processing using online attention
CN111859978A (en) Emotion text generation method based on deep learning
CN112199956B (en) Entity emotion analysis method based on deep representation learning
Xie et al. Attention-based dense LSTM for speech emotion recognition
CN110991290B (en) Video description method based on semantic guidance and memory mechanism
US11475225B2 (en) Method, system, electronic device and storage medium for clarification question generation
CN111078866A (en) Chinese text abstract generation method based on sequence-to-sequence model
CN113127631A (en) Text summarization method based on multi-head self-attention mechanism and pointer network
CN111581970B (en) Text recognition method, device and storage medium for network context
CN113435211A (en) Text implicit emotion analysis method combined with external knowledge
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN114168754A (en) Relation extraction method based on syntactic dependency and fusion information
US20220383119A1 (en) Granular neural network architecture search over low-level primitives
CN114281982B (en) Book propaganda abstract generation method and system adopting multi-mode fusion technology
CN110913229B (en) RNN-based decoder hidden state determination method, device and storage medium
CN114387537A (en) Video question-answering method based on description text
Morioka et al. Multiscale recurrent neural network based language model.
CN114218928A (en) Abstract text summarization method based on graph knowledge and theme perception
Parmar et al. Abstractive text summarization using artificial intelligence
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination