CN109726383B - Article semantic vector representation method and system - Google Patents

Article semantic vector representation method and system

Info

Publication number
CN109726383B
Authority
CN
China
Prior art keywords
sentence
vector
sentence vector
semantic
article
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711024027.4A
Other languages
Chinese (zh)
Other versions
CN109726383A (en)
Inventor
王宁君
张春荣
赵琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Potevio Information Technology Co Ltd
Original Assignee
Potevio Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Potevio Information Technology Co Ltd filed Critical Potevio Information Technology Co Ltd
Priority to CN201711024027.4A priority Critical patent/CN109726383B/en
Publication of CN109726383A publication Critical patent/CN109726383A/en
Application granted granted Critical
Publication of CN109726383B publication Critical patent/CN109726383B/en
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides an article semantic vector representation method and system. The representation method comprises the following steps: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity of the attention sentence vectors corresponding to that sentence vector; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article. The invention greatly reduces the total amount of computation in the paragraph semantic extraction stage and solves the problem that conventional word-vector-based semantic vector extraction cannot represent paragraph semantics.

Description

Article semantic vector representation method and system
Technical Field
The invention relates to the field of article semantic analysis, in particular to an article semantic vector representation method and an article semantic vector representation system.
Background
Vector representation of article semantics plays an important role in many fields related to natural language processing, such as extraction of text center ideas, semantic analysis of text, text classification, dialogue systems, and research in machine translation. However, the semantic representation of the article in the prior art adopts a word vector-based method, and the semantic representation of the paragraph is calculated on the basis of the word vector.
Fig. 1 is a schematic diagram of a prior-art word-vector-based article semantic vector representation method. Referring to fig. 1, this method builds semantic representations on word vectors: the word vectors are passed through a Long Short-Term Memory (LSTM) network, whose output directly serves as the semantic vector of the sentence or article. The text is first segmented into words, the words are vectorized to obtain word vectors, and the word vectors are input into the LSTM model in the time order of the sentence. The final output of the model is the final semantic vector of the sentence.
Here $\{x_1, x_2, \ldots, x_n\}$ are the word vectors obtained through Word2vec. In theory, the final output of the model contains all information that should be retained from the sentence, so that output can be used as the semantic vector representation of the sentence; in practice, however, this way of extracting the semantic vector greatly limits the quality of the text semantic representation.
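As a minimal illustration of this baseline (a sketch only, assuming PyTorch and arbitrary dimensions; the patent does not prescribe an implementation), the word vectors of one sentence are fed through an LSTM and the final hidden state is taken as the sentence's semantic vector:

```python
import torch
import torch.nn as nn

embed_dim, hidden_dim = 100, 128                 # assumed sizes
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

# x: word vectors {x_1, ..., x_n} of one sentence, e.g. from Word2vec
x = torch.randn(1, 12, embed_dim)                # (batch=1, n=12 words, embed_dim)
outputs, (h_n, c_n) = lstm(x)

sentence_semantic_vector = h_n[-1]               # final hidden state as the sentence semantics
```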
For this word-vector-based sentence semantic representation method, take the sequence-to-sequence model in neural machine translation (NMT) as an example: word vectors are input into the encoder one by one until the last word vector of the sentence has been input, at which point the encoder outputs the semantic vector of the whole sentence. NMT is characterized by taking the input information of every previous step into account, so in theory this semantic vector can contain the information of the whole sentence. In actual operation, however, as the word sequence keeps growing, and especially once the amount of text reaches the paragraph level, the following problems appear: as the sequence is continuously input, the semantic information can no longer be memorized and no longer represents the whole sequence; the influence factors of the individual words are all similar, so the key points in the text cannot be highlighted; and it is difficult to extract semantic vector representations of paragraph-level text.
Another representation method in the prior art, and the most representative one, obtains a semantic vector representation through a word-vector-based attention model. Attention refers to the way the human brain, when receiving or processing external information, skillfully shifts its focus, selectively ignores content of little relevance to itself, and amplifies the information it needs. By shifting the focus, the brain's receiving sensitivity and processing speed for the attended information are greatly enhanced, irrelevant information is effectively filtered out, and closely related information is highlighted. In short, the basic idea of the attention mechanism is not to treat every location of the entire scene equally at every moment, but to focus on specific locations according to need; once the extraction rules are determined, machine learning or deep neural network learning is used to learn where attention should be focused in the future.
For text semantic vector representation, the attention mechanism was first used in NMT; neural machine translation is a typical sequence-to-sequence model consisting of an encoder and a decoder.
Fig. 2 is a schematic diagram of a prior-art article semantic vector representation method based on a word-vector attention model. As shown in fig. 2, the existing word-vector attention model obtains a semantic vector representation as follows: a recurrent neural network (RNN) encodes the words of the source text in time order, each encoded output is multiplied by its corresponding attention degree, and the products are summed to yield an intermediate semantic vector of fixed dimension. The semantic vector is expressed as:
$$c_i = \sum_{j=1}^{T_x} a_{ij} h_j, \qquad a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad e_{ij} = v_a^{\top} \tanh\!\left(W_a s_{i-1} + U_a h_j\right)$$

where $c_i$ is a semantic vector, $T_s = b$ is the number of sentences of the article, $h_j$ is the output of the $j$th word vector through the LSTM, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, $T_x = i$, $v_a$ is a learned projection vector, $W_a$ is the weight matrix of $s_{i-1}$, $s_{i-1}$ is the hidden state of the decoder at moment $i-1$, $U_a$ is the weight matrix of $h_j$, $\tanh(\cdot)$ is an activation function, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
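The scoring function $e_{ij}$ above can be sketched as follows (an illustration only; the dimensions and random initialization are assumptions, and in practice $v_a$, $W_a$, $U_a$ are learned parameters):

```python
import torch

hidden_dim = 128                                  # assumed size
W_a = torch.randn(hidden_dim, hidden_dim)         # weight matrix of s_{i-1}
U_a = torch.randn(hidden_dim, hidden_dim)         # weight matrix of h_j
v_a = torch.randn(hidden_dim)                     # learned projection vector

def additive_score(s_prev, h_j):
    """e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j) for one state/output pair."""
    return v_a @ torch.tanh(W_a @ s_prev + U_a @ h_j)
```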
In this attention-model-based semantic vector representation method, the introduction of the attention mechanism means that, after training on data, each word obtains its own corresponding attention degree, and the weighted semantic representation of the sentence thus extracts the key points within the sentence.
However, this method simply adds the semantic representations together directly; it has difficulty representing semantic vectors for long text, such as in article translation or article summarization, and because it is based on word vectors it does not represent the overall semantics of an article or paragraph well.
Disclosure of Invention
The present invention provides a method and system for semantic vector representation of articles that overcomes the above-described problems.
According to one aspect of the present invention, there is provided an article semantic vector representation method, including: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Preferably, step S1 further comprises: summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence.
Preferably, step S2 further comprises: inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Preferably, in step S3, the attention degree of the corresponding attention sentence vectors to any sentence vector is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Preferably, step S3 further comprises: acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
Preferably, the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Preferably, the weight matrix is obtained by a back propagation algorithm.
According to another aspect of the present invention, there is provided an article semantic vector representation system, comprising: a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article; a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory (Bi-LSTM) network model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Preferably, the sentence processing module is further configured to: input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; and integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Preferably, the sentence semantic vector acquisition module is further configured to: acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method and system provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that prior-art word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article.
Drawings
FIG. 1 is a schematic diagram of a method for representing semantic vectors of articles based on word vectors in the prior art;
FIG. 2 is a schematic diagram of a method for semantic vector representation of articles based on a word vector attention model in the prior art;
FIG. 3 is a flowchart of an article semantic vector representation method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a first output acquisition in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the acquisition of article semantic vectors in an embodiment of the present invention;
FIG. 6 is a block diagram of an article semantic vector representation system in accordance with an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
FIG. 3 is a flowchart of an article semantic vector representation method according to an embodiment of the present invention. As shown in FIG. 3, the method includes: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Specifically, all word vectors of any sentence in the article are preferably obtained through Word2vec.
Further, to handle sequence data of varying length, those skilled in the art devised the recurrent neural network (RNN) for sequence problems. However, the conventional RNN suffers from two problems: on the one hand, long-distance dependencies; on the other hand, gradient vanishing and gradient explosion, which are especially evident when dealing with long sequences.
To solve the above problems, those skilled in the art proposed the Long Short-Term Memory (LSTM) model, an RNN architecture specifically designed to overcome the gradient vanishing and gradient explosion problems of the RNN model. The activation state of its memory block is controlled by three multiplicative gates: the input gate, the output gate, and the forget gate. This structure allows previously input information to be stored in the network and propagated forward continuously: when the input gate opens, new input can change the history state stored in the network; when the output gate opens, the stored history state can be accessed and affect subsequent outputs; and the forget gate clears the history information stored before.
Since the inputs of the LSTM model are unidirectional, future context information is ignored. The basic idea of the bidirectional long short-term memory network is to train one forward LSTM and one backward LSTM on each training sequence, and then linearly combine the outputs of the two models, so that every node in the sequence can draw on the complete context information.
Specifically, in step S2, "at least one sentence vector consisting of any sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article" means the following: if the sentence vector is the first sentence vector of the article, its attention sentence vectors consist of that first sentence vector alone; if it is not the first sentence vector, its attention sentence vectors are the several sentence vectors from the first sentence vector up to that sentence vector in the forward order.
It should be noted that, in the embodiments of the present invention, the first sentence vector is the first sentence vector in the forward order, and the last sentence vector is the last sentence vector in the forward order. "A plurality of" in the embodiments of the present invention means two or more.
According to the article semantic vector representation method provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that conventional word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article.
Based on the above embodiment, step S1 further includes: summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence.
In particular, sentence vectors are derived from unlabeled data and can therefore be applied where labeled data is scarce; on the basis of sentence vectors, each paragraph or even each document can be mapped to a unique vector.
Further, word vectors are the basic constituent units of a sentence vector. Within a sentence, all word vectors have the same dimension. Summing the same dimensions of all word vectors of any sentence in the article yields the sentence vector of that sentence, whose dimension is the same as that of the sentence's word vectors.
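A minimal sketch of this summation, assuming 100-dimensional word vectors and using PyTorch for illustration (the patent does not fix a framework):

```python
import torch

word_vectors = torch.randn(12, 100)          # 12 words of one sentence, 100-dim vectors
sentence_vector = word_vectors.sum(dim=0)    # add the same dimensions together
assert sentence_vector.shape == (100,)       # dimension matches each word vector
```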
After the sentence vectors are obtained, the first output quantity and the second output quantity of the above embodiment are further explained. It should be noted that, in the embodiments of the present invention, the first output quantity and the second output quantity are both outputs of the Bi-LSTM model and have the same nature.
Fig. 4 is a schematic diagram of acquiring the first output quantity in an embodiment of the present invention. Referring to fig. 4, step S2 in the above embodiment further includes: inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Specifically, any sentence vector corresponds to a first output quantity.
Further, the bidirectional long short-term memory network model (Bi-LSTM) performs both forward encoding and backward encoding. The process above of inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model and acquiring the semantic information from the first sentence vector to any sentence vector, and from the first sentence vector to the attention sentence vectors, is forward encoding. Correspondingly, inputting the sentence vectors arranged in the reverse sentence order and acquiring the semantic information from the last sentence vector to any sentence vector, and from the last sentence vector to the attention sentence vectors, is backward encoding. The sentence vector is the basic input unit of the Bi-LSTM model.
Further, "from the first sentence vector to any sentence vector" means the sentence vectors of the article taken in forward order, from the first sentence vector up to that sentence vector. Correspondingly, "from the last sentence vector to any sentence vector" means the sentence vectors taken in reverse of the forward order, from the last sentence vector back to that sentence vector.
Further, "from the first sentence vector to the attention sentence vectors" refers to each attention sentence vector encountered from the first sentence vector to the given sentence vector; "from the last sentence vector to the attention sentence vectors" refers to each attention sentence vector encountered from the last sentence vector to the given sentence vector.
Further, for any sentence vector, the forward-encoded and backward-encoded semantic information of the Bi-LSTM is integrated, so the resulting first output quantity of that sentence vector contains the semantic information of both the sentence vectors before it and the sentence vectors after it.
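A sketch of this encoding step, assuming PyTorch (whose bidirectional LSTM runs the forward and reverse passes and concatenates their outputs, which corresponds to the integration described above):

```python
import torch
import torch.nn as nn

sent_dim, hidden_dim, b = 100, 128, 8        # b = number of sentence vectors (assumed)
bilstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True, bidirectional=True)

sentence_vectors = torch.randn(1, b, sent_dim)   # sentence vectors in forward order
h, _ = bilstm(sentence_vectors)                  # h: (1, b, 2*hidden_dim)
# h[0, j] integrates forward (first-to-jth) and backward (last-to-jth) semantic
# information, i.e. the first output quantity of the jth sentence vector
```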
It should also be noted that $V_D$ can, to a certain extent, also represent the semantic vector of the article, but problems arise when the number of sentences in a paragraph is too large: the semantic information can no longer be memorized and no longer represents the whole article sequence; and since the influence factors of all sentences are similar, the key points of the article cannot be highlighted and the central idea is learned inaccurately. For this reason, the invention further proposes step S3.
The above embodiment has described acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; this embodiment explains how the attention degree of the corresponding attention sentence vectors to any sentence vector is obtained.
The attention degree of the corresponding attention sentence vectors to any sentence vector in step S3 of the above embodiment is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Specifically, the attention degree is the degree of attention paid, and can be regarded as the degree to which one sentence is influenced by another.
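A sketch of this normalization (the subtraction of the maximum score is an added numerical-stability step, not part of the formula):

```python
import torch

def attention_degrees(e_i):
    """e_i: scores [e_i1, ..., e_ii] over sentence i's attention sentence vectors."""
    exp_e = torch.exp(e_i - e_i.max())       # stabilized exp(e_ij)
    return exp_e / exp_e.sum()               # a_ij = exp(e_ij) / sum_k exp(e_ik)
```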
According to the article semantic vector representation method of the invention, computing the attention degree of the attention sentence vectors to any sentence vector reveals how strongly the sentences of the article interact, which makes the acquired article semantic vector more accurate.
Step S3 is further described below based on the above embodiment.
Specifically, step S3 further includes: acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
Further, $h_j$ and $V_D$ in the embodiments of the present invention both refer to the first output quantity.
According to the article semantic vector representation method of the invention, acquiring the semantic vector of the sentence corresponding to any sentence vector from the second output quantity and the attention degrees of the corresponding attention sentence vectors effectively distinguishes the contribution degree of each sentence.
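A sketch of this weighted sum, under the same illustrative assumptions as the previous sketches:

```python
import torch

def sentence_semantic_vector(a_i, h):
    """a_i: attention degrees [a_i1, ..., a_ii]; h: first output quantities, shape (i, dim).

    Returns c_i = sum_j a_ij * h_j over the attention sentence vectors of sentence i.
    """
    return (a_i.unsqueeze(1) * h).sum(dim=0)
```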
Based on the above embodiment, the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method of the invention, using a bilinear function instead of the feed-forward excitation function of the word-vector attention model better improves the performance of the neural network and yields better results when applied to a paragraph central-idea extraction network.
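A sketch of the bilinear score (the dimension is an assumption; W is initialized randomly here and learned during training):

```python
import torch

hdim = 256                                         # assumed: 2 * hidden_dim after the Bi-LSTM
W = torch.randn(hdim, hdim, requires_grad=True)    # weight matrix, learned by backpropagation

def bilinear_score(c_prev, h_j):
    """e_ij = c_{i-1} W h_j for the previous sentence semantic vector and jth first output."""
    return c_prev @ W @ h_j
```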
Based on the above embodiment, the weight matrix is obtained by a back propagation algorithm.
Specifically, the back-propagation (BP) algorithm is a supervised learning algorithm often used to train multi-layer perceptrons. The BP algorithm is a generalization of the delta rule and requires that the function used by each artificial neuron be differentiable. The BP algorithm is particularly suitable for training feed-forward neural networks.
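A sketch of one back-propagation step for W (the patent does not fix a training objective; the squared-error loss and the supervision signal below are purely illustrative assumptions):

```python
import torch

hdim = 256
W = torch.randn(hdim, hdim, requires_grad=True)
optimizer = torch.optim.SGD([W], lr=0.01)

c_prev, h_j = torch.randn(hdim), torch.randn(hdim)
target = torch.tensor(1.0)                   # hypothetical supervision signal
loss = (c_prev @ W @ h_j - target) ** 2      # illustrative squared-error loss
loss.backward()                              # gradient of the loss w.r.t. W
optimizer.step()
optimizer.zero_grad()
```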
As a preferred embodiment, fig. 5 is a schematic diagram of acquiring the article semantic vector according to an embodiment of the present invention. Referring to fig. 5, acquiring the article semantic vector may specifically include the following steps:
First, all word vectors of any sentence in the article are summed dimension by dimension to obtain the sentence vector of that sentence.
Secondly, the sentence vectors arranged in the forward sentence order of the article are input into the Bi-LSTM model, and the semantic information from the first sentence vector to any sentence vector and from the first sentence vector to the attention sentence vectors is acquired; the sentence vectors arranged in the reverse sentence order are input into the Bi-LSTM model, and the semantic information from the last sentence vector to any sentence vector and from the last sentence vector to the attention sentence vectors is acquired; the semantic information from the first sentence vector to any sentence vector is integrated with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and the semantic information from the first sentence vector to the attention sentence vectors is integrated with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Then, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector is obtained through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
In the above, $a_{ij}$ is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Finally, the semantic vectors of the sentences corresponding to all sentence vectors in the article are acquired, and the semantic vector of the article is acquired from them.
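The whole procedure can be tied together in one sketch (an illustration only: the PyTorch modules, the dimensions, the zero initialization of $c_0$, and the final summation over sentence semantic vectors in the last step are all assumptions consistent with, but not prescribed by, the formulas above):

```python
import torch
import torch.nn as nn

sent_dim, hidden_dim = 100, 128
bilstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True, bidirectional=True)
W = torch.randn(2 * hidden_dim, 2 * hidden_dim)    # bilinear weight matrix (learned in practice)

def article_semantic_vector(word_vectors_per_sentence):
    # Step one: sentence vectors by dimension-wise summation of word vectors
    s = torch.stack([wv.sum(dim=0) for wv in word_vectors_per_sentence])
    # Step two: first output quantities h_j from the bidirectional LSTM
    h, _ = bilstm(s.unsqueeze(0))
    h = h.squeeze(0)                                # (b, 2*hidden_dim)
    # Step three: sentence semantic vectors c_i via bilinear attention over j <= i
    c_prev = torch.zeros(2 * hidden_dim)            # assumed initial c_0
    cs = []
    for i in range(h.size(0)):
        e = torch.stack([c_prev @ W @ h[j] for j in range(i + 1)])  # e_ij = c_{i-1} W h_j
        a = torch.softmax(e, dim=0)                 # attention degrees a_ij
        c_prev = (a.unsqueeze(1) * h[: i + 1]).sum(dim=0)           # c_i
        cs.append(c_prev)
    # Step four: article semantic vector from all sentence semantic vectors
    return torch.stack(cs).sum(dim=0)               # summation here is an assumption

article = [torch.randn(n, sent_dim) for n in (9, 14, 7)]   # a toy three-sentence article
v = article_semantic_vector(article)
```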
Based on the above embodiments, the present invention further provides an article semantic vector representation system for implementing the article semantic vector representation method of any of the above embodiments. Fig. 6 is a block diagram of an article semantic vector representation system in an embodiment of the present invention. As shown in fig. 6, the system includes: a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article; a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into the Bi-LSTM model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Based on the above embodiment, the sentence processing module is further configured to: input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; and integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Based on the above embodiment, the sentence semantic vector acquisition module is further configured to: acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method and system provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that prior-art word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article. Computing the attention degree of the attention sentence vectors to any sentence vector reveals how strongly the sentences of the article interact, making the acquired article semantic vector more accurate. Acquiring the semantic vector of the sentence corresponding to any sentence vector from the second output quantity and the corresponding attention degrees effectively distinguishes the contribution degree of each sentence. And using a bilinear function, rather than the feed-forward excitation function of the word-vector attention model, better improves the performance of the neural network and yields good results when applied to a paragraph central-idea extraction network.
Finally, the above embodiments are merely preferred embodiments and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (4)

1. An article semantic vector representation method, comprising:
S1, acquiring the sentence vector of any sentence according to all word vectors of that sentence in the article;
S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are:
at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article;
S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector;
S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article;
wherein step S1 further comprises:
summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence;
step S2 further comprises:
inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors;
inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors;
integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors;
the attention degree of the corresponding attention degree sentence vector to any sentence vector in the step S3 is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set;
wherein step S3 further comprises:
acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
2. The method of claim 1, wherein the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
3. The method according to claim 2, wherein the weight matrix is obtained by a back-propagation algorithm.
4. An article semantic vector representation system, comprising:
a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article;
a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory (Bi-LSTM) network model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are:
at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article;
a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and
an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article;
wherein the system is specifically configured to:
sum all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence;
input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors;
input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors;
integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors;
acquire the attention degree of the corresponding attention sentence vectors to any sentence vector through the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set;
and acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
CN201711024027.4A 2017-10-27 2017-10-27 Article semantic vector representation method and system Active CN109726383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711024027.4A CN109726383B (en) 2017-10-27 2017-10-27 Article semantic vector representation method and system

Publications (2)

Publication Number Publication Date
CN109726383A CN109726383A (en) 2019-05-07
CN109726383B true CN109726383B (en) 2023-06-23

Family

ID=66290802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711024027.4A Active CN109726383B (en) 2017-10-27 2017-10-27 Article semantic vector representation method and system

Country Status (1)

Country Link
CN (1) CN109726383B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784597A (en) * 2019-11-06 2021-05-11 阿里巴巴集团控股有限公司 Method and device for evaluating quality of article
CN113761182A (en) * 2020-06-17 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for determining service problem

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007140602A (en) * 2005-11-14 2007-06-07 Nippon Telegr & Teleph Corp <Ntt> Topic extraction method, device and program, and computer readable recording medium
JP2009086858A (en) * 2007-09-28 2009-04-23 Nippon Telegr & Teleph Corp <Ntt> Content-retrieving device, content-retrieving method, program, and recording medium
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dialogue generation via deep reinforcement learning based on hierarchical encoding; Zhao Yuqing et al.; Journal of Computer Applications (计算机应用); 2017-10-10 (No. 10); pp. 85-90 *

Also Published As

Publication number Publication date
CN109726383A (en) 2019-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant