CN109726383B - Article semantic vector representation method and system - Google Patents

Article semantic vector representation method and system

Info

Publication number
CN109726383B
Authority
CN
China
Prior art keywords
sentence
vector
sentence vector
semantic
article
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711024027.4A
Other languages
Chinese (zh)
Other versions
CN109726383A (en)
Inventor
王宁君
张春荣
赵琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Potevio Information Technology Co Ltd
Original Assignee
Potevio Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Potevio Information Technology Co Ltd filed Critical Potevio Information Technology Co Ltd
Priority to CN201711024027.4A priority Critical patent/CN109726383B/en
Publication of CN109726383A publication Critical patent/CN109726383A/en
Application granted granted Critical
Publication of CN109726383B publication Critical patent/CN109726383B/en
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides an article semantic vector representation method and system. The representation method comprises the following steps: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity of the attention sentence vectors corresponding to that sentence vector; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article. The invention greatly reduces the total amount of computation in the paragraph semantic extraction stage and solves the problem that conventional word-vector-based semantic vector extraction cannot represent paragraph semantics.

Description

Article semantic vector representation method and system
Technical Field
The invention relates to the field of article semantic analysis, in particular to an article semantic vector representation method and an article semantic vector representation system.
Background
Vector representation of article semantics plays an important role in many fields related to natural language processing, such as extraction of text center ideas, semantic analysis of text, text classification, dialogue systems, and research in machine translation. However, the semantic representation of the article in the prior art adopts a word vector-based method, and the semantic representation of the paragraph is calculated on the basis of the word vector.
Fig. 1 is a schematic diagram of a prior-art word-vector-based article semantic vector representation method. Referring to fig. 1, this method builds semantic representations on word vectors: the word vectors are passed through a Long Short-Term Memory (LSTM) network, whose output directly serves as the semantic vector of the sentence or article. The text is first segmented into words, the words are vectorized to obtain word vectors, and the word vectors are input into the LSTM model in the time order of the sentence. The final output of the model is the final semantic vector of the sentence.
Here $\{x_1, x_2, \ldots, x_n\}$ are the word vectors obtained through Word2vec. In theory, the final output of the model contains all information that should be retained from the sentence, so that output can be used as the semantic vector representation of the sentence; in practice, however, this way of extracting the semantic vector greatly limits the quality of the text semantic representation.
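As a minimal illustration of this baseline (a sketch only, assuming PyTorch and arbitrary dimensions; the patent does not prescribe an implementation), the word vectors of one sentence are fed through an LSTM and the final hidden state is taken as the sentence's semantic vector:

```python
import torch
import torch.nn as nn

embed_dim, hidden_dim = 100, 128                 # assumed sizes
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

# x: word vectors {x_1, ..., x_n} of one sentence, e.g. from Word2vec
x = torch.randn(1, 12, embed_dim)                # (batch=1, n=12 words, embed_dim)
outputs, (h_n, c_n) = lstm(x)

sentence_semantic_vector = h_n[-1]               # final hidden state as the sentence semantics
```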
For this word-vector-based sentence semantic representation method, take the sequence-to-sequence model in neural machine translation (NMT) as an example: word vectors are input into the encoder one by one until the last word vector of the sentence has been input, at which point the encoder outputs the semantic vector of the whole sentence. NMT is characterized by taking the input information of every previous step into account, so in theory this semantic vector can contain the information of the whole sentence. In actual operation, however, as the word sequence keeps growing, and especially once the amount of text reaches the paragraph level, the following problems appear: as the sequence is continuously input, the semantic information can no longer be memorized and no longer represents the whole sequence; the influence factors of the individual words are all similar, so the key points in the text cannot be highlighted; and it is difficult to extract semantic vector representations of paragraph-level text.
Another representation method in the prior art, and the most representative one, obtains a semantic vector representation through a word-vector-based attention model. Attention refers to the way the human brain, when receiving or processing external information, skillfully shifts its focus, selectively ignores content of little relevance to itself, and amplifies the information it needs. By shifting the focus, the brain's receiving sensitivity and processing speed for the attended information are greatly enhanced, irrelevant information is effectively filtered out, and closely related information is highlighted. In short, the basic idea of the attention mechanism is not to treat every location of the entire scene equally at every moment, but to focus on specific locations according to need; once the extraction rules are determined, machine learning or deep neural network learning is used to learn where attention should be focused in the future.
For text semantic vector representation, the attention mechanism was first used in NMT; neural machine translation is a typical sequence-to-sequence model consisting of an encoder and a decoder.
Fig. 2 is a schematic diagram of a prior-art article semantic vector representation method based on a word-vector attention model. As shown in fig. 2, the existing word-vector attention model obtains a semantic vector representation as follows: a recurrent neural network (RNN) encodes the words of the source text in time order, each encoded output is multiplied by its corresponding attention degree, and the products are summed to yield an intermediate semantic vector of fixed dimension. The semantic vector is expressed as:
$$c_i = \sum_{j=1}^{T_x} a_{ij} h_j, \qquad a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad e_{ij} = v_a^{\top} \tanh\!\left(W_a s_{i-1} + U_a h_j\right)$$

where $c_i$ is a semantic vector, $T_s = b$ is the number of sentences of the article, $h_j$ is the output of the $j$th word vector through the LSTM, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, $T_x = i$, $v_a$ is a learned projection vector, $W_a$ is the weight matrix of $s_{i-1}$, $s_{i-1}$ is the hidden state of the decoder at moment $i-1$, $U_a$ is the weight matrix of $h_j$, $\tanh(\cdot)$ is an activation function, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
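The scoring function $e_{ij}$ above can be sketched as follows (an illustration only; the dimensions and random initialization are assumptions, and in practice $v_a$, $W_a$, $U_a$ are learned parameters):

```python
import torch

hidden_dim = 128                                  # assumed size
W_a = torch.randn(hidden_dim, hidden_dim)         # weight matrix of s_{i-1}
U_a = torch.randn(hidden_dim, hidden_dim)         # weight matrix of h_j
v_a = torch.randn(hidden_dim)                     # learned projection vector

def additive_score(s_prev, h_j):
    """e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j) for one state/output pair."""
    return v_a @ torch.tanh(W_a @ s_prev + U_a @ h_j)
```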
In this attention-model-based semantic vector representation method, the introduction of the attention mechanism means that, after training on data, each word obtains its own corresponding attention degree, and the weighted semantic representation of the sentence thus extracts the key points within the sentence.
However, this method simply adds the semantic representations together directly; it has difficulty representing semantic vectors for long text, such as in article translation or article summarization, and because it is based on word vectors it does not represent the overall semantics of an article or paragraph well.
Disclosure of Invention
The present invention provides a method and system for semantic vector representation of articles that overcomes the above-described problems.
According to one aspect of the present invention, there is provided an article semantic vector representation method, including: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Preferably, step S1 further comprises: summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence.
Preferably, step S2 further comprises: inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Preferably, in step S3, the attention degree of the corresponding attention sentence vectors to any sentence vector is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Preferably, step S3 further comprises: acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
Preferably, the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Preferably, the weight matrix is obtained by a back propagation algorithm.
According to another aspect of the present invention, there is provided an article semantic vector representation system, comprising: a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article; a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory (Bi-LSTM) network model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Preferably, the sentence processing module is further configured to: input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; and integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Preferably, the sentence semantic vector acquisition module is further configured to: acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method and system provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that prior-art word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article.
Drawings
FIG. 1 is a schematic diagram of a method for representing semantic vectors of articles based on word vectors in the prior art;
FIG. 2 is a schematic diagram of a method for semantic vector representation of articles based on a word vector attention model in the prior art;
FIG. 3 is a flowchart of an article semantic vector representation method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a first output acquisition in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the acquisition of article semantic vectors in an embodiment of the present invention;
FIG. 6 is a block diagram of an article semantic vector representation system in accordance with an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
FIG. 3 is a flowchart of an article semantic vector representation method according to an embodiment of the present invention. As shown in FIG. 3, the method includes: S1, acquiring the sentence vector of any sentence in the article according to all word vectors of that sentence; S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Specifically, all word vectors of any sentence in the article are preferably obtained through Word2vec.
Further, to handle sequence data of varying length, those skilled in the art devised the recurrent neural network (RNN) for sequence problems. However, the conventional RNN suffers from two problems: on the one hand, long-distance dependencies; on the other hand, gradient vanishing and gradient explosion, which are especially evident when dealing with long sequences.
To solve the above problems, those skilled in the art proposed the Long Short-Term Memory (LSTM) model, an RNN architecture specifically designed to overcome the gradient vanishing and gradient explosion problems of the RNN model. The activation state of its memory block is controlled by three multiplicative gates: the input gate, the output gate, and the forget gate. This structure allows previously input information to be stored in the network and propagated forward continuously: when the input gate opens, new input can change the history state stored in the network; when the output gate opens, the stored history state can be accessed and affect subsequent outputs; and the forget gate clears the history information stored before.
Since the inputs of the LSTM model are unidirectional, future context information is ignored. The basic idea of the bidirectional long short-term memory network is to train one forward LSTM and one backward LSTM on each training sequence, and then linearly combine the outputs of the two models, so that every node in the sequence can draw on the complete context information.
Specifically, in step S2, "at least one sentence vector consisting of any sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article" means the following: if the sentence vector is the first sentence vector of the article, its attention sentence vectors consist of that first sentence vector alone; if it is not the first sentence vector, its attention sentence vectors are the several sentence vectors from the first sentence vector up to that sentence vector in the forward order.
It should be noted that, in the embodiments of the present invention, the first sentence vector is the first sentence vector in the forward order, and the last sentence vector is the last sentence vector in the forward order. "A plurality of" in the embodiments of the present invention means two or more.
According to the article semantic vector representation method provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that conventional word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article.
Based on the above embodiment, step S1 further includes: summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence.
In particular, sentence vectors are derived from unlabeled data and can therefore be applied where labeled data is scarce; on the basis of sentence vectors, each paragraph or even each document can be mapped to a unique vector.
Further, word vectors are the basic constituent units of a sentence vector. Within a sentence, all word vectors have the same dimension. Summing the same dimensions of all word vectors of any sentence in the article yields the sentence vector of that sentence, whose dimension is the same as that of the sentence's word vectors.
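A minimal sketch of this summation, assuming 100-dimensional word vectors and using PyTorch for illustration (the patent does not fix a framework):

```python
import torch

word_vectors = torch.randn(12, 100)          # 12 words of one sentence, 100-dim vectors
sentence_vector = word_vectors.sum(dim=0)    # add the same dimensions together
assert sentence_vector.shape == (100,)       # dimension matches each word vector
```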
After the sentence vectors are obtained, the first output quantity and the second output quantity of the above embodiment are further explained. It should be noted that, in the embodiments of the present invention, the first output quantity and the second output quantity are both outputs of the Bi-LSTM model and have the same nature.
Fig. 4 is a schematic diagram of acquiring the first output quantity in an embodiment of the present invention. Referring to fig. 4, step S2 in the above embodiment further includes: inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Specifically, any sentence vector corresponds to a first output quantity.
Further, the bidirectional long short-term memory network model (Bi-LSTM) performs both forward encoding and backward encoding. The process above of inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model and acquiring the semantic information from the first sentence vector to any sentence vector, and from the first sentence vector to the attention sentence vectors, is forward encoding. Correspondingly, inputting the sentence vectors arranged in the reverse sentence order and acquiring the semantic information from the last sentence vector to any sentence vector, and from the last sentence vector to the attention sentence vectors, is backward encoding. The sentence vector is the basic input unit of the Bi-LSTM model.
Further, "from the first sentence vector to any sentence vector" means the sentence vectors of the article taken in forward order, from the first sentence vector up to that sentence vector. Correspondingly, "from the last sentence vector to any sentence vector" means the sentence vectors taken in reverse of the forward order, from the last sentence vector back to that sentence vector.
Further, "from the first sentence vector to the attention sentence vectors" refers to each attention sentence vector encountered from the first sentence vector to the given sentence vector; "from the last sentence vector to the attention sentence vectors" refers to each attention sentence vector encountered from the last sentence vector to the given sentence vector.
Further, for any sentence vector, the forward-encoded and backward-encoded semantic information of the Bi-LSTM is integrated, so the resulting first output quantity of that sentence vector contains the semantic information of both the sentence vectors before it and the sentence vectors after it.
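A sketch of this encoding step, assuming PyTorch (whose bidirectional LSTM runs the forward and reverse passes and concatenates their outputs, which corresponds to the integration described above):

```python
import torch
import torch.nn as nn

sent_dim, hidden_dim, b = 100, 128, 8        # b = number of sentence vectors (assumed)
bilstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True, bidirectional=True)

sentence_vectors = torch.randn(1, b, sent_dim)   # sentence vectors in forward order
h, _ = bilstm(sentence_vectors)                  # h: (1, b, 2*hidden_dim)
# h[0, j] integrates forward (first-to-jth) and backward (last-to-jth) semantic
# information, i.e. the first output quantity of the jth sentence vector
```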
It should also be noted that $V_D$ can, to a certain extent, also represent the semantic vector of the article, but problems arise when the number of sentences in a paragraph is too large: the semantic information can no longer be memorized and no longer represents the whole article sequence; and since the influence factors of all sentences are similar, the key points of the article cannot be highlighted and the central idea is learned inaccurately. For this reason, the invention further proposes step S3.
The above embodiment has described acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; this embodiment explains how the attention degree of the corresponding attention sentence vectors to any sentence vector is obtained.
The attention degree of the corresponding attention sentence vectors to any sentence vector in step S3 of the above embodiment is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Specifically, the attention degree is the degree of attention paid, and can be regarded as the degree to which one sentence is influenced by another.
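A sketch of this normalization (the subtraction of the maximum score is an added numerical-stability step, not part of the formula):

```python
import torch

def attention_degrees(e_i):
    """e_i: scores [e_i1, ..., e_ii] over sentence i's attention sentence vectors."""
    exp_e = torch.exp(e_i - e_i.max())       # stabilized exp(e_ij)
    return exp_e / exp_e.sum()               # a_ij = exp(e_ij) / sum_k exp(e_ik)
```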
According to the article semantic vector representation method of the invention, computing the attention degree of the attention sentence vectors to any sentence vector reveals how strongly the sentences of the article interact, which makes the acquired article semantic vector more accurate.
Step S3 is further described below based on the above embodiment.
Specifically, step S3 further includes: acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
Further, $h_j$ and $V_D$ in the embodiments of the present invention both refer to the first output quantity.
According to the article semantic vector representation method of the invention, acquiring the semantic vector of the sentence corresponding to any sentence vector from the second output quantity and the attention degrees of the corresponding attention sentence vectors effectively distinguishes the contribution degree of each sentence.
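A sketch of this weighted sum, under the same illustrative assumptions as the previous sketches:

```python
import torch

def sentence_semantic_vector(a_i, h):
    """a_i: attention degrees [a_i1, ..., a_ii]; h: first output quantities, shape (i, dim).

    Returns c_i = sum_j a_ij * h_j over the attention sentence vectors of sentence i.
    """
    return (a_i.unsqueeze(1) * h).sum(dim=0)
```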
Based on the above embodiment, the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method of the invention, using a bilinear function instead of the feed-forward excitation function of the word-vector attention model better improves the performance of the neural network and yields better results when applied to a paragraph central-idea extraction network.
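A sketch of the bilinear score (the dimension is an assumption; W is initialized randomly here and learned during training):

```python
import torch

hdim = 256                                         # assumed: 2 * hidden_dim after the Bi-LSTM
W = torch.randn(hdim, hdim, requires_grad=True)    # weight matrix, learned by backpropagation

def bilinear_score(c_prev, h_j):
    """e_ij = c_{i-1} W h_j for the previous sentence semantic vector and jth first output."""
    return c_prev @ W @ h_j
```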
Based on the above embodiment, the weight matrix is obtained by a back propagation algorithm.
Specifically, the back-propagation (BP) algorithm is a supervised learning algorithm often used to train multi-layer perceptrons. The BP algorithm is a generalization of the delta rule and requires that the function used by each artificial neuron be differentiable. The BP algorithm is particularly suitable for training feed-forward neural networks.
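A sketch of one back-propagation step for W (the patent does not fix a training objective; the squared-error loss and the supervision signal below are purely illustrative assumptions):

```python
import torch

hdim = 256
W = torch.randn(hdim, hdim, requires_grad=True)
optimizer = torch.optim.SGD([W], lr=0.01)

c_prev, h_j = torch.randn(hdim), torch.randn(hdim)
target = torch.tensor(1.0)                   # hypothetical supervision signal
loss = (c_prev @ W @ h_j - target) ** 2      # illustrative squared-error loss
loss.backward()                              # gradient of the loss w.r.t. W
optimizer.step()
optimizer.zero_grad()
```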
As a preferred embodiment, fig. 5 is a schematic diagram of acquiring the article semantic vector according to an embodiment of the present invention. Referring to fig. 5, acquiring the article semantic vector may specifically include the following steps:
First, all word vectors of any sentence in the article are summed dimension by dimension to obtain the sentence vector of that sentence.
Secondly, the sentence vectors arranged in the forward sentence order of the article are input into the Bi-LSTM model, and the semantic information from the first sentence vector to any sentence vector and from the first sentence vector to the attention sentence vectors is acquired; the sentence vectors arranged in the reverse sentence order are input into the Bi-LSTM model, and the semantic information from the last sentence vector to any sentence vector and from the last sentence vector to the attention sentence vectors is acquired; the semantic information from the first sentence vector to any sentence vector is integrated with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and the semantic information from the first sentence vector to the attention sentence vectors is integrated with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Then, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector is obtained through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
In the above, $a_{ij}$ is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
Finally, the semantic vectors of the sentences corresponding to all sentence vectors in the article are acquired, and the semantic vector of the article is acquired from them.
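The whole procedure can be tied together in one sketch (an illustration only: the PyTorch modules, the dimensions, the zero initialization of $c_0$, and the final summation over sentence semantic vectors in the last step are all assumptions consistent with, but not prescribed by, the formulas above):

```python
import torch
import torch.nn as nn

sent_dim, hidden_dim = 100, 128
bilstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True, bidirectional=True)
W = torch.randn(2 * hidden_dim, 2 * hidden_dim)    # bilinear weight matrix (learned in practice)

def article_semantic_vector(word_vectors_per_sentence):
    # Step one: sentence vectors by dimension-wise summation of word vectors
    s = torch.stack([wv.sum(dim=0) for wv in word_vectors_per_sentence])
    # Step two: first output quantities h_j from the bidirectional LSTM
    h, _ = bilstm(s.unsqueeze(0))
    h = h.squeeze(0)                                # (b, 2*hidden_dim)
    # Step three: sentence semantic vectors c_i via bilinear attention over j <= i
    c_prev = torch.zeros(2 * hidden_dim)            # assumed initial c_0
    cs = []
    for i in range(h.size(0)):
        e = torch.stack([c_prev @ W @ h[j] for j in range(i + 1)])  # e_ij = c_{i-1} W h_j
        a = torch.softmax(e, dim=0)                 # attention degrees a_ij
        c_prev = (a.unsqueeze(1) * h[: i + 1]).sum(dim=0)           # c_i
        cs.append(c_prev)
    # Step four: article semantic vector from all sentence semantic vectors
    return torch.stack(cs).sum(dim=0)               # summation here is an assumption

article = [torch.randn(n, sent_dim) for n in (9, 14, 7)]   # a toy three-sentence article
v = article_semantic_vector(article)
```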
Based on the above embodiments, the present invention further provides an article semantic vector representation system for implementing the article semantic vector representation method of any of the above embodiments. Fig. 6 is a block diagram of an article semantic vector representation system in an embodiment of the present invention. As shown in fig. 6, the system includes: a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article; a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into the Bi-LSTM model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are: at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article; a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article.
Based on the above embodiment, the sentence processing module is further configured to: input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors; input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors; and integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors.
Based on the above embodiment, the sentence semantic vector acquisition module is further configured to: acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
According to the article semantic vector representation method and system provided by the invention, all sentence vectors in the article are acquired, these sentence vectors are operated on, and the semantic vector of the article is finally acquired. This greatly reduces the total amount of computation in the paragraph semantic extraction stage, solves the problem that prior-art word-vector-based semantic vector extraction cannot represent paragraph semantics, and ensures that the extracted semantic vector contains all the useful information of the whole article. Computing the attention degree of the attention sentence vectors to any sentence vector reveals how strongly the sentences of the article interact, making the acquired article semantic vector more accurate. Acquiring the semantic vector of the sentence corresponding to any sentence vector from the second output quantity and the corresponding attention degrees effectively distinguishes the contribution degree of each sentence. And using a bilinear function, rather than the feed-forward excitation function of the word-vector attention model, better improves the performance of the neural network and yields good results when applied to a paragraph central-idea extraction network.
Finally, the above embodiments are merely preferred embodiments and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (4)

1. An article semantic vector representation method, comprising:
S1, acquiring the sentence vector of any sentence according to all word vectors of that sentence in the article;
S2, inputting the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse sentence order into a bidirectional long short-term memory (Bi-LSTM) network model, and obtaining a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are:
at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article;
S3, acquiring the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector;
S4, acquiring the semantic vector of the article according to the semantic vectors of the sentences corresponding to all sentence vectors in the article;
wherein step S1 further comprises:
summing all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence;
step S2 further comprises:
inputting the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors;
inputting the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquiring the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors;
integrating the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrating the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors;
the attention degree of the corresponding attention degree sentence vector to any sentence vector in the step S3 is obtained by the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set;
wherein step S3 further comprises:
acquiring, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
2. The method of claim 1, wherein the bilinear function is:
$$e_{ij} = c_{i-1} W h_j$$

where $e_{ij}$ is the bilinear function, $c_{i-1}$ is the semantic vector of the sentence corresponding to the $(i-1)$th sentence vector, $W$ is the weight matrix of $h_j$ with $W \in \mathbb{R}^{h \times h}$, $\mathbb{R}^{h \times h}$ is the real-number domain of size $h \times h$, $h \in \mathbb{R}$, $\mathbb{R}$ is the real-number set, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set.
3. The method according to claim 2, wherein the weight matrix is obtained by a back-propagation algorithm.
4. An article semantic vector representation system, comprising:
a sentence vector acquisition module, configured to acquire the sentence vector of any sentence according to all word vectors of that sentence in the article;
a sentence processing module, configured to input the sentence vectors arranged in the forward sentence order of the article and the sentence vectors arranged in the reverse order into a bidirectional long short-term memory (Bi-LSTM) network model, and to obtain a first output quantity corresponding to any sentence vector and a second output quantity corresponding to the attention sentence vectors of that sentence vector, wherein the attention sentence vectors corresponding to any sentence vector are:
at least one sentence vector consisting of that sentence vector and the sentence vectors preceding it in the forward sentence-vector order of the article;
a sentence semantic vector acquisition module, configured to acquire the semantic vector of the sentence corresponding to any sentence vector according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to that sentence vector; and
an article semantic vector acquisition module, configured to acquire the article semantic vector according to the semantic vectors of the sentences corresponding to all sentence vectors in the article;
wherein the system is specifically configured to:
sum all word vectors of any sentence in the article dimension by dimension (adding the same dimensions together) to obtain the sentence vector of that sentence;
input the sentence vectors arranged in the forward sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the first sentence vector to any sentence vector and the semantic information from the first sentence vector to the attention sentence vectors;
input the sentence vectors arranged in the reverse sentence order of the article into the Bi-LSTM model, and acquire the semantic information from the last sentence vector to any sentence vector and the semantic information from the last sentence vector to the attention sentence vectors;
integrate the semantic information from the first sentence vector to any sentence vector with the semantic information from the last sentence vector to that sentence vector to obtain the first output quantity corresponding to that sentence vector, and integrate the semantic information from the first sentence vector to the attention sentence vectors with the semantic information from the last sentence vector to the attention sentence vectors to obtain the second output quantity corresponding to the attention sentence vectors;
acquire the attention degree of the corresponding attention sentence vectors to any sentence vector through the following formula:
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

where $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $e_{ij}$ is a bilinear function, $T_x = i$, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$, the $j$th sentence vector is one of the attention sentence vectors of the $i$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, $b$ is the number of sentence vectors of the article, and $\mathbb{Z}$ is the integer set;
and acquire, according to the second output quantity and the attention degrees of the corresponding attention sentence vectors to any sentence vector, the semantic vector of the sentence corresponding to that sentence vector through the following formula:
$$c_i = \sum_{j=1}^{T_s} a_{ij} h_j$$

where $c_i$ is the semantic vector of the sentence corresponding to the $i$th sentence vector, $T_s = b$, $b$ is the number of sentence vectors of the article, $a_{ij}$ is the attention degree of the $j$th sentence vector to the $i$th sentence vector, $h_j$ is the first output quantity of the $j$th sentence vector, $0 < j \le i$, $0 < i \le b$, $i, j \in \mathbb{Z}$, and $\mathbb{Z}$ is the integer set.
CN201711024027.4A 2017-10-27 2017-10-27 Article semantic vector representation method and system Active CN109726383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711024027.4A CN109726383B (en) 2017-10-27 2017-10-27 Article semantic vector representation method and system

Publications (2)

Publication Number Publication Date
CN109726383A CN109726383A (en) 2019-05-07
CN109726383B true CN109726383B (en) 2023-06-23

Family

ID=66290802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711024027.4A Active CN109726383B (en) 2017-10-27 2017-10-27 Article semantic vector representation method and system

Country Status (1)

Country Link
CN (1) CN109726383B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784597A (en) * 2019-11-06 2021-05-11 阿里巴巴集团控股有限公司 Method and device for evaluating quality of article
CN113761182A (en) * 2020-06-17 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for determining service problem

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007140602A (en) * 2005-11-14 2007-06-07 Nippon Telegr & Teleph Corp <Ntt> Topic extraction method, device and program, and computer readable recording medium
JP2009086858A (en) * 2007-09-28 2009-04-23 Nippon Telegr & Teleph Corp <Ntt> Content-retrieving device, content-retrieving method, program, and recording medium
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dialogue generation via deep reinforcement learning based on hierarchical encoding; Zhao Yuqing et al.; Journal of Computer Applications (计算机应用); 2017-10-10 (No. 10); pp. 85-90 *

Also Published As

Publication number Publication date
CN109726383A (en) 2019-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant