CN108829722B - Remote supervision Dual-Attention relation classification method and system - Google Patents
- Publication number
- CN108829722B CN108829722B CN201810432079.3A CN201810432079A CN108829722B CN 108829722 B CN108829722 B CN 108829722B CN 201810432079 A CN201810432079 A CN 201810432079A CN 108829722 B CN108829722 B CN 108829722B
- Authority
- CN
- China
- Prior art keywords
- sentence
- vector
- word
- coding
- feature
- Legal status
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Abstract
The invention relates to a remotely supervised Dual-Attention relation classification method and system, comprising the following steps: aligning entity pairs from a knowledge base to a news corpus through remote supervision, and constructing a sentence set for each entity pair; performing word-level vector encoding of each sentence with a Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector; encoding and denoising the semantic features of the sentences with a Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain a sentence-set feature encoding vector; and packing the sentence-set feature encoding vector together with the entity-pair translation vector, then classifying the entity pair's relation from the resulting bag features. The technical scheme provided by the invention reduces the noise in the model's training data and avoids manual labeling and the error propagation it causes. Entity alignment against open-domain text and a large-scale knowledge base effectively solves the problem of obtaining labeled data at scale for relation extraction.
Description
Technical Field
The invention belongs to the field of relation classification, and particularly relates to a remotely supervised Dual-Attention relation classification method and system.
Background
With the development of Internet technology, the amount of text on the World Wide Web is growing rapidly, and techniques for automatically extracting knowledge from this text are receiving more and more attention, becoming a current research hotspot. The mainstream relation extraction approach is relation classification based on neural network learning, which mainly faces three problems: the difficulty of representing and mining semantic features, error propagation caused by manual labeling, and the influence of noise on model training. Among relation classification methods based on neural network learning, the best-performing ones to date fall under supervised learning and remote supervision. Along these two lines, improved models have appeared targeting the three problems, chiefly: supervised relation extraction with a bidirectional long short-term memory network (Bi-LSTM); remotely supervised relation classification with a convolutional neural network (CNN); and relation classification based on a sentence-set-level attention mechanism over a convolutional network (CNN).
Facing the three major problems of relation classification, each mainstream neural-network relation classification method improves markedly on one specific problem. However, shortcomings remain: these methods depend on domain-specific knowledge, and their robustness and range of application scenarios are relatively limited.
First, relation classification with Bi-LSTM alone achieves effective encoding of the long-distance semantic features present in text. However, it still depends on manually labeled datasets, and the model selects only one sentence for learning and prediction, does not account for noisy sentences, and is limited to domain-specific knowledge.
Second, the weakly supervised remote supervision method rests on the premise that if two entities have a certain relationship in the knowledge base, then every sentence containing those two entities expresses that relationship. This assumption is not entirely correct, so the automatically generated training data contains mislabeled examples, which introduces noise into the training process. Moreover, during training the model selects only the sentence with the highest probability of expressing the entity pair's relationship. This maximum-probability selection does not make full use of all sentences containing the two entities as training corpus, and a large amount of information is lost.
In addition, the remotely supervised relation classification method based on an attention mechanism over convolutional neural networks (CNNs) reduces the influence of mislabeling and effectively classifies the local semantic features of text. However, each layer of the CNN has a fixed span, so it can only model semantic information within a limited distance and suits relation extraction only in short texts. Although some improved convolutional models achieve larger-span modeling by stacking K-segment max-pooling structures, such as the three-segment pooling experiment of PCNNs (Piecewise CNNs), max pooling is more costly and comparatively weaker than Bi-LSTM at extracting long-range dependent semantic features from long texts.
Therefore, it is necessary to provide a remotely supervised Dual-Attention relation classification method and system to remedy the deficiencies of the prior art.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a remotely supervised Dual-Attention relation classification method and system, which automatically acquire a labeled corpus from the knowledge base WikiData and find sentences from the open domain in which the entity pair co-occurs, to serve as training corpus. A neural network learning model completes the relation extraction task with classification over predefined relations as its objective.
A remotely supervised Dual-Attention relation classification method comprises the following steps:
aligning entity pairs from a knowledge base to a news corpus through remote supervision, and constructing a sentence set for each entity pair;
performing word-level vector encoding of each sentence with a Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector;
encoding and denoising the semantic features of the sentences with a Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain a sentence-set feature encoding vector;
and packing the sentence-set feature encoding vector together with the entity-pair translation vector, and classifying the entity pair's relation from the resulting bag features.
Further, performing word-level vector encoding of each sentence with the Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector, includes:
processing the sentence with a deep text representation model to obtain a word vector for each word in the sentence;
inputting the word vectors into a Bi-LSTM model to obtain encoding vectors for the word vectors;
and adding a word-level attention mechanism to the word encoding vectors to obtain each sentence's semantic feature encoding vector.
Further, inputting the word vectors into the Bi-LSTM model to obtain their encoding vectors includes:
inputting the word vectors into the Bi-LSTM model;
the model's forward LSTM captures each word vector's preceding-context feature information, and the backward LSTM captures its following-context feature information;
and finally, the context encoding vector of each word vector is obtained.
Further, adding a word-level attention mechanism to the word encoding vectors to obtain each sentence's semantic feature encoding vector includes:
adding the word-level attention mechanism to the encoding vectors;
connecting each time node in the LSTM by a weight vector computed from the attention probability distribution;
and obtaining the semantic feature encoding vector of each sentence.
Further, encoding and denoising the semantic features of the sentences with the Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain the sentence-set feature encoding vector, includes:
inputting the sentences' semantic feature encoding vectors into a Bi-LSTM model to obtain the sentence set's feature encoding vector;
and adding a sentence-level attention mechanism to the sentence set's feature encoding vector to obtain the denoised sentence-set feature encoding vector.
Further, adding a sentence-level attention mechanism to the sentence set's feature encoding vector to obtain the denoised sentence-set feature encoding vector includes:
adding a sentence-level attention weight to each sentence, so that valid sentences receive high weights and noisy sentences receive low weights;
and obtaining the denoised sentence-set feature encoding vector.
Further, packing the sentence-set feature encoding vector together with the entity-pair translation vector, and classifying the entity pair's relation from the resulting bag features, includes:
introducing the translation vector of an entity-pair translation model, giving different weights to sentences of different confidence and reducing the noise in the sentence set;
introducing the difference of the entity-pair vectors into the sentence set as an additional feature measuring sentence similarity, to obtain the bag features;
and performing relation classification on the bag features with a multi-instance learning method.
Further, performing relation classification on the bag features with the multi-instance learning method comprises the following steps:
if at least one sentence instance in a bag is judged positive by the classifier, the bag's sentences are positive example data; if every sentence in a bag is judged negative by the classifier, the bag's sentences are negative example data;
performing multi-instance learning on the labeled sentences to obtain a feature representation containing information about multiple relation features;
and predicting which relation the entity pair holds with a Softmax relation classifier, obtaining a ranked probability for each relation.
A remotely supervised Dual-Attention relation classification system comprises:
a construction module for aligning entity pairs from a knowledge base to a news corpus through remote supervision and constructing a sentence set for each entity pair;
a first vector module for performing word-level vector encoding of each sentence with a Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector;
a second vector module for encoding and denoising the semantic features of the sentences with a Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain a sentence-set feature encoding vector;
and a relation classification module for packing the sentence-set feature encoding vector together with the entity-pair translation vector and classifying the entity pair's relation from the resulting bag features.
Compared with the closest prior art, the technical scheme provided by the invention has the following advantages:
It reduces the noise in the model's training data and avoids manual labeling and the error propagation it causes. Entity alignment against open-domain text and a large-scale knowledge base effectively solves the problem of obtaining labeled data at scale for relation extraction.
It combines Bi-LSTM word-level and sentence-level feature encoding to construct a bag-feature training method, adds a sentence-set attention weight mechanism and the entity-pair translation vector R_relation to reduce the weight of invalid sentences, and constructs bag feature encodings of the possible relations for multi-instance learning. Combining the sentences' attention weights with bag-feature multi-instance training on the translation vector realizes an effective vector representation of the relational semantic information and improves the accuracy of the relation extraction task.
It constructs an end-to-end relation extraction task that does not depend on manually annotated complex features such as part of speech or dependency syntax: from the word vectors of a sentence input to the model, the model directly outputs the probability of each candidate relation for the entity pair. The whole process is end to end; the Dual-Attention encoding mechanism effectively encodes the important word features of a sentence at the sentence level, while at the sentence-set level it reduces the influence of the noise introduced by remote supervision on model training, so the trained model is more accurate.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a detailed flow chart of an embodiment of the present invention;
FIG. 3 is a diagram of the Dual-Attention relationship classification model in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a remotely supervised Dual-Attention relation classification method, including:
aligning entity pairs from a knowledge base to a news corpus through remote supervision, and constructing a sentence set for each entity pair;
performing word-level vector encoding of each sentence with a Bi-LSTM model equipped with a word-level attention mechanism to obtain the sentence's semantic feature encoding vector;
encoding and denoising the semantic features of the sentences with a Bi-LSTM model equipped with a sentence-level attention mechanism to obtain a sentence-set feature encoding vector;
and packing the sentence-set feature encoding vector together with the entity-pair translation vector, and classifying the entity pair's relation from the resulting bag features.
Fig. 2 shows a detailed flowchart of an embodiment of the present invention.
Preferably, aligning entity pairs from the knowledge base to the corpus through remote supervision and constructing each entity pair's sentence set includes:
The premise of the remote supervision method is the assumption that if two entities have a certain relationship in the knowledge base, then any unstructured sentence containing the two entities expresses that relationship. For example, "Jack Ma" and "Alibaba" have the relationship "founder" in WikiData, so the unstructured text "Jack Ma is the founder of Alibaba", which contains these two entities, can be used as a training instance. The specific implementation steps of this data construction method are as follows:
the method comprises the following steps: pairs of entities that have relationships, such as "Jack Ma", "Alibaba" herein, are extracted from the knowledge base. And the relation in the knowledge base is known as R1=“founder”,R2=“CEO”,R3=“Boss”,R4…RpAnd so on.
Step two: extracting sentences containing entity pairs from unstructured texts as training samples, and crawling a sentence set S containing entity pairs in news texts1,S2…Sn-1,Sn}. Forming the initial corpus.
Preferably, performing word-level vector encoding of each sentence with the Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector, includes:
In the main corpora used for relation classification, such as unstructured news text, sentences are generally long, and in terms of word position the entity pair and the words expressing its relation lie far apart; that is, the semantic relation exhibits long-distance dependency. The Bi-LSTM model is therefore selected to effectively mine each sentence's strong semantic features, realizing semantic feature encoding of the sentences in the sentence set that contain the entities; the model structure is shown in FIG. 3, where w_1^n denotes the embedding feature of the 1st word of the nth sentence, h_1^n denotes the hidden-vector encoding of that word's contextual features, and the context encoding vector of the 1st word of the nth sentence is obtained by combining the hidden-vector encodings after the Attention weights are added. The detailed processing steps are as follows:
the method comprises the following steps: a sentence with co-occurring entities is used as input. A word embedding processing mode of word2vec is selected to map each word in the sentence into a low-dimensional vector, and a character embedding vector of each word is obtained.
Step two: taking the vector of the word obtained in the step one as input, and obtaining the semantic strong features of the sentence from the input vector by utilizing a Bi-LSTM model, wherein the strong features refer to some long textsRemote dependent semantic features in the present document. The bidirectional long-short term memory network is provided with a forward LSTM and a backward LSTM at the hidden layer, the forward LSTM captures the characteristic information of the context, and the backward LSTM captures the characteristic information of the context to obtain the context coding vectorWherein lnRepresenting the number of word vectors for the length of the sentence.
Step three: coding the context vector obtained in the step twoThe Attention mechanism is added, and each time node in the LSTM is connected by the weight vector by calculating the Attention probability distribution. The step mainly highlights the influence of a certain key input on output, captures words of important features in a sentence, and acquires the output features of the bidirectional LSTM according to attention probability.
Step four: deriving semantic feature vector coding [ S 'for each sentence'1,S′2...S'n]。
Preferably, encoding and denoising the semantic features of the sentences with the Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain the sentence-set feature encoding vector, includes:
For the training corpus, it is assumed that at least one sentence among all sentences of each entity pair reflects the pair's relationship. We therefore select the sentences containing the entity pair and pack them, but the entity pair's noisy sentences must be filtered out during training. For example, when the relation to extract is "founder", a sentence containing "Jack Ma" and "Alibaba" that does not express that "Jack Ma" is the "founder" of "Alibaba" is a noisy sentence. This problem is solved by a neural network model based on a sentence-set-level attention mechanism, which can assign a weight to each sentence of an entity pair according to a specific relation; through continuous learning, valid sentences obtain higher weights and noisy sentences obtain lower weights.
The model is shown in the upper part of FIG. 3, where S'_i is the feature encoding vector of sentence S_i output by the first model, h_i is the possible-relation feature encoding of the different sentences in the Bi-LSTM training model, and the vector R_relation = e1 - e2 contains the features of relation R. If a sentence instance expresses relation R, it should have high similarity to the vector R_relation, which can serve as a similarity constraint on positive training examples. A_i denotes the weight corresponding to each sentence. The specific steps are as follows:
the method comprises the following steps: all sentence feature vectors [ S 'containing entity pairs'1,S'2...S'n]And inputting as a Bi-LSTM model to obtain the feature codes at the sentence set level. For example, when a fountain relational classification model is trained, a relational triple "(Jack Ma, fountain, Alibaba)" exists in a relational database, and according to the assumption of remote supervision, (S)iJack Ma, foundation, Alibaba) is a positive example of the relation, the vector weight of the sentence should be high, and feature codes at the sentence set level are obtained by continuously learning positive sentences.
Step two: each Sentence is assigned a sequence-level Attention weight, so that valid sentences get higher weight and noisy sentences get lower weight by continuous learning. Because the core assumption of remote supervision is wrong, sentences which do not express the relationship between the entities can be wrongly marked, after the characteristics of the entities to the sentences are obtained, different weights are added to the sentences by using a selective attribution mechanism, and the influence of invalid sentences is reduced.
Preferably, packing the sentence-set feature encoding vector together with the entity-pair translation vector and classifying the entity pair's relation from the resulting bag features includes:
A translation vector R_relation is introduced. The sentence vector S_i obtained from the previous model contains the semantic information of the relation R implied by the entity pair, i.e. it is the sentence's feature encoding. Each instance sentence in the bag packed for entity pair (e1, e2) may express relation R or some other relation; during model training, the sentence vectors whose features encode this relation should therefore have very high similarity to the translation vector R_relation. Here the sentence-level Attention weights and the translation vector R_relation act on each sentence together, reducing the encoding influence of invalid sentences.
The multi-instance learning training method over bag features packs the semantic features obtained by all the preceding encoding steps. Through continuous learning, each sample in a labeled bag B is initialized with the bag's label, the set U is initialized to be empty, and then all samples are added to the sample set U. The following process is repeated: sample the data and train on the labels to obtain a classification function f_B; predict the labels of all samples with f_B and empty U; for each positive-labeled bag, select the sample with the highest f_B prediction score and add it to U; for each negative-labeled bag, select the sample with the highest f_B prediction score and add it to U; until the termination condition is satisfied; then return f_B. The bag features obtained by multi-instance learning contain the semantic encoding information of the possible relation R, i.e. an implicit semantic feature representation of the possible relation.
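The selection loop described above can be sketched as follows; the `score` argument is a placeholder for the classifier f_B, the refit between rounds is elided as a comment, and the names and toy confidence scores are illustrative rather than taken from the patent.

```python
def mil_select(bags, score, rounds=3):
    """bags: {bag_label: [instance, ...]}; score: instance -> float.

    Each round empties U, then keeps only the highest-scoring instance
    of every bag together with that bag's label; a real system would
    refit the scorer f_B on U between rounds.
    """
    selected = []
    for _ in range(rounds):
        selected = []                            # U is emptied each round
        for label, instances in bags.items():
            best = max(instances, key=score)     # top-scoring instance per bag
            selected.append((best, label))
        # refit `score` on `selected` here in a full implementation
    return selected

bags = {"founder": [0.9, 0.1, 0.4], "NA": [0.2, 0.05]}  # toy confidence scores
U = mil_select(bags, score=lambda x: x)
```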
Softmax classification is then performed on the obtained bag features; after continuous learning, Softmax maps the bag containing sentence-set-level features onto the several candidate relation categories. The training objective here is to maximize classification accuracy.
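The final Softmax step, sketched with a toy bag feature and a hand-made score row per candidate relation (all numbers invented for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

relations = ["founder", "CEO", "Boss"]
bag_feature = np.array([1.0, 0.5])            # bag encoding from earlier stages
relation_weights = np.array([[2.0, 1.0],      # score row for "founder"
                             [0.5, 0.2],      # score row for "CEO"
                             [0.1, 0.1]])     # score row for "Boss"

probs = softmax(relation_weights @ bag_feature)  # one probability per relation
predicted = relations[int(np.argmax(probs))]
```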
Model training uses the relation classes (relationship.txt), training data (train.txt), test data (test.txt), and word vectors (vec.txt). The training and test data may be randomly shuffled raw data, split 80% for training and 20% for testing. Optimal prediction of the predefined relations is achieved by tuning hyper-parameters until distinct probability values are finally obtained for the different relations of the same entity pair.
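The 80/20 random split mentioned above is straightforward; a seeded sketch:

```python
import random

def split_corpus(samples, train_frac=0.8, seed=42):
    """Shuffle a copy of the samples, then cut at the training fraction."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, test = split_corpus(list(range(100)))
```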
Example 2
Based on the same inventive concept, the invention also provides a remote-supervised Dual-Attention relation classification system, which comprises:
a construction module for aligning entity pairs from the knowledge base to a news corpus through remote supervision and constructing a sentence set for each entity pair;
a first vector module for performing word-level vector encoding of each sentence with a Bi-LSTM model equipped with a word-level attention mechanism, to obtain the sentence's semantic feature encoding vector;
a second vector module for encoding and denoising the semantic features of the sentences with a Bi-LSTM model equipped with a sentence-level attention mechanism, to obtain a sentence-set feature encoding vector;
and a relation classification module for packing the sentence-set feature encoding vector together with the entity-pair translation vector and classifying the entity pair's relation from the resulting bag features.
Preferably, the first vector module includes:
processing the sentence with a deep text representation model to obtain a word vector for each word in the sentence;
inputting the word vectors into a Bi-LSTM model to obtain their encoding vectors, wherein the word vectors are input into the Bi-LSTM model;
the model's forward LSTM captures each word vector's preceding-context feature information, and the backward LSTM captures its following-context feature information;
and finally, the context encoding vector of each word vector is obtained;
and adding a word-level attention mechanism to the word encoding vectors to obtain each sentence's semantic feature encoding vector, wherein the word-level attention mechanism is added to the encoding vectors;
each time node in the LSTM is connected by a weight vector computed from the attention probability distribution;
and the semantic feature encoding vector of each sentence is obtained.
Preferably, the second vector module includes:
inputting the sentences' semantic feature encoding vectors into a Bi-LSTM model to obtain the sentence set's feature encoding vector;
adding a sentence-level attention mechanism to the sentence set's feature encoding vector to obtain the denoised sentence-set feature encoding vector, wherein a sentence-level attention weight is added to each sentence so that valid sentences receive high weights and noisy sentences low weights;
and obtaining the denoised sentence-set feature encoding vector.
The relation classification module includes:
introducing the translation vector of an entity-pair translation model, giving different weights to sentences of different confidence and reducing the noise in the sentence set;
introducing the difference of the entity-pair vectors into the sentence set as an additional feature measuring sentence similarity, to obtain the bag features;
performing relation classification on the bag features with a multi-instance learning method, wherein if at least one instance in a bag is judged positive by the classifier, the bag's sentences are positive example data, and if every sentence in a bag is judged negative by the classifier, the bag's sentences are negative example data;
performing multi-instance learning on the labeled sentences to obtain a feature representation containing information about multiple relation features;
and predicting which relation the entity pair holds with a Softmax relation classifier, obtaining a ranked probability for each relation.
Example 3
The entity pair "Jack Ma" and "Alibaba" and its corresponding relation set, "founder", "CEO", and the like, are known from the knowledge base WikiData, and several sentences containing the entity pair "Jack Ma" and "Alibaba" are retrieved from Internet data; four sentences in which the two entities co-occur are given here as examples.
Sentence 1: "Few executives are Alibaba's secret sauce, founder Jack Ma says."
Sentence 2: "At a conference hosted by All Things D last week, Alibaba CEO Jack Ma said that he was interested in Yahoo."
Sentence 3: "Internet entrepreneur Jack Ma started a cheap version of the Yellow Pages that was Alibaba's precursor in Hangzhou, China."
Sentence 4: "Alibaba has brought more small U.S. businesses onto the company's sites, but this is the first time Ma has divulged specific targets."
Sentences 3 and 4 do not express a predefined relation of the knowledge base. One purpose of the invention is to train a model on a large number of sentences in which entity pairs co-occur, and to classify and compute the probability of the relation corresponding to the co-occurring entities in each sentence. In the model output, the probability of the relation "founder" extracted from sentence 1 is the largest, with the other relations ranked below it; likewise, the probability of the relation classification result "CEO" for sentence 2 is the largest, with the other relations ranked below it. Given a sufficiently large training corpus, the model can judge the relation for sentences 3 and 4 from the maximum of the obtained relation probabilities, thereby realizing relation classification of the co-occurring entity pairs in sentences 3 and 4.
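The distant-supervision alignment that produces such sentence bags can be sketched as follows. This is a simplified string-matching alignment (production systems would use entity linking rather than substring tests), with the entity pair and one sentence taken from the example above:

```python
from collections import defaultdict

def build_bags(kb_pairs, corpus):
    """Group corpus sentences into bags keyed by co-occurring KB entity pairs.

    kb_pairs: iterable of (head, tail) entity-name pairs from the knowledge base.
    corpus: iterable of sentences (plain strings).
    A sentence joins the bag of every pair whose two entity names it contains.
    """
    bags = defaultdict(list)
    for sent in corpus:
        for head, tail in kb_pairs:
            if head in sent and tail in sent:
                bags[(head, tail)].append(sent)
    return dict(bags)

pairs = [("Jack Ma", "Alibaba")]
corpus = [
    "At a conference hosted by All Things D last week, Alibaba CEO Jack Ma "
    "said that he was interested in Yahoo.",
    "Yahoo shares rose after the announcement.",  # no co-occurrence: ignored
]
bags = build_bags(pairs, corpus)
```

Each resulting bag is then encoded and denoised by the two attention layers before multi-instance relation classification.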
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (8)
1. A remote supervised Dual-Attention relationship classification method is characterized by comprising the following steps:
aligning entity pairs in a knowledge base to news corpora through remote supervision, and constructing an entity-to-sentence set;
carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
coding and denoising the semantic features of the sentence based on a Bi-LSTM model of a sentence-level attention mechanism to obtain a sentence set feature coding vector;
packing the sentence set feature coding vector and the entity pair translation vector, and carrying out entity pair relation classification on the obtained packet features;
wherein carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain the semantic feature coding vector of the sentence comprises the following steps:
processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence;
inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector;
and adding a word level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence.
2. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein inputting the word vector into a Bi-LSTM model to obtain an encoding vector of the word vector, comprises:
inputting the word vector into a Bi-LSTM model;
the forward LSTM of the model captures the preceding-context feature information of the word vector, and the backward LSTM of the model captures the following-context feature information of the word vector;
and finally, obtaining the context coding vector of the word vector.
3. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein adding a word level Attention mechanism to the coding vector of the word vector to obtain a semantic feature coding vector of each sentence, comprises:
adding the word-level attention mechanism to the coding vector of the word vector;
connecting each time node in the LSTM through a weight vector obtained by computing the attention probability distribution;
and obtaining a semantic feature coding vector of each sentence.
4. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein the Bi-LSTM model based on the sentence level Attention mechanism encodes and de-noises semantic features of the sentence to obtain a sentence set feature encoding vector, comprising:
inputting the semantic feature coding vector of the sentence into a Bi-LSTM model to obtain a feature coding vector of a sentence set;
and adding a sentence level attention mechanism into the feature coding vector of the sentence set to obtain the noise-reduced sentence set feature coding vector.
5. The remotely supervised Dual-Attention relationship classification method of claim 4, wherein adding a sentence level Attention mechanism to the feature coding vectors of the sentence set to obtain denoised sentence set feature coding vectors comprises:
adding a sentence-level attention mechanism weight to each sentence, so that valid sentences receive large weights and noisy sentences receive small weights;
and obtaining the noise-reduced sentence set feature coding vector.
6. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein the sentence set feature coding vector and the entity pair translation vector are packed, and the obtained package features are subjected to entity pair relationship classification, comprising:
introducing a translation vector of the entity pair from a translation model, and giving different weights to sentences with different confidence levels, to reduce the noise of the sentence set;
introducing the difference of the entity pair vectors into the sentence set as another feature for measuring sentence similarity, to obtain a packet feature;
and carrying out relation classification on the packet features by a multi-instance learning method.
7. The remotely supervised Dual-Attention relationship classification method of claim 6, wherein the relationship classification of the package features using multi-instance learning comprises:
if a packet contains at least one instance whose label is judged positive by the classifier, the sentences in the packet are positive example data; if all sentences in a packet are judged negative by the classifier, the sentences in the packet are negative example data;
carrying out multi-instance learning on the labeled sentences to obtain a feature representation containing multiple kinds of relation feature information;
and predicting the relationship of the entity pair by a Softmax relation classification method to obtain a probability ranking of the relations.
8. A remotely supervised Dual-Attention relationship classification system, comprising:
the building module is used for aligning the entity pairs in the knowledge base to news corpora through remote supervision and building an entity-to-sentence set;
the first vector module is used for carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
the second vector module is used for coding and denoising the semantic features of the sentence based on a Bi-LSTM model of the sentence level attention mechanism to obtain a sentence set feature coding vector;
the relation classification module is used for packing the sentence set feature coding vector and the entity pair translation vector and carrying out entity pair relation classification on the obtained packet features;
wherein, the first vector module is specifically configured to:
processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence;
inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector;
and adding a word level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810432079.3A CN108829722B (en) | 2018-05-08 | 2018-05-08 | Remote supervision Dual-Attention relation classification method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810432079.3A CN108829722B (en) | 2018-05-08 | 2018-05-08 | Remote supervision Dual-Attention relation classification method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108829722A CN108829722A (en) | 2018-11-16 |
CN108829722B true CN108829722B (en) | 2020-10-02 |
Family
ID=64148408
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810432079.3A Active CN108829722B (en) | 2018-05-08 | 2018-05-08 | Remote supervision Dual-Attention relation classification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108829722B (en) |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109543047A (en) * | 2018-11-21 | 2019-03-29 | 焦点科技股份有限公司 | A kind of knowledge mapping construction method based on medical field website |
CN109635124B (en) * | 2018-11-30 | 2021-04-23 | 北京大学 | Remote supervision relation extraction method combined with background knowledge |
CN109871533B (en) * | 2019-01-04 | 2019-12-10 | 北京车慧科技有限公司 | Corpus processing system based on corpus field |
CN109871451B (en) * | 2019-01-25 | 2021-03-19 | 中译语通科技股份有限公司 | Method and system for extracting relation of dynamic word vectors |
CN109933809B (en) * | 2019-03-15 | 2023-09-15 | 北京金山数字娱乐科技有限公司 | Translation method and device, and training method and device of translation model |
CN110070093B (en) * | 2019-04-08 | 2023-04-25 | 东南大学 | Remote supervision relation extraction denoising method based on countermeasure learning |
US11069346B2 (en) | 2019-04-22 | 2021-07-20 | International Business Machines Corporation | Intent recognition model creation from randomized intent vector proximities |
CN110175469B (en) * | 2019-05-16 | 2020-11-17 | 山东大学 | Social media user privacy leakage detection method, system, device and medium |
CN110209836B (en) * | 2019-05-17 | 2022-04-26 | 北京邮电大学 | Remote supervision relation extraction method and device |
CN110298036B (en) * | 2019-06-06 | 2022-07-22 | 昆明理工大学 | Online medical text symptom identification method based on part-of-speech incremental iteration |
CN110597948A (en) * | 2019-07-11 | 2019-12-20 | 东华大学 | Entity relation extraction method based on deep learning |
CN110555084B (en) * | 2019-08-26 | 2023-01-24 | 电子科技大学 | Remote supervision relation classification method based on PCNN and multi-layer attention |
CN110619121B (en) * | 2019-09-18 | 2023-04-07 | 江南大学 | Entity relation extraction method based on improved depth residual error network and attention mechanism |
CN110781312B (en) * | 2019-09-19 | 2022-07-15 | 平安科技(深圳)有限公司 | Text classification method and device based on semantic representation model and computer equipment |
CN110647620B (en) * | 2019-09-23 | 2022-07-01 | 中国农业大学 | Knowledge graph representation learning method based on confidence hyperplane and dictionary information |
CN111125364B (en) * | 2019-12-24 | 2023-04-25 | 华南理工大学 | ERNIE-based noise reduction method for remote supervision relation extraction |
CN111159407B (en) * | 2019-12-30 | 2022-01-28 | 北京明朝万达科技股份有限公司 | Method, apparatus, device and medium for training entity recognition and relation classification model |
CN111241303A (en) * | 2020-01-16 | 2020-06-05 | 东方红卫星移动通信有限公司 | Remote supervision relation extraction method for large-scale unstructured text data |
CN111324743A (en) * | 2020-02-14 | 2020-06-23 | 平安科技(深圳)有限公司 | Text relation extraction method and device, computer equipment and storage medium |
CN111368026B (en) * | 2020-02-25 | 2020-11-24 | 杭州电子科技大学 | Text inclusion analysis method based on word meaning relation and dynamic convolution neural network |
CN111091007A (en) * | 2020-03-23 | 2020-05-01 | 杭州有数金融信息服务有限公司 | Method for identifying relationships among multiple enterprises based on public sentiment and enterprise portrait |
CN113553424A (en) * | 2020-04-26 | 2021-10-26 | 阿里巴巴集团控股有限公司 | Data processing method, device and equipment and generation method of event extraction model |
CN113947092A (en) * | 2020-07-16 | 2022-01-18 | 阿里巴巴集团控股有限公司 | Translation method and device |
CN111737440B (en) * | 2020-07-31 | 2021-03-05 | 支付宝(杭州)信息技术有限公司 | Question generation method and device |
CN112307130B (en) * | 2020-10-21 | 2022-07-05 | 清华大学 | Document-level remote supervision relation extraction method and system |
CN112329463A (en) * | 2020-11-27 | 2021-02-05 | 上海汽车集团股份有限公司 | Training method of remote monitoring relation extraction model and related device |
CN112507137B (en) * | 2020-12-17 | 2022-04-22 | 华南理工大学 | Small sample relation extraction method based on granularity perception in open environment and application |
CN112579792B (en) * | 2020-12-22 | 2023-08-04 | 东北大学 | PGAT and FTATT-based remote supervision relation extraction method |
CN113268985B (en) * | 2021-04-26 | 2023-06-20 | 华南理工大学 | Relationship path-based remote supervision relationship extraction method, device and medium |
CN113591478B (en) * | 2021-06-08 | 2023-04-18 | 电子科技大学 | Remote supervision text entity relation extraction method based on deep reinforcement learning |
CN114218956A (en) * | 2022-01-24 | 2022-03-22 | 平安科技(深圳)有限公司 | Relation extraction method and system based on neural network and remote supervision |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8311874B2 (en) * | 2005-03-31 | 2012-11-13 | Oracle Financial Services Software Limited | Systems and methods for customer relationship evaluation and resource allocation |
CN106294593A (en) * | 2016-07-28 | 2017-01-04 | Zhejiang University | Relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning |
CN106407211A (en) * | 2015-07-30 | 2017-02-15 | Fujitsu Limited | Method and device for classifying semantic relationships among entity words |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | Soochow University | Video activity recognition method based on a bidirectional long short-term memory unit |
CN107180247A (en) * | 2017-05-19 | 2017-09-19 | National University of Defense Technology | Relation classifier based on a selective attention convolutional neural network and method thereof |
CN107562752A (en) * | 2016-06-30 | 2018-01-09 | Fujitsu Limited | Method, apparatus and electronic device for classifying semantic relations of entity words |
CN107578106A (en) * | 2017-09-18 | 2018-01-12 | University of Science and Technology of China | Neural network natural language inference method fusing word sense knowledge |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107978373A (en) * | 2017-11-23 | 2018-05-01 | Jilin University | Semi-supervised biomedical event extraction method based on co-training |
- 2018-05-08 CN CN201810432079.3A patent/CN108829722B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8311874B2 (en) * | 2005-03-31 | 2012-11-13 | Oracle Financial Services Software Limited | Systems and methods for customer relationship evaluation and resource allocation |
CN106407211A (en) * | 2015-07-30 | 2017-02-15 | Fujitsu Limited | Method and device for classifying semantic relationships among entity words |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | Soochow University | Video activity recognition method based on a bidirectional long short-term memory unit |
CN107562752A (en) * | 2016-06-30 | 2018-01-09 | Fujitsu Limited | Method, apparatus and electronic device for classifying semantic relations of entity words |
CN106294593A (en) * | 2016-07-28 | 2017-01-04 | Zhejiang University | Relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning |
CN107180247A (en) * | 2017-05-19 | 2017-09-19 | National University of Defense Technology | Relation classifier based on a selective attention convolutional neural network and method thereof |
CN107578106A (en) * | 2017-09-18 | 2018-01-12 | University of Science and Technology of China | Neural network natural language inference method fusing word sense knowledge |
Non-Patent Citations (1)
Title |
---|
"Attention-based BiLSTM-CNN stance detection model for Chinese microblogs" (基于注意力的BiLSTM-CNN中文微博立场检测模型); Bai Jing et al.; Computer Applications and Software; 2018-03-31; pp. 266-274 * |
Also Published As
Publication number | Publication date |
---|---|
CN108829722A (en) | 2018-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108829722B (en) | Remote supervision Dual-Attention relation classification method and system | |
CN108416058B (en) | Bi-LSTM input information enhancement-based relation extraction method | |
CN110929030A (en) | Text abstract and emotion classification combined training method | |
CN110110054A (en) | A method of obtaining question and answer pair in the slave non-structured text based on deep learning | |
CN113468888A (en) | Entity relation joint extraction method and device based on neural network | |
CN113868432B (en) | Automatic knowledge graph construction method and system for iron and steel manufacturing enterprises | |
CN111858932A (en) | Multiple-feature Chinese and English emotion classification method and system based on Transformer | |
CN113051929A (en) | Entity relationship extraction method based on fine-grained semantic information enhancement | |
CN110390049B (en) | Automatic answer generation method for software development questions | |
CN113190656A (en) | Chinese named entity extraction method based on multi-label framework and fusion features | |
CN111814477B (en) | Dispute focus discovery method and device based on dispute focus entity and terminal | |
CN116204674B (en) | Image description method based on visual concept word association structural modeling | |
CN111967267B (en) | XLNET-based news text region extraction method and system | |
CN116306652A (en) | Chinese naming entity recognition model based on attention mechanism and BiLSTM | |
CN116661805B (en) | Code representation generation method and device, storage medium and electronic equipment | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN114757184B (en) | Method and system for realizing knowledge question and answer in aviation field | |
CN112270187A (en) | Bert-LSTM-based rumor detection model | |
CN114168754A (en) | Relation extraction method based on syntactic dependency and fusion information | |
CN112633007A (en) | Semantic understanding model construction method and device and semantic understanding method and device | |
CN116595023A (en) | Address information updating method and device, electronic equipment and storage medium | |
CN115688784A (en) | Chinese named entity recognition method fusing character and word characteristics | |
CN114048314B (en) | Natural language steganalysis method | |
CN116645971A (en) | Semantic communication text transmission optimization method based on deep learning | |
CN113360601A (en) | PGN-GAN text abstract model fusing topics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||