CN108829722B - Remote supervision Dual-Attention relation classification method and system - Google Patents

Info

Publication number
CN108829722B
Authority
CN
China
Prior art keywords
sentence
vector
word
coding
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810432079.3A
Other languages
Chinese (zh)
Other versions
CN108829722A (en)
Inventor
贺敏
毛乾任
王丽宏
李晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center filed Critical National Computer Network and Information Security Management Center
Priority to CN201810432079.3A priority Critical patent/CN108829722B/en
Publication of CN108829722A publication Critical patent/CN108829722A/en
Application granted granted Critical
Publication of CN108829722B publication Critical patent/CN108829722B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a remote-supervised Dual-Attention relation classification method and system. The method comprises: aligning entity pairs in a knowledge base to a news corpus through remote supervision to construct entity-pair sentence sets; performing word-level vector coding on each sentence with a Bi-LSTM model based on a word-level attention mechanism to obtain the semantic feature coding vector of the sentence; coding and denoising the semantic features of the sentences with a Bi-LSTM model based on a sentence-level attention mechanism to obtain a sentence set feature coding vector; and packing the sentence set feature coding vector with the entity pair translation vector and classifying the entity pair relation on the resulting packet features. The technical scheme reduces noisy data in model training and avoids manually labeled data and the error propagation it causes. Aligning entities between open-domain text and a large-scale knowledge base effectively addresses the limited scale of labeled data for relation extraction.

Description

Remote supervision Dual-Attention relation classification method and system
Technical Field
The invention belongs to the field of relation classification, and particularly relates to a remote-supervised Dual-Attention relation classification method and system.
Background
With the development of Internet technology, the volume of text on the World Wide Web is growing rapidly, and techniques for automatically extracting knowledge from that text are attracting increasing attention and have become a research hotspot. The current mainstream relation extraction methods are relation classification methods based on neural network learning, which face three main problems: semantic features are difficult to represent and mine, manual labeling propagates errors, and noise affects model training. Among neural relation classification methods, the best results to date have been obtained with supervised learning and with remote supervision. Building on these two learning approaches, improved models have been proposed for the three problems, mainly: a supervised relation extraction method using a bidirectional long short-term memory network (Bi-LSTM); a remote-supervised relation classification method using a convolutional neural network (CNN); and a CNN-based relation classification method with a sentence-set-level attention mechanism.
Faced with these three problems, each mainstream neural relation classification method improves on one specific problem. However, each also has shortcomings: the methods depend on domain-specific knowledge, and the robustness and application scenarios of the models are relatively limited.
First, relation classification with Bi-LSTM alone effectively encodes the long-distance semantic features present in text. However, the method still depends on manually labeled data sets, the model selects only one sentence for learning and prediction, it does not account for noisy sentences, and it is limited to domain-specific knowledge.
Second, the weakly supervised remote supervision method rests on the following premise: if two entities have a certain relation in the knowledge base, then all sentences containing the two entities express that relation. This assumption is not always correct, so the automatically generated training data contains wrongly labeled examples, which introduces noise into training. Moreover, during training only the sentence with the highest probability of expressing the entity pair's relation is selected. Selecting only the maximum-probability sentence does not make full use of all sentences containing the two entities as training corpus, and a large amount of information is lost.
In addition, the remote-supervised relation classification method based on a CNN attention mechanism reduces the influence of wrong labels and can effectively capture local semantic features of text. However, each layer of the CNN model has a fixed span, so it can only model semantic information over a limited distance and is suitable only for relation extraction in short texts. Although some improved convolutional models capture larger spans by stacking K-segment max-pooling structures, for example the three-segment pooling used by PCNNs (Piecewise CNNs), max pooling is more costly and performs relatively worse than Bi-LSTM when extracting long-dependency semantic features from long texts.
Therefore, a remote-supervised Dual-Attention relation classification method and system is needed to address the deficiencies of the prior art.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a remote-supervised Dual-Attention relation classification method and system, which automatically acquires a labeled corpus from the knowledge base WikiData and finds sentences in which each entity pair co-occurs in open-domain text as the training corpus. A neural network learning model then completes the relation extraction task by classifying entity pairs into predefined relations.
A remote supervised Dual-Attention relationship classification method comprises the following steps:
aligning entity pairs in a knowledge base to news corpora through remote supervision, and constructing an entity-to-sentence set;
carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
coding and denoising the semantic features of the sentences based on a Bi-LSTM model of a sentence level attention mechanism to obtain a sentence set feature coding vector;
and packaging the sentence set feature coding vector and the entity pair translation vector, and carrying out entity pair relation classification on the obtained packet features.
Further, performing word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence, including:
processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence;
inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector;
and adding a word level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence.
Further, inputting the word vector into a Bi-LSTM model to obtain an encoded vector of the word vector, including:
inputting the word vector into a Bi-LSTM model;
the forward LSTM of the model obtains the above feature information of the word vector, and the backward LSTM of the model obtains the below feature information of the word vector;
and finally, obtaining the context coding vector of the word vector.
Further, adding a word-level attention mechanism to the coding vector of the word vector to obtain a semantic feature coding vector of each sentence, including:
adding a word-level attention mechanism to the coding vector;
connecting each time node in the LSTM by a weight vector by calculating attention probability distribution;
and obtaining a semantic feature coding vector of each sentence.
Further, the method for coding and denoising semantic features of a sentence based on a Bi-LSTM model of a sentence level attention mechanism to obtain a sentence set feature coding vector includes:
inputting the semantic feature coding vector of the sentence into a Bi-LSTM model to obtain a feature coding vector of a sentence set;
and adding a sentence level attention mechanism into the feature coding vector of the sentence set to obtain the noise-reduced sentence set feature coding vector.
Further, adding a sentence level attention mechanism to the feature coding vector of the sentence set to obtain a noise-reduced sentence set feature coding vector, including:
adding a sentence level attention mechanism weight to each sentence, so that the weight of a valid sentence is large and the weight of a noisy sentence is small;
and obtaining the noise-reduced sentence set feature coding vector.
Further, the sentence set feature encoding vector and the entity pair translation vector are packed, and the obtained packet features are subjected to entity pair relationship classification, including:
introducing a translation vector of an entity pair translation model, giving different weights to sentences with different confidence degrees, and reducing the noise of a sentence set;
introducing the difference value of the entity pair vector as another feature of the similarity measurement sentence into a sentence set to obtain a packet feature;
and carrying out relation classification on the packet features by using a multi-example learning method.
Further, classifying the relation on the packet features by using a multi-example learning method comprises the following steps:
if at least one sentence in a packet is judged to be positive by the classifier, the sentences in the packet are positive example data; if all sentences in a packet are judged to be negative by the classifier, the sentences in the packet are negative example data;
carrying out multi-example learning on the labeled sentences to obtain a feature representation containing the relation information of multiple features;
and predicting the relation of the entity pair with a Softmax relation classifier to obtain the probability ranking of each relation.
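The packet labeling rule stated above can be illustrated with a minimal Python sketch. The function name `label_bags`, the dictionary layout, and the placeholder pair `("A", "B")` are assumptions made for illustration and do not come from the patent.

```python
def label_bags(bag_predictions):
    """Map each packet (bag) to True if any of its sentences was judged positive.

    bag_predictions: dict mapping an entity pair to a list of per-sentence
    booleans produced by some classifier (hypothetical input format).
    """
    return {pair: any(preds) for pair, preds in bag_predictions.items()}


labels = label_bags({
    ("Jack Ma", "Alibaba"): [False, True, False],  # one positive sentence
    ("A", "B"): [False, False],                    # placeholder pair, all negative
})
```

A packet is positive as soon as a single sentence is positive, which is what lets noisy sentences coexist with valid ones inside the same packet.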
A remotely supervised Dual-Attention relationship classification system, comprising:
the building module is used for aligning the entity pairs in the knowledge base to news corpora through remote supervision and building an entity-to-sentence set;
the first vector module is used for carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
the second vector module is used for coding and denoising the semantic features of the sentence based on a Bi-LSTM model of the sentence level attention mechanism to obtain a sentence set feature coding vector;
and the relation classification module is used for packing the sentence set feature coding vector and the entity pair translation vector and carrying out entity pair relation classification on the obtained packet features.
Compared with the closest prior art, the technical scheme provided by the invention has the following advantages:
The technical scheme provided by the invention reduces noisy data in model training and avoids manually labeled data and the error propagation it causes. Aligning entities between open-domain text and a large-scale knowledge base effectively addresses the limited scale of labeled data for relation extraction.
The technical scheme provided by the invention combines Bi-LSTM word-level and sentence-level feature coding to construct a packet feature training method, and adds a sentence-set attention weight mechanism and an entity pair translation vector $R_{relation}$ to reduce the weight of invalid sentences and to construct packet feature codes of possible relations for multi-example learning. Combining the sentence attention weights with multi-example learning on packet features that include the translation vector yields an effective vector representation of relation semantics and improves the accuracy of the relation extraction task.
The technical scheme provided by the invention constructs an end-to-end relation extraction task that does not depend on complex manually labeled features such as part of speech or dependency syntax: the model takes the word vectors of a sentence as input and outputs the relation of the entity pair together with its probability value. Within this end-to-end process, the Dual-Attention coding method effectively encodes the important word features of a sentence at the sentence level and, at the sentence-set level, reduces the influence on training of the noise introduced by remote supervision, so that the trained model is more accurate.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a detailed flow chart of an embodiment of the present invention;
FIG. 3 is a diagram of the Dual-Attention relationship classification model in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example 1
As shown in FIG. 1, an embodiment of the present invention provides a remote-supervised Dual-Attention relation classification method, including:
aligning entity pairs in a knowledge base to news corpora through remote supervision, and constructing an entity-to-sentence set;
carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
coding and denoising the semantic features of the sentences based on a Bi-LSTM model of a sentence level attention mechanism to obtain a sentence set feature coding vector;
and packaging the sentence set feature coding vector and the entity pair translation vector, and carrying out entity pair relation classification on the obtained packet features.
Fig. 2 shows a detailed flowchart of an embodiment of the present invention.
Preferably, aligning the entity pairs in the knowledge base to the news corpus through remote supervision and constructing the entity-pair sentence set includes:
The premise of the remote supervision method is as follows: if two entities have a certain relation in the knowledge base, then any unstructured sentence containing both entities is taken to express that relation. For example, "Jack Ma" and "Alibaba" have the relation "founder" in WikiData, so the unstructured text "Jack Ma is the founder of Alibaba", which contains both entities, can be used as a training example for the model. The data construction method is implemented in the following steps:
Step one: extract entity pairs that have relations, such as "Jack Ma" and "Alibaba" here, from the knowledge base. Their relations in the knowledge base are known, e.g. $R_1$ = "founder", $R_2$ = "CEO", $R_3$ = "Boss", $R_4$, …, $R_p$.
Step two: extract sentences containing the entity pair from unstructured text as training samples, by crawling news text for the set of sentences $\{S_1, S_2, \ldots, S_{n-1}, S_n\}$ that contain the entity pair. These sentences form the initial corpus.
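The two data construction steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `build_entity_pair_sentence_sets`, the triple format, and the use of plain substring matching in place of real entity linking are all hypothetical and not taken from the patent.

```python
from collections import defaultdict


def build_entity_pair_sentence_sets(kb_triples, sentences):
    """Group sentences by the knowledge-base entity pairs they mention.

    kb_triples: iterable of (head, relation, tail) tuples (assumed format).
    sentences:  iterable of raw sentence strings.
    Returns {(head, tail): {"relations": set, "sentences": list}}.
    """
    # Step one: collect the known relations of every entity pair.
    relations = defaultdict(set)
    for head, relation, tail in kb_triples:
        relations[(head, tail)].add(relation)

    # Step two: keep the sentences in which both entities co-occur.
    bags = {}
    for (head, tail), rels in relations.items():
        # Naive substring matching stands in for real entity alignment.
        matched = [s for s in sentences if head in s and tail in s]
        if matched:
            bags[(head, tail)] = {"relations": rels, "sentences": matched}
    return bags


kb = [("Jack Ma", "founder", "Alibaba"), ("Jack Ma", "CEO", "Alibaba")]
corpus = [
    "Jack Ma is the founder of Alibaba.",
    "Alibaba reported quarterly earnings today.",
]
bags = build_entity_pair_sentence_sets(kb, corpus)
```

Every matched sentence inherits all relations of its entity pair, which is exactly where the labeling noise discussed in the Background comes from.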
Preferably, the performing word-level vector coding on the sentence based on a Bi-LSTM model of the word-level attention mechanism to obtain a semantic feature coding vector of the sentence includes:
In the main corpora used for relation classification, such as unstructured news text, sentences are generally long, and an entity pair and the words expressing its relation are often far apart in the sentence; that is, the semantic relation exhibits long-distance dependency. The Bi-LSTM model is therefore chosen to mine these strong semantic features effectively and to encode the semantic features of each sentence in the sentence set that contains the entities. The model structure is shown in FIG. 3. Three symbols in FIG. 3 appear in the original only as formula images (BDA0001653672120000071 to BDA0001653672120000073). They denote, respectively: the embedding feature of the 1st word of the nth sentence; the hidden-vector encoding of the contextual features of the 1st word of the nth sentence; and the context coding vector of the 1st word of the nth sentence, obtained by combining the hidden-vector encodings after the Attention weights are applied. The detailed processing steps are as follows:
Step one: take a sentence in which the entities co-occur as input. Use word2vec word embedding to map each word in the sentence to a low-dimensional vector, obtaining an embedding vector for each word.
Step two: take the word vectors obtained in step one as input and use the Bi-LSTM model to extract the strong semantic features of the sentence, where strong features are the long-distance dependent semantic features found in long texts. The bidirectional long short-term memory network has a forward LSTM and a backward LSTM in its hidden layer: the forward LSTM captures the feature information of the preceding context, and the backward LSTM captures the feature information of the following context, yielding the context coding vectors of the words (given as formula image BDA0001653672120000081 in the original), where $l_n$ is the sentence length, i.e. the number of word vectors.
Step three: add the Attention mechanism to the context coding vectors obtained in step two (formula image BDA0001653672120000082 in the original); by computing the attention probability distribution, each time step of the LSTM is connected through a weight vector. This step highlights the influence of key inputs on the output, captures the words that carry important features in the sentence, and weights the bidirectional LSTM output features by their attention probabilities.
Step four: obtain the semantic feature coding vectors of the sentences, $[S'_1, S'_2, \ldots, S'_n]$.
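The attention pooling of steps three and four can be sketched as below. The patent does not give the scoring formula in recoverable form, so the score used here (a dot product between a query vector and the tanh of each hidden state, followed by a softmax) is an assumption, as are the names `word_attention` and `query`. The hidden states are supplied directly rather than computed by a Bi-LSTM.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention(hidden_states, query):
    """Pool per-word hidden states into a single sentence vector.

    hidden_states: list of per-word vectors (e.g. Bi-LSTM outputs).
    query: attention parameter vector; learned in a real model, fixed here.
    """
    # Assumed scoring rule: score_t = query . tanh(h_t)
    scores = [sum(q * math.tanh(h) for q, h in zip(query, state))
              for state in hidden_states]
    weights = softmax(scores)  # attention probability distribution over words
    dim = len(hidden_states[0])
    sentence_vec = [sum(w * state[d] for w, state in zip(weights, hidden_states))
                    for d in range(dim)]
    return weights, sentence_vec


hidden = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]  # three words, two dimensions
weights, sentence_vec = word_attention(hidden, query=[1.0, 1.0])
```

In this toy input the second word has the largest hidden activations and therefore receives the largest attention weight, so it dominates the pooled sentence vector.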
Preferably, the method for coding and denoising semantic features of a sentence based on a Bi-LSTM model of a sentence-level attention mechanism to obtain a sentence set feature coding vector includes:
For the training sentence corpus, it is assumed that at least one of the sentences of each entity pair reflects the relation of that pair. The sentences containing the entity pair are selected and packed together, but noisy sentences for the entity pair must be filtered out during training. For example, when extracting the relation "founder", a sentence that contains "Jack Ma" and "Alibaba" but does not express that "Jack Ma" is the "founder" of "Alibaba" is a noisy sentence for "founder". This problem is addressed by a neural network model based on a sentence-set-level attention mechanism, which assigns a weight to each sentence of an entity pair according to a specific relation, so that through continued learning valid sentences obtain higher weights and noisy sentences obtain lower weights.
The model is shown in the upper part of FIG. 3, where $S'_i$ is the feature coding vector of sentence $S_i$ output by the first model, $h_i$ is the feature coding vector of the possible relation of each sentence produced by Bi-LSTM training, and the vector $R_{relation} = e_1 - e_2$ contains the features of the relation R. If a sentence instance expresses the relation R, it should have high similarity to the vector $R_{relation}$, which serves as a similarity constraint for positive training examples. $A_i$ denotes the weight of each sentence. The specific steps are as follows:
Step one: all sentence feature vectors $[S'_1, S'_2, \ldots, S'_n]$ containing the entity pair are input to a Bi-LSTM model to obtain the feature codes at the sentence-set level. For example, when training a classifier for the "founder" relation, the relation triple (Jack Ma, founder, Alibaba) exists in the relational database; under the remote supervision assumption, ($S_i$, Jack Ma, founder, Alibaba) is a positive example of the relation, so the vector weight of that sentence should be high, and the sentence-set-level feature codes are obtained by continually learning from positive sentences.
Step two: each sentence is assigned a sentence-level Attention weight, so that through continued learning valid sentences obtain higher weights and noisy sentences obtain lower weights. Because the core assumption of remote supervision does not always hold, sentences that do not express the relation between the entities may be wrongly labeled; therefore, after the features of the entity pair's sentences are obtained, a selective attention mechanism assigns different weights to the sentences to reduce the influence of invalid sentences.
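A minimal sketch of the sentence-level weighting follows. The text only states that a sentence expressing the relation should be similar to the vector formed as the difference of the two entity vectors; using a dot product as the similarity and a softmax to turn similarities into weights is an assumption, and the name `sentence_attention` and the toy vectors are hypothetical.

```python
import math


def sentence_attention(sentence_vecs, e1, e2):
    """Weight the sentences of one entity pair by similarity to e1 - e2.

    sentence_vecs: list of sentence encodings (same dimension as e1 and e2).
    Returns (weights, bag_vector), where bag_vector is the weighted sum.
    """
    r_relation = [a - b for a, b in zip(e1, e2)]  # translation vector e1 - e2
    # Assumed similarity: dot product between sentence encoding and r_relation.
    scores = [sum(s * r for s, r in zip(vec, r_relation)) for vec in sentence_vecs]
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(r_relation)
    bag_vector = [sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
                  for d in range(dim)]
    return weights, bag_vector


sentences = [[0.9, -0.8], [-0.5, 0.6], [0.1, 0.1]]  # toy sentence encodings
weights, bag_vector = sentence_attention(sentences, e1=[1.0, 0.0], e2=[0.0, 1.0])
```

Here the first sentence points in the same direction as the translation vector and gets the largest weight, while the second points the opposite way and is treated as noise with the smallest weight.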
Preferably, the step of packing the sentence set feature encoding vector and the entity pair translation vector and performing entity pair relationship classification on the obtained packet features includes:
A translation vector $R_{relation}$ is introduced. The sentence vector $S_i$ obtained from the preceding model is the feature code of a sentence and contains the semantic information of the relation R implied by the entity pair. Each instance sentence in the set of sentences packed for an entity pair $(e_1, e_2)$ may express the relation R or some other relation. During model training, a sentence vector that encodes the features of this relation should therefore have very high similarity to the translation vector $R_{relation}$. Here, the sentence-level Attention weight and the translation vector $R_{relation}$ act together on each sentence to reduce the coding impact of invalid sentences.
Multi-example learning on the packet features proceeds as follows. The semantic features encoded in the preceding steps are packed, and the packet features are learned iteratively. Each sample in a labeled packet B is initialized with the label of its packet; the set U is initialized to be empty, and all samples are added to U. The following process is then repeated until an end condition is satisfied: train on the labeled samples in U to obtain a classification function $f_B$; use $f_B$ to predict the labels of all samples, and empty U; for each positively labeled packet, select the sample with the highest $f_B$ prediction score and add it to U; for each negatively labeled packet, likewise select the sample with the highest $f_B$ prediction score and add it to U. Finally, $f_B$ is returned. The packet features obtained by multi-example learning contain the semantic coding information of the possible relation R, namely an implicit semantic feature representation of the possible relation.
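The iterative selection loop described above can be sketched in Python. The classifier is swapped for a deliberately simple stand-in (a score along the direction from the mean negative sample to the mean positive sample), and a fixed number of rounds is used as the end condition; both choices, along with the names `train_scorer` and `multi_instance_train`, are assumptions for illustration and not the patent's model.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def train_scorer(samples):
    """Toy stand-in classifier: score = x . (mean positive - mean negative)."""
    pos = mean([x for x, label in samples if label])
    neg = mean([x for x, label in samples if not label])
    direction = [p - n for p, n in zip(pos, neg)]
    return lambda x: sum(a * b for a, b in zip(x, direction))


def multi_instance_train(bags, rounds=3):
    """bags: list of (instances, bag_label) with at least one bag per label."""
    # Initialisation: every sample inherits the label of its packet.
    selected = [(x, label) for instances, label in bags for x in instances]
    scorer = train_scorer(selected)
    for _ in range(rounds):  # fixed round count stands in for the end condition
        # From each packet keep only the sample with the highest score, then retrain.
        selected = [(max(instances, key=scorer), label) for instances, label in bags]
        scorer = train_scorer(selected)
    return scorer


bags = [
    ([[1.0, 0.0], [0.0, 0.1]], True),
    ([[0.9, 0.2], [-0.2, 0.0]], True),
    ([[-1.0, 0.0], [-0.8, 0.1]], False),
]
scorer = multi_instance_train(bags)
```

After the first round each positive packet is represented only by its most confident sentence, so weakly related sentences such as `[0.0, 0.1]` stop influencing the decision direction.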
Softmax classification is then performed on the obtained packet features: after continued learning, Softmax maps the bag, which contains the sentence-set-level features, to several candidate relation categories. The goal of training here is to maximize classification accuracy.
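As a small illustration of the Softmax step, the sketch below turns one score per candidate relation into a probability ranking. The scores and the function name `rank_relations` are made up for the example; in the described method they would come from the packet features.

```python
import math


def rank_relations(scores, relation_names):
    """Convert raw relation scores to probabilities, sorted from high to low."""
    m = max(scores)
    exps = [math.exp(z - m) for z in scores]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(relation_names, probs), key=lambda item: item[1], reverse=True)


# Hypothetical scores for one entity pair.
ranking = rank_relations([2.0, 0.5, -1.0], ["founder", "CEO", "Boss"])
```

The first entry of `ranking` is the predicted relation, and the remaining entries give the lower-ranked candidates with their probabilities.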
Model training uses the relation categories (relationship.txt), training data (train.txt), test data (test.txt) and word vectors (vec.txt). The training and test data may be obtained by randomly shuffling the raw data and splitting it into 80% for training and 20% for testing. The hyper-parameters are adjusted to achieve the best prediction of the predefined relations, and the final output is the probability value of each candidate relation for the same entity pair.
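The random 80%/20% split can be sketched as follows. The function name `split_train_test` and the fixed seed are assumptions; reading and writing train.txt and test.txt is left out because the file formats are not specified.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into training and test portions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = split_train_test(range(10))
```

With ten records this yields eight training records and two test records, with no record appearing in both parts.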
Example 2
Based on the same inventive concept, the invention also provides a remote-supervised Dual-Attention relation classification system, which comprises:
the building module is used for aligning the entity pairs in the knowledge base to news corpora through remote supervision and building an entity-to-sentence set;
the first vector module is used for carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
the second vector module is used for coding and denoising the semantic features of the sentence based on a Bi-LSTM model of the sentence level attention mechanism to obtain a sentence set feature coding vector;
and the relation classification module is used for packing the sentence set feature coding vector and the entity pair translation vector and carrying out entity pair relation classification on the obtained packet features.
Preferably, the building block comprises:
and processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence.
Inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector, and inputting the word vector into the Bi-LSTM model;
the forward LSTM of the model obtains the above feature information of the word vector, and the backward LSTM of the model obtains the below feature information of the word vector;
and finally, obtaining the context coding vector of the word vector.
Adding a word-level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence, wherein the word-level attention mechanism is added into the coding vector;
connecting each time node in the LSTM by a weight vector by calculating attention probability distribution;
and obtaining a semantic feature coding vector of each sentence.
Preferably, the first vector module comprises:
inputting the semantic feature coding vector of the sentence into a Bi-LSTM model to obtain a feature coding vector of a sentence set;
adding a sentence level attention mechanism into the feature coding vector of the sentence set to obtain a denoised sentence set feature coding vector, and adding a sentence level attention mechanism weight into each sentence to ensure that the weight of an effective sentence is great and the weight of a noise sentence is small;
and obtaining the noise-reduced sentence set feature coding vector.
The second vector module includes:
introducing a translation vector of an entity pair translation model, giving different weights to sentences with different confidence degrees, and reducing the noise of a sentence set;
introducing the difference value of the entity pair vector as another feature of the similarity measurement sentence into a sentence set to obtain a packet feature;
carrying out relation classification on the packet features by using a multi-example learning method, wherein if at least one example that the label is judged to be positive by the classifier exists in one packet, the sentence in the packet is positive example data; if all sentences in a packet are judged to be negative by the classifier, the sentences in the packet are negative example data;
carrying out multi-example learning on the sentence with the tag to obtain a feature representation containing multiple feature relation information;
and predicting which relationship of the entity pair is given by a Softmax relationship classification method to obtain the probability sequence of each relationship.
Example 3
The knowledge base WikiData contains the entity pair "Jack Ma" and "Alibaba" together with a corresponding relation set such as "founder" and "CEO". Several sentences containing both "Jack Ma" and "Alibaba" are collected from Internet data; four sentences in which the entities co-occur are given here as examples.
Sentence 1: "Fe executives are Alibaba's secret sugar, founder Jack Ma says."
Sentence 2: "At a conference hosted by All Things D last week, Alibaba CEO Jack Ma said that he was interested in Yahoo."
Sentence 3: "Internet entrepreneur Jack Ma started a chip version of the Yellow Pages that was Alibaba's precursor in Hangzhou, China."
Sentence 4: "Alibaba has brought more small U.S. businesses on to the company's sites, but this is the first time Ma has divided specific targets."
Sentences 3 and 4 do not express a predefined relation of the knowledge base. One purpose of the invention is to train the model on a large number of sentences in which entities co-occur, and thereby classify the relation of the co-occurring entities in a sentence and compute its probability. In the model output, the relation "founder" has the highest probability for sentence 1, followed by the other relations. For sentence 2, the relation "CEO" has the highest probability, followed by the other relations. Given a sufficiently large training corpus, the model can also determine the relation for sentences 3 and 4 from the maximum of the probabilities of the possible relations, thereby classifying the relation of the co-occurring entity pair in those sentences.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A remotely supervised Dual-Attention relationship classification method, characterized by comprising the following steps:
aligning entity pairs in a knowledge base to news corpora through remote supervision, and constructing an entity-to-sentence set;
carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
coding and denoising the semantic features of the sentence based on a Bi-LSTM model of a sentence-level attention mechanism to obtain a sentence set feature coding vector;
packing the sentence set feature coding vector and the entity pair translation vector, and carrying out entity pair relation classification on the obtained packet features;
wherein carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence comprises:
processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence;
inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector;
and adding a word level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence.
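The alignment step of claim 1 (aligning knowledge-base entity pairs to a corpus and building a sentence set per entity pair) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `align`, the triple format, and the naive substring matching are all hypothetical, and a practical system would presumably use proper entity recognition and linking.

```python
from collections import defaultdict

def align(kb_triples, sentences):
    """Group sentences by the knowledge-base entity pairs they mention.

    kb_triples: iterable of (head, relation, tail) tuples (assumed format).
    sentences:  iterable of raw sentence strings.
    Returns {(head, tail): {"relations": set, "sentences": list}}.
    """
    bags = defaultdict(lambda: {"relations": set(), "sentences": []})
    for head, relation, tail in kb_triples:
        bags[(head, tail)]["relations"].add(relation)
    for sentence in sentences:
        for (head, tail), bag in bags.items():
            # Naive substring match (assumption); stands in for entity linking.
            if head in sentence and tail in sentence:
                bag["sentences"].append(sentence)
    return dict(bags)

# Hypothetical knowledge-base triples and corpus sentences.
kb = [("Jack Ma", "founder", "Alibaba"), ("Jack Ma", "CEO", "Alibaba")]
corpus = [
    "Alibaba CEO Jack Ma said that he was interested in Yahoo.",
    "Yahoo reported its quarterly earnings last week.",
]
aligned = align(kb, corpus)
print(aligned[("Jack Ma", "Alibaba")])
```

Every sentence that mentions both entities is assigned to the pair regardless of whether it actually expresses a knowledge-base relation, which is why the later attention steps are needed to reduce noise.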
2. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein inputting the word vector into a Bi-LSTM model to obtain an encoding vector of the word vector, comprises:
inputting the word vector into a Bi-LSTM model;
the forward LSTM of the model obtains the preceding-context feature information of the word vector, and the backward LSTM of the model obtains the following-context feature information of the word vector;
and finally, obtaining the context coding vector of the word vector.
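The idea of combining a forward pass and a backward pass into one context coding vector per word can be sketched as below. Note the substitution: a toy tanh recurrence with fixed scalar weights stands in for each LSTM direction (no gates, no cell state, no learned parameters), and concatenation of the two directions is an assumption; the word vectors are hypothetical.

```python
import math

def run_direction(word_vectors, weight_in=0.5, weight_rec=0.5):
    """Run a toy tanh recurrence over the sequence; returns one state per word.

    This is a stand-in for one LSTM direction, not an actual LSTM.
    """
    dim = len(word_vectors[0])
    state = [0.0] * dim
    states = []
    for vec in word_vectors:
        state = [math.tanh(weight_in * x + weight_rec * h)
                 for x, h in zip(vec, state)]
        states.append(state)
    return states

def bidirectional_encode(word_vectors):
    """Concatenate forward and backward states for every word position."""
    forward = run_direction(word_vectors)
    # Process the reversed sequence, then flip back to the original word order.
    backward = run_direction(word_vectors[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]

words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 2-d word vectors
encoded = bidirectional_encode(words)
print(len(encoded), len(encoded[0]))
```

Each output vector has twice the input dimension: the first half summarizes the words up to that position, the second half the words from that position onward.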
3. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein adding a word level Attention mechanism to the coding vector of the word vector to obtain a semantic feature coding vector of each sentence, comprises:
adding a word-level attention mechanism to the coding vector;
calculating an attention probability distribution and connecting each time step of the LSTM through a weight vector;
and obtaining a semantic feature coding vector of each sentence.
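A word-level attention step of this kind can be sketched as follows. The scoring form used here (score_t = w · tanh(h_t), followed by Softmax and a weighted sum) is a common formulation and is an assumption, since the claim does not fix the exact formula; the hidden states and the weight vector `w` are hypothetical values.

```python
import math

def softmax(scores):
    """Normalize scores into an attention probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def word_attention(hidden_states, weight_vector):
    """Pool per-word hidden states into one sentence vector.

    score_t = w . tanh(h_t); alpha = softmax(score); sentence = sum_t alpha_t * h_t
    """
    scores = [sum(w * math.tanh(h) for w, h in zip(weight_vector, state))
              for state in hidden_states]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    sentence = [sum(a * state[i] for a, state in zip(alphas, hidden_states))
                for i in range(dim)]
    return alphas, sentence

# Hypothetical hidden states for a three-word sentence and a weight vector.
hidden = [[0.1, 0.2], [0.9, 0.8], [0.0, -0.1]]
w = [1.0, 1.0]
alphas, sentence_vec = word_attention(hidden, w)
print([round(a, 3) for a in alphas])
```

Words whose hidden states score higher against the weight vector contribute more to the sentence vector; here the second word receives the largest attention weight.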
4. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein the Bi-LSTM model based on the sentence level Attention mechanism encodes and de-noises semantic features of the sentence to obtain a sentence set feature encoding vector, comprising:
inputting the semantic feature coding vector of the sentence into a Bi-LSTM model to obtain a feature coding vector of a sentence set;
and adding a sentence level attention mechanism into the feature coding vector of the sentence set to obtain the noise-reduced sentence set feature coding vector.
5. The remotely supervised Dual-Attention relationship classification method of claim 4, wherein adding a sentence level Attention mechanism to the feature coding vectors of the sentence set to obtain denoised sentence set feature coding vectors comprises:
adding a sentence-level attention weight to each sentence, so that effective sentences receive large weights and noisy sentences receive small weights;
and obtaining the noise-reduced sentence set feature coding vector.
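The weighting described in claims 4 and 5 can be illustrated with a short sketch. The dot-product scoring against a query vector is an assumption made for illustration, and the sentence vectors and the query are hypothetical; the sketch only shows how a sentence that disagrees with the query ends up with a small weight in the pooled sentence-set vector.

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sentence_attention(sentence_vectors, query):
    """Weight each sentence by its dot product with a query vector and
    pool the sentences into one sentence-set vector."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in sentence_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, sentence_vectors))
              for i in range(dim)]
    return weights, pooled

# Hypothetical sentence vectors; the third one plays the noisy sentence.
sentence_vecs = [[1.0, 0.9], [0.8, 1.0], [-1.0, -0.7]]
query = [1.0, 1.0]  # hypothetical relation query vector (assumption)
weights, set_vec = sentence_attention(sentence_vecs, query)
print([round(w, 3) for w in weights])
```

Because the noisy sentence receives a weight close to zero, it has little influence on the pooled vector, which is the sense in which the sentence set is denoised.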
6. The remotely supervised Dual-Attention relationship classification method of claim 1, wherein the sentence set feature coding vector and the entity pair translation vector are packed, and the obtained package features are subjected to entity pair relationship classification, comprising:
introducing a translation vector of an entity pair translation model, giving different weights to sentences with different confidence degrees, and reducing the noise of a sentence set;
introducing the difference of the entity pair vectors into the sentence set as a further feature for measuring sentence similarity, to obtain a packet feature;
and carrying out relation classification on the packet features by using a multi-instance learning method.
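The packing step of claim 6 can be sketched under explicit assumptions. The sketch assumes a TransE-style translation model, in which head + relation ≈ tail so that the entity-pair difference tail − head approximates the relation, and it assumes that packing means concatenation; neither detail is confirmed by the claim text. All vectors are hypothetical.

```python
def translation_vector(head_vec, tail_vec):
    """Entity-pair difference tail - head (TransE-style assumption)."""
    return [t - h for h, t in zip(head_vec, tail_vec)]

def packet_feature(sentence_set_vec, head_vec, tail_vec):
    """Concatenate the sentence-set vector with the entity-pair difference."""
    return sentence_set_vec + translation_vector(head_vec, tail_vec)

# Hypothetical 3-d embeddings for the two entities and the sentence set.
head = [0.2, 0.1, 0.4]
tail = [0.7, 0.0, 0.9]
sentence_set = [0.3, -0.2, 0.5]

feature = packet_feature(sentence_set, head, tail)
print(feature)
```

The resulting packet feature carries both the text-derived evidence and the knowledge-base-derived difference vector into the relation classifier.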
7. The remotely supervised Dual-Attention relationship classification method of claim 6, wherein the relationship classification of the package features using multi-instance learning comprises:
if at least one sentence in a packet is judged to be positive by the classifier, the sentences in the packet are positive example data; if all sentences in a packet are judged to be negative by the classifier, the sentences in the packet are negative example data;
carrying out multi-instance learning on the labeled sentences to obtain a feature representation containing information on multiple relations;
and predicting the relationship of the entity pair with a Softmax relationship classification method to obtain a probability ranking of the relationships.
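The packet labeling rule of claim 7 (a packet is positive if at least one of its sentences is judged positive, and negative only if all are judged negative) can be sketched as follows. The per-sentence judgments are hypothetical values for illustration, not classifier output.

```python
def packet_label(sentence_judgments):
    """Return True if any sentence in the packet is judged positive.

    An empty packet has no positive sentence and is therefore negative.
    """
    return any(sentence_judgments)

# Hypothetical per-sentence classifier judgments for two entity pairs.
packets = {
    ("Jack Ma", "Alibaba"): [True, True, False, False],
    ("Jack Ma", "Yahoo"): [False, False],
}
labels = {pair: packet_label(judgments) for pair, judgments in packets.items()}
print(labels)
```

Labeling at the packet level rather than the sentence level means that individual mislabeled sentences do not have to be identified before training.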
8. A remotely supervised Dual-Attention relationship classification system, comprising:
the building module is used for aligning the entity pairs in the knowledge base to news corpora through remote supervision and building an entity-to-sentence set;
the first vector module is used for carrying out word-level vector coding on the sentence based on a Bi-LSTM model of a word-level attention mechanism to obtain a semantic feature coding vector of the sentence;
the second vector module is used for coding and denoising the semantic features of the sentence based on a Bi-LSTM model of the sentence level attention mechanism to obtain a sentence set feature coding vector;
the relation classification module is used for packing the sentence set feature coding vector and the entity pair translation vector and carrying out entity pair relation classification on the obtained packet features;
wherein, the first vector module is specifically configured to:
processing the sentence by adopting a text depth representation model to obtain a word vector of each word in the sentence;
inputting the word vector into a Bi-LSTM model to obtain a coding vector of the word vector;
and adding a word level attention mechanism into the coding vector of the word vector to obtain a semantic feature coding vector of each sentence.
CN201810432079.3A 2018-05-08 2018-05-08 Remote supervision Dual-Attention relation classification method and system Active CN108829722B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810432079.3A CN108829722B (en) 2018-05-08 2018-05-08 Remote supervision Dual-Attention relation classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810432079.3A CN108829722B (en) 2018-05-08 2018-05-08 Remote supervision Dual-Attention relation classification method and system

Publications (2)

Publication Number Publication Date
CN108829722A CN108829722A (en) 2018-11-16
CN108829722B true CN108829722B (en) 2020-10-02

Family

ID=64148408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810432079.3A Active CN108829722B (en) 2018-05-08 2018-05-08 Remote supervision Dual-Attention relation classification method and system

Country Status (1)

Country Link
CN (1) CN108829722B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543047A (en) * 2018-11-21 2019-03-29 焦点科技股份有限公司 A kind of knowledge mapping construction method based on medical field website
CN109635124B (en) * 2018-11-30 2021-04-23 北京大学 Remote supervision relation extraction method combined with background knowledge
CN109871533B (en) * 2019-01-04 2019-12-10 北京车慧科技有限公司 Corpus processing system based on corpus field
CN109871451B (en) * 2019-01-25 2021-03-19 中译语通科技股份有限公司 Method and system for extracting relation of dynamic word vectors
CN109933809B (en) * 2019-03-15 2023-09-15 北京金山数字娱乐科技有限公司 Translation method and device, and training method and device of translation model
CN110070093B (en) * 2019-04-08 2023-04-25 东南大学 Remote supervision relation extraction denoising method based on countermeasure learning
US11069346B2 (en) * 2019-04-22 2021-07-20 International Business Machines Corporation Intent recognition model creation from randomized intent vector proximities
CN110175469B (en) * 2019-05-16 2020-11-17 山东大学 Social media user privacy leakage detection method, system, device and medium
CN110209836B (en) * 2019-05-17 2022-04-26 北京邮电大学 Remote supervision relation extraction method and device
CN110298036B (en) * 2019-06-06 2022-07-22 昆明理工大学 Online medical text symptom identification method based on part-of-speech incremental iteration
CN110597948A (en) * 2019-07-11 2019-12-20 东华大学 Entity relation extraction method based on deep learning
CN110555084B (en) * 2019-08-26 2023-01-24 电子科技大学 Remote supervision relation classification method based on PCNN and multi-layer attention
CN110619121B (en) * 2019-09-18 2023-04-07 江南大学 Entity relation extraction method based on improved depth residual error network and attention mechanism
CN110781312B (en) * 2019-09-19 2022-07-15 平安科技(深圳)有限公司 Text classification method and device based on semantic representation model and computer equipment
CN110647620B (en) * 2019-09-23 2022-07-01 中国农业大学 Knowledge graph representation learning method based on confidence hyperplane and dictionary information
CN111125364B (en) * 2019-12-24 2023-04-25 华南理工大学 ERNIE-based noise reduction method for remote supervision relation extraction
CN111159407B (en) * 2019-12-30 2022-01-28 北京明朝万达科技股份有限公司 Method, apparatus, device and medium for training entity recognition and relation classification model
CN111241303A (en) * 2020-01-16 2020-06-05 东方红卫星移动通信有限公司 Remote supervision relation extraction method for large-scale unstructured text data
CN111324743A (en) * 2020-02-14 2020-06-23 平安科技(深圳)有限公司 Text relation extraction method and device, computer equipment and storage medium
CN111368026B (en) * 2020-02-25 2020-11-24 杭州电子科技大学 Text inclusion analysis method based on word meaning relation and dynamic convolution neural network
CN111091007A (en) * 2020-03-23 2020-05-01 杭州有数金融信息服务有限公司 Method for identifying relationships among multiple enterprises based on public sentiment and enterprise portrait
CN111737440B (en) * 2020-07-31 2021-03-05 支付宝(杭州)信息技术有限公司 Question generation method and device
CN112307130B (en) * 2020-10-21 2022-07-05 清华大学 Document-level remote supervision relation extraction method and system
CN112329463A (en) * 2020-11-27 2021-02-05 上海汽车集团股份有限公司 Training method of remote monitoring relation extraction model and related device
CN112507137B (en) * 2020-12-17 2022-04-22 华南理工大学 Small sample relation extraction method based on granularity perception in open environment and application
CN112579792B (en) * 2020-12-22 2023-08-04 东北大学 PGAT and FTATT-based remote supervision relation extraction method
CN113268985B (en) * 2021-04-26 2023-06-20 华南理工大学 Relationship path-based remote supervision relationship extraction method, device and medium
CN113591478B (en) * 2021-06-08 2023-04-18 电子科技大学 Remote supervision text entity relation extraction method based on deep reinforcement learning
CN114218956A (en) * 2022-01-24 2022-03-22 平安科技(深圳)有限公司 Relation extraction method and system based on neural network and remote supervision

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8311874B2 (en) * 2005-03-31 2012-11-13 Oracle Financial Services Software Limited Systems and methods for customer relationship evaluation and resource allocation
CN106294593A (en) * 2016-07-28 2017-01-04 浙江大学 Relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning
CN106407211A (en) * 2015-07-30 2017-02-15 富士通株式会社 Method and device for classifying semantic relationships among entity words
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 Video behavior recognition method based on bidirectional long short-term memory units
CN107180247A (en) * 2017-05-19 2017-09-19 中国人民解放军国防科学技术大学 Relation classifier and method based on selective attention convolutional neural networks
CN107562752A (en) * 2016-06-30 2018-01-09 富士通株式会社 Method, apparatus and electronic device for classifying semantic relations of entity words
CN107578106A (en) * 2017-09-18 2018-01-12 中国科学技术大学 Neural network natural language inference method fusing word semantic knowledge

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107978373A (en) * 2017-11-23 2018-05-01 吉林大学 A kind of semi-supervised biomedical event extraction method based on common training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Attention-based BiLSTM-CNN stance detection model for Chinese microblogs"; Bai Jing et al.; Computer Applications and Software; 2018-03-31; pp. 266-274 *

Also Published As

Publication number Publication date
CN108829722A (en) 2018-11-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant