CN106970981B - Method for constructing relation extraction model based on transfer matrix - Google Patents

Method for constructing relation extraction model based on transfer matrix

Info

Publication number
CN106970981B
CN106970981B (granted patent of application CN201710193366.9A)
Authority
CN
China
Prior art keywords
relationship
training
extraction model
sentences
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710193366.9A
Other languages
Chinese (zh)
Other versions
CN106970981A (en)
Inventor
罗炳峰 (Bingfeng Luo)
冯岩松 (Yansong Feng)
贾爱霞 (Aixia Jia)
赵东岩 (Dongyan Zhao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201710193366.9A
Publication of CN106970981A
Application granted
Publication of CN106970981B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 — Information retrieval of unstructured textual data
    • G06F16/36 — Creation of semantic tools, e.g. ontology or thesauri
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 — Handling natural language data
    • G06F40/20 — Natural language analysis
    • G06F40/279 — Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for constructing a relation extraction model based on a transfer matrix. The method comprises the following steps: 1) select a basic relation extraction model M, whose input is a sentence, or a group of sentences describing the same pair of subject and object, and whose output is a distribution p_i over the relations described by the input; as an intermediate result, M generates a vector representation s_i of the input sentence or group of sentences; 2) from s_i, construct a transition matrix T_i; 3) multiply the relation distribution p_i output by M by the transition matrix T_i and normalize, obtaining the distribution o_i of relations the input sentence or group of sentences may be labeled as; 4) train the basic relation extraction model M with o_i fitting the noisy label as the target, until a preset termination condition is reached, yielding the relation extraction model. The model is thereby shielded from the influence of noise, so a better relation extraction effect can be obtained.

Description

Method for constructing relation extraction model based on transfer matrix
Technical Field
The invention relates to a method that uses a transfer matrix to strengthen a relation extractor's resistance to noisy data during training, thereby improving relation extraction performance; it belongs to the field of information extraction.
Background
With the development of information technology and the Internet, ever more textual information is available. How to automatically construct a knowledge base from large amounts of text has therefore become a very important problem, as it allows computers to better exploit the information these texts contain.
A knowledge base generally consists of triples of the form (subject, predicate, object); for example, the triple (China, capital, Beijing) encodes the knowledge that the capital of China is Beijing. Automatically building a knowledge base is thus the process of automatically generating such triples. Relation extraction addresses the problem of automatically identifying the relationship between two items described in text (an item may be an entity such as "China", or a time, a numeric value, etc.), so that triples can be formed and added to the knowledge base.
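As a minimal illustration of this triple format (the helper function and facts below are only for the example, not part of the patent), a knowledge base can be modeled as a set of (subject, predicate, object) tuples, and relation extraction fills the predicate slot for a given subject-object pair:

```python
# A knowledge base as a set of (subject, predicate, object) triples.
# The example facts mirror the ones given in the text.
kb = {
    ("China", "capital", "Beijing"),
    ("Baidu", "inception date", "2000-01-01"),
}

def relations_between(kb, subj, obj):
    """Return all predicates the knowledge base records between subj and obj."""
    return {p for (s, p, o) in kb if s == subj and o == obj}
```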
The data used for the relation extraction task is mainly constructed by distant supervision: sentences that may express a piece of seed knowledge are retrieved automatically using that seed knowledge, and the resulting noisy data is used to train a relation extraction model. The benefit of this approach is that a large amount of training data can be acquired at low cost; the disadvantage is that a large portion of the data set is noisy. Manually labeled data may also contain noise: an annotator may miss some instances out of carelessness, or may be unable to tell whether a sentence describes a relationship for lack of domain knowledge. Since noisy data significantly affects model training, the quality of a relation extraction model depends largely on its resistance to noise.
Disclosure of Invention
The invention aims to provide a method for constructing a relation extraction model that is strongly resistant to noise in the training data. The input of the relation extraction task can be a single sentence, in which case the task is to judge the relationship between the target subject and target object described by that sentence; or a group of sentences, each containing the target subject and target object, in which case the relationship is judged jointly from the whole group. Assuming |C| relations are to be extracted, a |C| × |C| transition matrix T is constructed, where element T_ij represents the probability that the true relation expressed by an input sentence (or input group of sentences) is i but it is incorrectly labeled as relation j. Thus, given a basic relation extraction model M with output relation distribution p, instead of fitting p directly to the noisy label during training, the invention uses the transfer matrix T to convert p into the distribution o of relations the input may be labeled as, and uses o to fit the noisy label. Through this explicit modeling of the noise, the basic relation extractor M is shielded from the influence of noise during training, so a better relation extraction effect can be obtained.
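The core step described above can be sketched in a few lines of NumPy (an illustrative sketch, not the patented implementation): the predicted distribution p is mapped through T and renormalized before being compared with the noisy label.

```python
import numpy as np

def observed_distribution(p, T):
    """Map the predicted relation distribution p (shape [C]) through the
    transition matrix T (shape [C, C], where T[i, j] = P(labeled j | true i))
    to the distribution o of relations the input may be *labeled* as."""
    o = p @ T            # o_j = sum_i p_i * T[i, j]
    return o / o.sum()   # renormalize to a probability distribution

p = np.array([0.7, 0.2, 0.1])
assert np.allclose(observed_distribution(p, np.eye(3)), p)  # no noise: o == p
```

When T is the identity matrix the labels are assumed noise-free and o equals p; off-diagonal mass in T moves probability toward relations the input tends to be mislabeled as.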
In order to achieve the purpose, the technical scheme of the invention is as follows:
(1) Select a basic relation extraction model M that can generate a vector representation of an input sentence (or an overall vector representation of an input group of sentences). For the sentence i to be processed (or the ith group of sentences to be processed), M outputs a distribution p_i over the relations described, and generates as an intermediate result a vector representation s_i of the sentence (or group).
(2) From the vector representation s_i of the current sentence i to be processed (or the ith group of sentences to be processed), construct a transition matrix T_i. Its element T^i_jk (row j, column k of T_i) represents the probability that, if the relation expressed by sentence i (or group i) is j, it is incorrectly labeled as k. The transition matrix can thus be regarded as a model of the noise pattern of the input sentence (or input group of sentences).
(3) During training, multiply the relation distribution p_i predicted by the basic relation extraction model M for the input sentence (or input group of sentences) by the transition matrix T_i and normalize, obtaining the distribution o_i of relations the input may be labeled as; then train with o_i fitting the noisy label as the target, until a preset termination condition is reached (for example, a pre-specified number of training epochs, or only marginal improvement in extraction quality over the previous epoch).
(4) Through step (3), the basic relation extraction model M is fully trained. Owing to the introduction of the transfer matrix, M is shielded from the influence of noise during training and therefore achieves a better relation extraction effect. At inference time, the prediction of M is used directly and the transfer matrix is not used.
In step (1), although the invention requires that the basic relation extraction model M be able to generate a vector representation of an input sentence (or an overall vector representation of an input group of sentences), in practice most relation extraction models meet this requirement. For a traditional model based on manually designed feature templates, the sentence vector can be the vector of extracted feature components; for neural network models, the layer just before the output layer can generally serve as the sentence representation (other layers that model the entire sentence can also be used). The overall representation of a group of sentences may be a weighted average of the individual sentence vectors, or a weighted average taken after each sentence vector has been further processed by a recurrent neural network.
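The bag-level representation mentioned above can be sketched as follows (a minimal illustration; the per-sentence scores, e.g. attention logits, are an assumption and would come from a learned scoring function):

```python
import numpy as np

def bag_representation(sentence_vecs, scores):
    """Overall vector of a group of sentences: a weighted average of the
    per-sentence vectors, with weights given by a softmax over scores."""
    scores = np.asarray(scores, dtype=float)
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ np.asarray(sentence_vecs, dtype=float)
```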
In step (2), the mapping from the sentence vector representation s_i (or overall vector representation of a group of sentences) to the transition matrix T_i can be expressed as:

T^i_jk = exp(w_jk^T · s_i + b) / Σ_{k'=1}^{|C|} exp(w_jk'^T · s_i + b)

where w_jk is the parameter vector used to compute the entry in row j, column k of the transition matrix T_i, the superscript T denotes transpose, and b is a bias term. w_jk and b are initialized randomly before training begins and updated by back-propagation during training.
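Under this formula, each row of T_i is a softmax over scores that depend on the sentence vector s_i; a minimal sketch (the parameter shapes are assumptions for illustration):

```python
import numpy as np

def transition_matrix(s, W, b):
    """T[j, k] = exp(w_jk^T s + b) / sum_k' exp(w_jk'^T s + b).
    s: sentence vector, shape [d]; W: shape [C, C, d], one parameter
    vector w_jk per matrix cell; b: scalar bias. Each row sums to 1."""
    logits = W @ s + b                           # shape [C, C]
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)      # row-wise softmax
```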
In addition, a model may sometimes generate one vector representation s_il per relation l for an input sentence (or input group of sentences). In this case the transition matrix T_i can be generated by:

T^i_jk = exp(s_ij^T · w_k + b_j) / Σ_{k'=1}^{|C|} exp(s_ij^T · w_k' + b_j)

where s_ij is the vector representation of sentence i (or the ith group of sentences) for relation j, w_k is the weight vector of relation k, the superscript T denotes transpose, and b_j is the bias term of relation j. w_k and b_j are initialized randomly before training begins and updated by back-propagation during training.
Moreover, the vector representation s_i of sentence i (or the overall vector representation of the ith group of sentences) may first be passed through several fully connected layers to obtain a new representation s'_i, from which the transition matrix is then generated with the formula above.
In step (3), two different training modes may be used depending on whether the training data may be divided into several subsets of different noise levels.
1) If the training data cannot be further divided by noise level, a progressive (curriculum) training approach is needed. The loss function is:

loss = Σ_{i=1}^{N} [ α · loss_p(i) + (1 − α) · loss_o(i) − β · Trace(T_i) ]

where N is the number of samples in the training set (a sample can be a sentence or a group of sentences), loss_p(i) is the error of the output p_i of the basic relation extractor M against the noisy label of sample i, loss_o(i) is the error of the distribution o_i of relations sample i may be labeled as against the noisy label, Trace(T_i) is the trace of the transition matrix T_i, α is a real number between 0 and 1, and β is a real coefficient. Any piecewise-differentiable function that measures the difference between a predicted relation distribution and the labeled relation, including cross entropy, can be used as the error. Trace(T_i) serves as a regularization term: since each row of T_i sums to 1, and in the noise-free case T_i should be the identity matrix, controlling the trace of T_i (the sum of its diagonal elements) is equivalent to controlling how strongly the transition matrix models the noise. A large β pushes T_i toward the identity matrix, while a small (or negative) β encourages T_i to model the noise.
At the beginning of training, α is set to 1 and β to a relatively large positive number; that is, noise modeling is initially discouraged, and the basic relation extraction model M is instead expected to quickly learn basic classification ability from the noisy labels. Then, by gradually reducing α and β, the importance of noise modeling is gradually emphasized over the course of training, reducing the influence of noise on the training of M. Any gradient-based optimization method, including stochastic gradient descent, can be used. Training stops when a preset termination condition is reached; the condition may be reaching a certain number of epochs, or no further significant improvement of extraction quality on the development set.
2) If the training data can be divided into subsets (TD_1, TD_2, …) ordered from low to high noise, the following loss function may be used:

loss = Σ_{i=1}^{S} Σ_{j=1}^{N_i} [ loss_o(i, j) − β_i · Trace(T_ij) ]

where S is the number of subsets, N_i is the number of samples in subset TD_i (a sample can be a sentence or a group of sentences), loss_o(i, j) is the error of the distribution o_ij of relations the jth sample of TD_i may be labeled as against the noisy label, and β_i is the coefficient of the regularization term on the trace of the transition matrix T_ij for subset TD_i. Since prior knowledge of each subset's noise level lets the noise-modeling strength of the transition matrix be specified per subset in advance, loss_p is not needed here and loss_o can be optimized directly. Specifically, for a subset with little noise, β_i may be set to a relatively large positive number; for a subset with heavy noise, β_i may be set to a negative or relatively small positive number.
The loss functions above can use any piecewise-differentiable function that measures the difference between the predicted relation distribution and the labeled relation, including cross entropy, and any gradient-based optimization method, including stochastic gradient descent. During training, besides training on all subsets at once, one can also start on the least noisy subset and then add the remaining subsets to the training set one by one in order of increasing noise. The termination condition can be reaching a certain number of epochs, or no further significant improvement of extraction quality on the development set.
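The per-subset loss above drops the loss_p term and fixes a trace coefficient β_i per subset; a sketch under the same cross-entropy assumption (the data layout used here is illustrative):

```python
import numpy as np

def subset_loss(subsets, betas):
    """subsets: list over TD_1..TD_S; each subset is a list of
    (o, T, label) triples for its samples. betas[i] is the trace
    coefficient for TD_i: large and positive for reliable subsets,
    small or negative for noisy ones."""
    eps, total = 1e-12, 0.0
    for subset, beta in zip(subsets, betas):
        for o, T, label in subset:
            total += -np.log(o[label] + eps) - beta * np.trace(T)
    return total
```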
In step (4), since the basic relation extraction model M is in effect fitted to the latent true relation distribution during training, its prediction can be used directly at inference time.
Compared with the prior art, the invention has the following positive effects:
during training, the basic relationship extraction model M does not need to be directly fitted with a noisy label, but the distribution of the relationship described by the input sentence (or a group of input sentences) predicted by the model M is connected with the noisy label through a transfer matrix, so that the denoising effect is achieved. Compared with the existing method for directly fitting noisy labels with the output of M, the method disclosed by the invention has the advantages that the training of the basic relationship extraction model M is free from the influence of noise, the generation of biased models is avoided, and a better relationship extraction effect can be achieved.
Fig. 2 shows the effect of the invention on extracting relations whose object is a time (see the examples described below); extraction quality is shown as a precision-recall curve, where a higher curve means a better result. The data set can be divided into 3 subsets of different noise levels. Training first on reliable subsets and then on unreliable ones ("data sets trained in sequence") is clearly better than training on all subsets mixed together ("mixed data set"), showing that noise in the data set strongly affects model training. After applying the disclosed transfer-matrix method ("trained in sequence + transfer matrix"), the relation extraction effect improves further, showing that the method effectively strengthens the basic relation extraction model's resistance to noise during training and thus markedly improves extraction quality.
Drawings
FIG. 1 is a block diagram of a relationship extraction method in an embodiment of the invention;
fig. 2 is a diagram illustrating an extraction effect of the relationship extraction method according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention extracts relations whose object is a time, based on Wikidata, an open encyclopedic knowledge platform, with Wikipedia as the corpus. It will be apparent to those skilled in the art that other relation sets and other corpora may be used in a particular implementation.
Specifically, there are 12 types of relationships to be extracted in this embodiment, including a birth date, a death date, an organization establishment date, a work publication date, a spacecraft launch date, and the like. The data set construction process is as follows:
1) collect triples in Wikidata containing relations to be extracted, such as (Baidu, inception date, January 1, 2000);
2) for each triple, find all sentences in Wikipedia containing both its subject and object, such as "Robin Li (李彦宏) founded Baidu on January 1, 2000";
3) sentences containing both the subject and the object of a triple are regarded as natural-language descriptions of that triple. This assumption is imperfect, and exceptions necessarily occur. The invention therefore further assumes that the finer the temporal granularity mentioned in a sentence, the more likely it is that the sentence describes the triple. For example, a sentence containing both "January 1, 2000" and "Baidu" is more likely to describe the triple (Baidu, inception date, January 1, 2000) than a sentence containing only "2000" and "Baidu". Following this principle, the data set is divided into three subsets of decreasing reliability: sentences containing year-month-day, containing year-month, and containing only the year;
4) part of the triples of the year-month-day subset are held out as the test set, and the remaining triples form the training set (sentences in the other two subsets that relate to test-set triples are removed as well).
FIG. 1 is a diagram of a framework of a relationship extraction method based on a transition matrix according to an embodiment of the present invention;
step 1: a vector representation of the sentence to be processed is generated.
For an input sentence, a vector representation is generated first. Each word in the sentence is converted into its word vector, and a vector representation of the sentence is then generated by a convolutional neural network (see Daojian Zeng, Kang Liu, Yubo Chen, Jun Zhao. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. EMNLP 2015).
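A toy version of such a convolutional sentence encoder can be sketched as follows (window size, dimensions, and the flat filter layout are illustrative assumptions, and the piecewise pooling of the cited paper is simplified to plain max-pooling):

```python
import numpy as np

def cnn_sentence_vector(word_vecs, filters, window=3):
    """word_vecs: [n_words, d] word embeddings; filters: [n_filters, window * d].
    Slide a window over the sentence, apply every filter to each window,
    and max-pool over positions to get a fixed-size sentence vector."""
    n, d = word_vecs.shape
    feats = []
    for start in range(n - window + 1):
        segment = word_vecs[start:start + window].reshape(-1)  # [window * d]
        feats.append(filters @ segment)                        # [n_filters]
    return np.max(feats, axis=0)                               # max over positions
```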
Step 2: a distribution of the relationships described by the sentence is generated using a base relationship extractor.
Having obtained the vector representation s_i of the sentence, the predicted relation distribution is generated by a softmax classifier:

p_ij = exp(w_j^T · s_i) / Σ_{k=1}^{|C|} exp(w_k^T · s_i)

where p_ij, the jth component of the distribution p_i of relations described by sentence i, is the probability that sentence i describes relation j, w_j is the weight vector of relation j, and the superscript T denotes transpose.
Step 3: model the noise, generating the transition matrix T_i.

Since the sentence has only one vector representation, the transition matrix T_i is generated with the formula from the method section:

T^i_jk = exp(w_jk^T · s_i + b) / Σ_{k'=1}^{|C|} exp(w_jk'^T · s_i + b)

where T^i_jk is the element in row j, column k of the transition matrix T_i.
Step 4: use the predicted relation distribution p_i and the transition matrix T_i to generate the distribution o_i of relations the sentence may be labeled as:

p'_ij = Σ_{k=1}^{|C|} p_ik · T^i_kj

o_ij = p'_ij / Σ_{k=1}^{|C|} p'_ik

where o_ij is the jth component of o_i; the second formula normalizes p'_i so that o_i satisfies the properties of a probability distribution.
During training, 15 epochs are run on the year-month-day subset, then 15 more after adding the year-month subset, and another 15 after adding the year subset. The regularization coefficients (coefficients of the trace of the transfer matrix) for the year-month-day, year-month, and year subsets are 0.01, −0.01 and −0.1, respectively.
Fig. 2 shows the relation extraction effect of the method; the evaluation metric is the precision-recall curve. Specifically, the extraction results are ranked from high to low by the confidence output by the relation extractor; for each result, precision and recall are computed over it and all results of higher confidence; finally the precision-recall curve is drawn, and the higher the curve, the better the extraction effect. Precision and recall are computed as:

precision = (number of correctly extracted relation instances) / (number of extracted relation instances)

recall = (number of correctly extracted relation instances) / (total number of relation instances in the test set)
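The evaluation procedure just described — rank extractions by confidence, then compute precision and recall at every cutoff — can be sketched as:

```python
def pr_curve(extractions, n_gold):
    """extractions: list of (confidence, is_correct) pairs;
    n_gold: total number of true relation instances in the test set.
    Returns the (precision, recall) point at each confidence cutoff."""
    ranked = sorted(extractions, key=lambda x: -x[0])  # high confidence first
    curve, correct = [], 0
    for k, (_, ok) in enumerate(ranked, start=1):
        correct += ok
        curve.append((correct / k, correct / n_gold))
    return curve
```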
As the figure shows, even without the transition matrix, adding the data subsets to the training set one by one in order of increasing noise ("data sets trained in sequence") outperforms training directly on all data ("mixed data set"), showing that noise in the data set significantly affects the trained relation extraction model. After adding the transfer matrix ("trained in sequence + transfer matrix"), the relation extraction effect improves markedly over the version without it, showing that the method effectively models the noise, shields the basic relation extractor from its influence, and obtains a better relation extraction effect.
In summary, the embodiment builds a reliable extractor for relations whose object is a time, based on Wikidata and Wikipedia. During training, the disclosed method effectively prevents noise in the data from influencing the relation extractor, so a better-performing extractor is trained.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.

Claims (8)

1. A method for constructing a relation extraction model based on a transfer matrix, comprising the steps of:

1) selecting a basic relation extraction model M, whose output is a distribution p_i over the relations described by an input sentence i, and which generates as an intermediate result a vector representation s_i of the input sentence i;

2) constructing a transition matrix T_i from the vector representation s_i of the input sentence i, where the element T^i_jk in row j, column k of the matrix T_i represents the probability that the input sentence i expresses relation j but is incorrectly labeled as k;

3) multiplying the relation distribution p_i output by the basic relation extraction model M by the transition matrix T_i and normalizing, obtaining the distribution o_i of relations the input sentence may be labeled as;

4) training the basic relation extraction model M with the distribution o_i fitting the noisy label as the target, until a preset termination condition is reached, obtaining the relation extraction model;

wherein the basic relation extraction model M is trained as follows: a) if the training data cannot be further divided by noise level, the loss function used in training is

loss = Σ_{i=1}^{N} [ α · loss_p(i) + (1 − α) · loss_o(i) − β · Trace(T_i) ]

where N is the total number of samples of the training data, a sample is a sentence or a group of sentences, loss_p(i) is the error of the relation distribution p_i output by the basic relation extractor M for sample i against the noisy label, loss_o(i) is the error of the distribution o_i of relations sample i may be labeled as against the noisy label, Trace(T_i) is the trace of the transition matrix T_i corresponding to sample i, α is a real number between 0 and 1, and β is a real coefficient; b) if the training data can be further divided into subsets by noise level, the loss function used in training is

loss = Σ_{i=1}^{S} Σ_{j=1}^{N_i} [ loss_o(i, j) − β_i · Trace(T_ij) ]

where S is the total number of subsets, N_i is the number of samples of subset TD_i, loss_o(i, j) is the error of the distribution o_ij of relations the jth sample of subset TD_i may be labeled as against the noisy label, and β_i is the coefficient of the regularization term on the trace of the transition matrix T_ij for subset TD_i.
2. The method of claim 1, wherein, if the basic relation extraction model M generates only one vector representation s_i for an input sentence i, then

T^i_jk = exp(w_jk^T · s_i + b) / Σ_{k'=1}^{|C|} exp(w_jk'^T · s_i + b)

where w_jk is the parameter vector used to compute the element in row j, column k of the transition matrix T_i, w_jk^T is the transpose of w_jk, b is a bias term, w_jk and b are updated by back-propagation during training, and |C| is the size of the set of relations to be extracted.
3. The method of claim 1, wherein, if the basic relation extraction model M generates one vector representation s_il per relation l for an input sentence i, then

T^i_jk = exp(s_ij^T · w_k + b_j) / Σ_{k'=1}^{|C|} exp(s_ij^T · w_k' + b_j)

where s_ij is the vector representation of the input sentence i for relation j, w_k is the weight vector of relation k, s_ij^T is the transpose of s_ij, b_j is the bias term of relation j, w_k and b_j are updated by back-propagation during training, and |C| is the size of the set of relations to be extracted.
4. The method of claim 1, wherein the vector representation s_i may further be processed by several fully connected layers to obtain a new vector representation s'_i of sentence i, and the transition matrix T_i is then constructed from s'_i.
5. A method for constructing a relation extraction model based on a transfer matrix comprises the following steps:
1) selecting a basic relationship extraction model M, which is input as a set of sentences describing the same pair of subject and object, and which is output as a distribution p of the relationships described by the set of sentencesiAnd generating an overall vector representation s of the input set of sentences in the intermediate resulti
2) From the global vector representation s of the set of sentencesiConstructing a transition matrix Ti(ii) a Wherein the content of the first and second substances,
Figure FDA0002568573660000021
is a matrix TiThe jth row and kth column of (a), indicating a probability that the set of sentences expresses a relationship of j, but is incorrectly labeled as k;
3) the relation distribution p output by the basic relation extraction model MiMultiplying by the transition matrix TiAnd normalizing to obtain the distribution o of the possible labeled relations of a group of input sentencesi
4) Distribute o in this relationshipiFitting a noisy label as a target, and training the basic relationship extraction model M until a preset termination condition is reached to obtain a relationship extraction model;
the method for training the basic relation extraction model M is as follows: a) if the training data cannot be further divided according to the noise level, the loss function used in training is

L = Σ_{i=1}^{N} [ (1 − α) · l(o_i, ŷ_i) + α · l(p_i, ŷ_i) − β · Trace(T_i) ]

wherein N is the total number of samples in the training data (a sample may be a single sentence or a set of sentences), l(p_i, ŷ_i) represents the error of fitting the relation distribution p_i output by the basic relation extractor M for sample i to its noisy label ŷ_i, l(o_i, ŷ_i) represents the error of fitting the distribution o_i of relations with which sample i may be labeled to its noisy label, Trace(T_i) represents the trace of the transition matrix T_i corresponding to sample i, α is a real number between 0 and 1, and β is a real coefficient; b) if the training data can be further divided into subsets according to the noise level, the loss function used in training is

L = Σ_{i=1}^{S} Σ_{j=1}^{N_i} [ (1 − α) · l(o_ij, ŷ_ij) + α · l(p_ij, ŷ_ij) − β_i · Trace(T_ij) ]

wherein S is the total number of subsets, N_i is the number of samples in subset TD_i, l(o_ij, ŷ_ij) is the error of fitting the distribution o_ij of relations with which the jth sample of subset TD_i may be labeled to its noisy label, and β_i is the regularization coefficient of the trace of the transition matrix T_ij on subset TD_i.
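As an illustrative sketch (not part of the claims), steps 3) and loss a) of claim 5 can be written in NumPy as follows; cross-entropy is assumed here as the fitting error l(·, ŷ), and all array shapes are hypothetical:

```python
import numpy as np

def observed_distribution(p, T):
    """Step 3 of claim 5: multiply each predicted relation
    distribution p_i by its transition matrix T_i and renormalize,
    giving the distribution o_i over possibly-noisy labels.
    p: (N, |C|) distributions; T: (N, |C|, |C|) transition matrices."""
    o = np.einsum('nc,nck->nk', p, T)        # o_ik = sum_j p_ij * T_i(j,k)
    return o / o.sum(axis=1, keepdims=True)  # renormalize each row

def loss_a(p, T, y, alpha=0.5, beta=0.01):
    """Loss a) of claim 5, with cross-entropy assumed as the fitting
    error l(., y): the negative log-probability of the noisy label y."""
    o = observed_distribution(p, T)
    n = len(y)
    l_p = -np.log(p[np.arange(n), y])        # l(p_i, y_i)
    l_o = -np.log(o[np.arange(n), y])        # l(o_i, y_i)
    trace = np.trace(T, axis1=1, axis2=2)    # Trace(T_i) per sample
    return float(np.sum((1 - alpha) * l_o + alpha * l_p - beta * trace))
```

With a row-stochastic T the renormalization in `observed_distribution` is a no-op, but it keeps the sketch valid for arbitrary positive transition scores.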
6. The method of claim 5, wherein if the basic relation extraction model M generates only one vector representation s_i for the set of sentences, then

T_i(j,k) = exp(w_jk^T · s_i + b) / Σ_{k'=1}^{|C|} exp(w_jk'^T · s_i + b)

wherein w_jk is the parameter vector used to calculate the element T_i(j,k) in the jth row and kth column of the transition matrix T_i, w_jk^T is the transpose of w_jk, b is a bias term, w_jk and b are updated by back-propagation during training, and |C| is the size of the set of relations to be extracted.
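A minimal NumPy sketch of this row-wise softmax construction (not part of the claims), assuming a hypothetical weight tensor W with W[j, k] holding the vector w_jk and a scalar bias b:

```python
import numpy as np

def transition_matrix(s_i, W, b):
    """Build T_i from the sentence-set vector s_i (claim 6 sketch).

    W has shape (|C|, |C|, d): W[j, k] is the weight vector w_jk.
    Each row j of T_i is a softmax over k, so T_i[j, k] is the
    probability that relation j is labeled as relation k."""
    logits = np.einsum('jkd,d->jk', W, s_i) + b   # w_jk^T s_i + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)   # row-wise softmax

# hypothetical sizes: |C| = 4 relations, d = 8 hidden dimensions
rng = np.random.default_rng(0)
T = transition_matrix(rng.normal(size=8), rng.normal(size=(4, 4, 8)), 0.1)
```

Every row of the resulting matrix sums to one, matching its reading as a conditional labeling distribution.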
7. The method of claim 5, wherein if the basic relation extraction model M generates one vector representation s_il for each relation l for the set of sentences, then

T_i(j,k) = exp(w_k^T · s_ij + b_j) / Σ_{k'=1}^{|C|} exp(w_k'^T · s_ij + b_j)

wherein s_ij is the vector representation of the set of sentences for relation j, w_k is the weight vector of relation k, w_k^T is the transpose of w_k, b_j is the bias term of relation j, w_k and b_j are updated by back-propagation during training, and |C| is the size of the set of relations to be extracted.
8. The method of claim 5, wherein the vector representation s_i can be further processed by a plurality of fully connected layers to obtain a new overall vector representation s'_i of the set of sentences, and the transition matrix T_i is then constructed from s'_i.
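A sketch (not part of the claims) of the fully-connected refinement in claims 4 and 8, assuming ReLU activations, which the claims do not fix; the refined vector would then be fed to the transition-matrix construction of claim 6:

```python
import numpy as np

def refine_vector(s_i, layers):
    """Claims 4/8 sketch: pass s_i through fully connected layers,
    given as (weight, bias) pairs, to obtain the refined vector s'_i
    from which the transition matrix T_i is subsequently built.

    ReLU is an assumed choice of activation."""
    h = s_i
    for W, b in layers:
        h = np.maximum(W @ h + b, 0.0)  # affine map + ReLU
    return h
```

For example, two layers mapping an 8-dimensional s_i through a 6-dimensional hidden layer to a 5-dimensional s'_i would be given as `[(W1, b1), (W2, b2)]` with shapes (6, 8) and (5, 6).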
CN201710193366.9A 2017-03-28 2017-03-28 Method for constructing relation extraction model based on transfer matrix Active CN106970981B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710193366.9A CN106970981B (en) 2017-03-28 2017-03-28 Method for constructing relation extraction model based on transfer matrix

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710193366.9A CN106970981B (en) 2017-03-28 2017-03-28 Method for constructing relation extraction model based on transfer matrix

Publications (2)

Publication Number Publication Date
CN106970981A CN106970981A (en) 2017-07-21
CN106970981B true CN106970981B (en) 2021-01-19

Family

ID=59336048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710193366.9A Active CN106970981B (en) 2017-03-28 2017-03-28 Method for constructing relation extraction model based on transfer matrix

Country Status (1)

Country Link
CN (1) CN106970981B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276066B (en) * 2018-03-16 2021-07-27 北京国双科技有限公司 Entity association relation analysis method and related device
CN111914091B (en) * 2019-05-07 2022-10-14 四川大学 Entity and relation combined extraction method based on reinforcement learning
CN110489529B (en) * 2019-08-26 2021-12-14 哈尔滨工业大学(深圳) Dialogue generating method based on syntactic structure and reordering
CN110795527B (en) * 2019-09-03 2022-04-29 腾讯科技(深圳)有限公司 Candidate entity ordering method, training method and related device
CN113672727B (en) * 2021-07-28 2024-04-05 重庆大学 Financial text entity relation extraction method and system
CN116542250B (en) * 2023-06-29 2024-04-19 杭州同花顺数据开发有限公司 Information extraction model acquisition method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011118526A (en) * 2009-12-01 2011-06-16 Hitachi Ltd Device for extraction of word semantic relation
CN103678703A (en) * 2013-12-30 2014-03-26 中国科学院自动化研究所 Method and device for extracting open category named entity by means of random walking on map
CN104035975A (en) * 2014-05-23 2014-09-10 华东师范大学 Method utilizing Chinese online resources for supervising extraction of character relations remotely
CN106354710A (en) * 2016-08-18 2017-01-25 清华大学 Neural network relation extracting method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9858261B2 (en) * 2014-06-23 2018-01-02 International Business Machines Corporation Relation extraction using manifold models

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011118526A (en) * 2009-12-01 2011-06-16 Hitachi Ltd Device for extraction of word semantic relation
CN103678703A (en) * 2013-12-30 2014-03-26 中国科学院自动化研究所 Method and device for extracting open category named entity by means of random walking on map
CN104035975A (en) * 2014-05-23 2014-09-10 华东师范大学 Method utilizing Chinese online resources for supervising extraction of character relations remotely
CN106354710A (en) * 2016-08-18 2017-01-25 清华大学 Neural network relation extracting method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Learning with Noise:Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix;Bingfeng luo et al;《https://www.researchgate.net/publication/318737364_Learning_with_Noise_Enhance_Distantly_Supervised_Relation_Extraction_with_Dynamic_Transition_Matrix》;20170131;1-10 *
Research on Chinese Entity Relation Extraction; Mou Jinjuan et al.; Computer Engineering and Design; 2009-12-31; Vol. 30, No. 15; 3587-3590 *
Research on Kernel-based Chinese Entity Relation Extraction; Huang Ruihong et al.; Journal of Chinese Information Processing; 2008-09-30; Vol. 22, No. 5; 102-108 *

Also Published As

Publication number Publication date
CN106970981A (en) 2017-07-21

Similar Documents

Publication Publication Date Title
CN106970981B (en) Method for constructing relation extraction model based on transfer matrix
CN108984745B (en) Neural network text classification method fusing multiple knowledge maps
CN108363753B (en) Comment text emotion classification model training and emotion classification method, device and equipment
CN107832400B A method for relation classification using a position-based joint LSTM and CNN model
CN106886543B (en) Knowledge graph representation learning method and system combined with entity description
CN104834747B (en) Short text classification method based on convolutional neural networks
WO2022267976A1 (en) Entity alignment method and apparatus for multi-modal knowledge graphs, and storage medium
CN111143576A (en) Event-oriented dynamic knowledge graph construction method and device
CN113239186B (en) Graph convolution network relation extraction method based on multi-dependency relation representation mechanism
CN110334219A (en) The knowledge mapping for incorporating text semantic feature based on attention mechanism indicates learning method
WO2020063092A1 (en) Knowledge graph processing method and apparatus
CN110222178A Text sentiment classification method, apparatus, electronic device and readable storage medium
CN110674850A (en) Image description generation method based on attention mechanism
CN107220220A (en) Electronic equipment and method for text-processing
WO2019196210A1 (en) Data analysis method, computer readable storage medium, terminal device and apparatus
CN103207855A (en) Fine-grained sentiment analysis system and method specific to product comment information
CN111104509B (en) Entity relationship classification method based on probability distribution self-adaption
CN111027309B (en) Entity attribute value extraction method based on two-way long-short-term memory network
CN109214562A An RNN-based method for predicting and pushing power-grid research hotspots
CN112836051B (en) Online self-learning court electronic file text classification method
US20220036003A1 (en) Methods and systems for automated detection of personal information using neural networks
CN114722820A (en) Chinese entity relation extraction method based on gating mechanism and graph attention network
CN111368082A (en) Emotion analysis method for domain adaptive word embedding based on hierarchical network
CN114925205B (en) GCN-GRU text classification method based on contrast learning
CN108920446A A method for processing engineering documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant