CN113901758A - Relation extraction method for knowledge graph automatic construction system - Google Patents

Relation extraction method for knowledge graph automatic construction system

Info

Publication number
CN113901758A
Authority
CN
China
Prior art keywords
matrix
output
graph
text
input
Prior art date
Legal status
Pending
Application number
CN202111133794.5A
Other languages
Chinese (zh)
Inventor
徐小龙
董益豪
朱曼
吴晓诗
胡惠娟
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202111133794.5A priority Critical patent/CN113901758A/en
Publication of CN113901758A publication Critical patent/CN113901758A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

A relation extraction method for an automatic knowledge graph construction system first encodes the text, converts it into word vectors, and preliminarily extracts text features. A syntactic dependency tree is then generated from the syntactic dependency structure of the text, a weighted dependency adjacency matrix is produced by assigning a weight to each relation type, and a graph convolutional neural network extracts the syntactic dependency information in the text. In parallel, a multi-head attention mechanism is applied directly to the encoded text to generate attention matrices, and a graph convolutional neural network of the same structure extracts information beyond the syntactic dependency information of the text. Finally, feature representations of the two entities and the sentence are obtained, all candidate relation categories are scored with a feedforward neural network and a normalized exponential function, and the highest-scoring relation is selected as the relation classification result. The method fully captures information of different dimensions of the text and achieves excellent results on public relation extraction datasets.

Description

Relation extraction method for knowledge graph automatic construction system
Technical Field
The invention belongs to the technical field of natural language processing and artificial intelligence, and particularly relates to a relation extraction method for an automatic knowledge graph construction system.
Background
Relation extraction is a key subtask in the field of natural language processing and an important component of the information extraction task. It aims to extract relational information between entities from unstructured text; combined with a named entity recognition task, it can generate the <subject, predicate (relation), object> triples required for building a knowledge graph system.
Traditional relation extraction methods mainly analyze text with linguistic knowledge, performing text matching and relation extraction through manually designed extraction rules or kernel functions based on statistics and rules. However, owing to the complexity of natural language, relation extraction models based on hand-crafted rules cannot meet performance requirements: they often introduce human noise into the model, their performance is very limited, and they generalize poorly.
With the rapid development of neural networks and deep learning, researchers have begun introducing neural networks into the relation extraction task. By simulating the working principle of cerebral neurons, neural networks and deep learning methods can effectively fit and extract text features, breaking the limitations of manually designed rules. Existing neural-network-based relation extraction models fall mainly into sequence-based models and dependency-based models.
Sequence-based models encode the word sequence of a sentence: a convolutional neural network extracts the positional features of each word relative to the entities, while a recurrent neural network, as a temporal model, is more sensitive to relations between distant entity pairs; combining the two can effectively alleviate the difficulty of capturing relational information between distant words in the text. However, sequence-based models attend only to the word sequence and ignore the overall syntactic structure of the sentence.
In contrast, dependency-based models can efficiently exploit the syntactic structure of sentences and capture implicit long-distance syntactic relationships. Such models generally convert a sentence into a dependency tree according to the dependencies among its words, then convert the tree into a dependency adjacency matrix that participates in neural network training, capturing implicit long-distance syntactic and multi-hop relations through the dependencies between words. Since the dependency structure is usually a graph, graph convolutional neural networks have also been introduced into dependency-based relation extraction models. Current work focuses on pruning the dependency tree more effectively, removing information irrelevant to relation extraction to improve model performance; however, rule-based pruning again introduces human noise, while attention-based soft pruning destroys the original dependency structure and cannot fully exploit the rich information contained in the dependency matrix.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a relation extraction method for an automatic knowledge graph construction system that uses a multi-head attention mechanism and a weighted dependency matrix to acquire key information of different dimensions of the text in parallel, achieving excellent results on public relation extraction datasets.
The invention provides a relation extraction method for an automatic knowledge graph construction system, which comprises the following steps:
step S1, embedding each word in the text with a pre-trained word vector dictionary, converting the part-of-speech tagging information and named entity recognition information of each word into vector representations and splicing them with the word's own vector representation to obtain the vector x_i;
step S2, performing a bidirectional long short-term memory network operation on the vectors x_i and splicing the forward and backward operation results to obtain the vector h'_t;
step S3, constructing a syntactic dependency tree A from the syntactic dependency structure of the text, setting a learnable weight variable D, building a dependency adjacency matrix from A, one-hot encoding the matrix values and multiplying them bitwise with the weight variable D to obtain the weighted dependency matrix A′;
step S4, obtaining feature representation matrices Q and K of the text from the vectors h'_t, using a multi-head attention mechanism to obtain k attention matrices Ā^(1), …, Ā^(k) of the text, and obtaining the matrix A″ after linear dimensionality reduction;
step S5, taking the matrices A′ and A″ as inputs of graph convolution modules with different numbers of graph convolution layers and performing graph convolution operations to obtain the matrices H_A′ and H_A″ respectively, then obtaining the matrix H_output after linear dimensionality reduction;
step S6, obtaining from the matrix H_output the feature representation matrix h_sent of the sentence and the feature representation matrices h_e1 and h_e2 of the two entities, obtaining the relational feature representation matrix h_relation with a feedforward neural network, and finally performing relation prediction through a normalized exponential function to obtain the final classification result.
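For orientation only, the following is a minimal runnable sketch of how steps S1-S6 compose into one forward pass, written in Python/PyTorch. Every dimension, vocabulary size and module name is an illustrative assumption, and a single graph convolution layer stands in for each densely connected module of step S5; this is a sketch of the idea, not the patent's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationExtractionSketch(nn.Module):
    """Steps S1-S6 composed end to end; all sizes are assumptions."""

    def __init__(self, n_words=10000, n_pos=50, n_ner=20, n_dep=40,
                 d_word=300, d_tag=30, d_hid=200, k=3, n_classes=19):
        super().__init__()
        self.k, self.d = k, d_hid
        self.w_emb = nn.Embedding(n_words, d_word)            # S1: word vectors
        self.p_emb = nn.Embedding(n_pos, d_tag)               # S1: POS vectors
        self.n_emb = nn.Embedding(n_ner, d_tag)               # S1: NER vectors
        self.bilstm = nn.LSTM(d_word + 2 * d_tag, d_hid,      # S2: BiLSTM
                              batch_first=True, bidirectional=True)
        self.D = nn.Parameter(torch.ones(n_dep + 1))          # S3: weights d_q
        self.q_proj = nn.Linear(2 * d_hid, k * d_hid)         # S4: Q per head
        self.k_proj = nn.Linear(2 * d_hid, k * d_hid)         # S4: K per head
        self.head_reduce = nn.Linear(k, 1)                    # S4: k maps -> A''
        self.gcn_dep = nn.Linear(2 * d_hid, d_hid)            # S5: branch for A'
        self.gcn_att = nn.Linear(2 * d_hid, d_hid)            # S5: branch for A''
        self.out_reduce = nn.Linear(2 * d_hid, d_hid)         # S5: -> H_output
        self.ffnn = nn.Sequential(nn.Linear(3 * d_hid, d_hid), nn.ReLU(),
                                  nn.Linear(d_hid, n_classes))  # S6

    def forward(self, words, pos, ner, dep, e1_mask, e2_mask):
        # masks are float tensors of shape (B, n) marking the entity spans
        # S1: x_i = [w_i ; w_i^pos ; w_i^ner]
        x = torch.cat([self.w_emb(words), self.p_emb(pos), self.n_emb(ner)], -1)
        h, _ = self.bilstm(x)                      # S2: h'_t, shape (B, n, 2d)
        B, n, _ = h.shape
        # S3: weighted dependency matrix A'; dep[b,i,j] = relation id, 0 = none
        A_dep = F.relu(self.D[dep]) * (dep > 0)
        # S4: k attention matrices, stacked and linearly reduced to A''
        q = self.q_proj(h).view(B, n, self.k, self.d).permute(0, 2, 1, 3)
        kk = self.k_proj(h).view(B, n, self.k, self.d).permute(0, 2, 1, 3)
        att = torch.softmax(q @ kk.transpose(-1, -2) / self.d ** 0.5, dim=-1)
        A_att = self.head_reduce(att.permute(0, 2, 3, 1)).squeeze(-1)
        # S5: one graph convolution per branch, then splice and reduce
        g_dep = F.relu(self.gcn_dep(A_dep @ h))
        g_att = F.relu(self.gcn_att(A_att @ h))
        H = self.out_reduce(torch.cat([g_dep, g_att], -1))    # H_output
        # S6: pool h_sent, h_e1, h_e2, then score every relation class
        pool = lambda m: (H * m.unsqueeze(-1)).sum(1) / m.sum(1, keepdim=True)
        feats = torch.cat([H.mean(1), pool(e1_mask), pool(e2_mask)], -1)
        return torch.softmax(self.ffnn(feats), dim=-1)
```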
As a further technical solution of the present invention, in step S1 the vector is computed as x_i = [w_i ; w_i^pos ; w_i^ner], where w_i is the word vector of the word itself, and w_i^pos and w_i^ner are the word vectors of the word's part-of-speech tagging information and named entity recognition information respectively, joined by a splicing operation.
Further, in step S2 the hidden state vector h_t of the vector x_i in one direction at time t is calculated as follows:
I_t = σ(x_t·W_xi + h_{t-1}·W_hi + b_i)
F_t = σ(x_t·W_xf + h_{t-1}·W_hf + b_f)
O_t = σ(x_t·W_xo + h_{t-1}·W_ho + b_o)
C̃_t = tanh(x_t·W_xc + h_{t-1}·W_hc + b_c)
C_t = F_t ⊙ C_{t-1} + I_t ⊙ C̃_t
h_t = O_t ⊙ tanh(C_t)
where x_t is the input at time t, σ is the sigmoid activation function, tanh is the hyperbolic tangent activation function, W_xi, W_xf, W_xo and W_xc are respectively the weight parameter matrices of x_t at the input gate, forget gate, output gate and memory cell, W_hi, W_hf, W_ho and W_hc are respectively the weight parameter matrices of h_t at the input gate, forget gate, output gate and memory cell, b_i, b_f, b_o and b_c are the bias parameters of the input gate, forget gate, output gate and memory cell, I_t, F_t, O_t, C̃_t and C_t are respectively the outputs of the input gate, forget gate, output gate, candidate memory cell and memory cell at time t, and ⊙ is element-wise matrix multiplication;
the forward output →h_t and the backward output ←h_t are spliced to obtain the output h'_t = [→h_t ; ←h_t].
Furthermore, the weighted dependency matrix A′ in step S3 is calculated as
A′ = φ(onehot(A)·D)
φ(x) = max(x, 0);
where A is the original dependency tree, onehot is the one-hot encoding operation, φ is the ReLU activation function, and max takes the maximum value.
Further, in step S4 the attention matrices are calculated as
Ā^(i) = softmax((Q·W_i^Q)·(K·W_i^K)^T / √d)
where k is the number of attention heads, Q and K are the feature representations of the text obtained through steps S1 and S2, W_i^Q and W_i^K are weight parameter matrices, d is the input dimension, and softmax is the normalized exponential function; the k attention matrices Ā^(1), …, Ā^(k) are spliced and then reduced in dimension through a linear layer to obtain A″:
A″ = W_A·[Ā^(1); …; Ā^(k)] + b_A
where W_A and b_A are the weight parameter matrix and bias parameter of the linear transformation layer.
Further, the result of each graph convolution module in step S5 is calculated as
h_i^(l) = φ(Σ_{j=1}^n A_ij·W^(l)·h_j^(l-1) + b^(l))
output_i = W_o·[input_i ; GCN(input_i) ; … ; GCN(output_{i-1})]
input_i = W_c·[input_{i-1} ; output_0 ; … ; output_N]
H_out = W_f·[output_1 ; … ; output_M]
where A is the input matrix of the module (A′ or A″); in an L-layer graph convolution network whose initial input is the feature representation set h^(0) = {h_1^(0), …, h_n^(0)}, node i at layer l receives the h_j^(l-1) as input and outputs h_i^(l); W^(l) is the graph convolution network weight parameter matrix, b^(l) is the graph convolution network bias parameter, N is the number of graph convolution layers of the previous sub-module, M is the number of sub-modules, and W_o, W_c and W_f are all linear transformation layer weight parameter matrices.
Further, the two graph convolution modules of step S5 generate H_A′ and H_A″ respectively, which are spliced and linearly reduced in dimension to obtain H_output.
Further, in step S6 the final relational feature is calculated as
h_relation = FFNN([h_sent ; h_e1 ; h_e2])
where FFNN denotes a feedforward neural network computation.
The advantages of the present invention are:
1. The invention encodes the text with word vectors pre-trained on a large-scale dictionary and a bidirectional long short-term memory network, obtaining an initial vector representation of the text. This initial representation already contains part of the text feature information and serves as the input of the subsequent neural network model.
2. The invention uses a multi-head attention mechanism to obtain several attention matrices for a text, each attending to a different important part of the text. The multi-head attention mechanism can capture important information beyond the syntactic dependency information of the text, and its features are effectively extracted through a graph convolution network module.
3. The method uses the weighted dependency matrix to capture the syntactic dependency structure of the text: it assigns a learnable weight to each relation type and, through iterative neural network updates, turns the dependency matrix from a 0-1 matrix into a weighted matrix that expresses more accurate syntactic structure information, whose features are then extracted through a graph convolution network module.
4. The multi-head attention mechanism and the weighted dependency matrix extract key information of different dimensions of the text and can be computed in parallel, improving performance while reducing time cost.
Drawings
FIG. 1 is a schematic diagram of the relation extraction model of the present invention;
FIG. 2 is a schematic diagram of the process for constructing the weighted dependency matrix of the present invention.
Detailed Description
Referring to FIG. 1, the embodiment provides a relation extraction method for an automatic knowledge graph construction system, which uses a multi-head attention mechanism and a weighted dependency matrix to obtain key information of different dimensions of the text in parallel. The specific steps are as follows:
Step 1: perform initial word embedding for each word in the original text with a pre-trained word vector dictionary to obtain the vector representation w_i of each word, where i indexes the i-th word in the text. In addition, convert the part-of-speech tagging information and the named entity recognition information of each word into the vector representations w_i^pos and w_i^ner, and splice them with the word's own vector representation to finally obtain x_i = [w_i ; w_i^pos ; w_i^ner] as the final word embedding vector of each word.
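As a concrete illustration of this splicing, here is a small PyTorch sketch; the vocabulary and tag-set sizes, the embedding dimensions and the random stand-in for the pre-trained dictionary are all assumptions:

```python
import torch
import torch.nn as nn

pretrained = torch.randn(10000, 300)   # stand-in for the pre-trained dictionary
w_emb = nn.Embedding.from_pretrained(pretrained, freeze=False)  # word vectors
p_emb = nn.Embedding(50, 30)           # part-of-speech tag vectors (size assumed)
n_emb = nn.Embedding(20, 30)           # named-entity tag vectors (size assumed)

def embed(words, pos, ner):
    # x_i = [w_i ; w_i^pos ; w_i^ner] for every word i
    return torch.cat([w_emb(words), p_emb(pos), n_emb(ner)], dim=-1)

x = embed(torch.tensor([[1, 2, 3]]), torch.tensor([[4, 5, 6]]),
          torch.tensor([[0, 1, 2]]))   # a 3-word toy sentence -> (1, 3, 360)
```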
Step 2: perform a bidirectional long short-term memory network operation on the vector representation x_i of each word obtained in step 1, encoding the sentence in the forward and backward directions to obtain the hidden state vectors H = [h_1, h_2, …, h_n] of the sentence after the bidirectional network, where n is the number of words in the sentence. The hidden state vector h_t in one direction at time t is calculated as follows:
I_t = σ(x_t·W_xi + h_{t-1}·W_hi + b_i) (1)
F_t = σ(x_t·W_xf + h_{t-1}·W_hf + b_f) (2)
O_t = σ(x_t·W_xo + h_{t-1}·W_ho + b_o) (3)
C̃_t = tanh(x_t·W_xc + h_{t-1}·W_hc + b_c) (4)
C_t = F_t ⊙ C_{t-1} + I_t ⊙ C̃_t (5)
→h_t = LSTM(x_t), t ∈ [1, n] (6)
←h_t = LSTM(x_t), t ∈ [n, 1] (7)
h_t = O_t ⊙ tanh(C_t) (8)
where x_t represents the input at time t, σ represents the sigmoid activation function, tanh represents the hyperbolic tangent activation function, W_xi, W_xf, W_xo and W_xc respectively represent the weight parameter matrices of x_t at the input gate, forget gate, output gate and memory cell, W_hi, W_hf, W_ho and W_hc respectively represent the weight parameter matrices of h_t at the input gate, forget gate, output gate and memory cell, b_i, b_f, b_o and b_c respectively represent the bias parameters of the input gate, forget gate, output gate and memory cell, I_t, F_t, O_t, C̃_t and C_t respectively represent the outputs of the input gate, forget gate, output gate, candidate memory cell and memory cell at time t, and ⊙ represents element-wise matrix multiplication; formulas (6) and (7) denote running the recurrence over the sentence in the forward and backward directions. At time t, the final output h'_t is obtained by splicing the forward output →h_t and the backward output ←h_t:
h'_t = [→h_t ; ←h_t] (9)
and step 3: and constructing a syntactic dependency tree according to the syntactic dependency structure of the text, wherein the sentence contains n words, and the dependency tree has n nodes and can be converted into an n multiplied by n dependency adjacency matrix A. If there is a dependency between word a and word b, then Aab1, otherwise Aab0. Setting a learnable weight variable D ═ D1,d2,...,dQ]Where Q is the number of relationship categories contained in the data set, dqThe weight of the relationship class with index q is 1 by default. First, we replace the values outside the main diagonal of the dependency matrix with the index of the correspondence class in the weight variable D. For index g, construct a one-hot vector r of length Nq=[0,...,0,1,0,...0]Wherein r isq[q]The rest value is 0, so that the weight variable D can participate in the calculation of the neural network through matrix bitwise multiplication to realize parameter updating, and a weight scalar obtained through matrix summation keeps the shape of the dependency matrix constant. For the original dependency tree A, the formula for constructing the adjacency matrix A' is as follows:
A′=φ(onehot(A)·D) (10)
φ(x)=max(x,0) (11)
where onehot represents the one-hot operation, #representsthe ReLU activation function, and max represents taking the maximum value. As shown in particular in fig. 2.
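A sketch of formulas (10)-(11) under the assumption that the relation-class index of every edge is stored in an integer matrix; masking entries without an edge to zero is an additional assumption made here for the toy example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

Q, n = 40, 3                              # relation classes, sentence length
D = nn.Parameter(torch.ones(Q + 1))       # one learnable weight d_q per class
dep = torch.zeros(n, n, dtype=torch.long) # toy dependency matrix A
dep[0, 1] = 5                             # an edge labelled with relation class 5

onehot = F.one_hot(dep, num_classes=Q + 1).float()  # onehot(A): (n, n, Q+1)
A_prime = F.relu(onehot @ D) * (dep > 0)  # A' = φ(onehot(A)·D), zero off-edges
```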
Step 4: apply the multi-head attention mechanism directly to the text to obtain k attention matrices Ā^(1), …, Ā^(k) of the text, where k is the number of attention heads and each matrix has the same shape as the dependency matrix A. The calculation formula is:
Ā^(i) = softmax((Q·W_i^Q)·(K·W_i^K)^T / √d) (12)
where Q and K are the feature representations of the text obtained in steps 1 and 2, W_i^Q and W_i^K are weight parameter matrices, d represents the input dimension, and softmax represents the normalized exponential function. After the k matrices Ā^(1), …, Ā^(k) are obtained, they are spliced and reduced in dimension through a linear layer to obtain A″, which serves as the input of the graph convolution module:
A″ = W_A·[Ā^(1); …; Ā^(k)] + b_A (13)
where W_A and b_A are the weight parameter matrix and bias parameter of the linear transformation layer.
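A sketch of formulas (12)-(13), with per-head projections and a linear layer reducing the k stacked matrices to A″; d, k and the toy 3-word input are assumptions:

```python
import torch
import torch.nn as nn

d, k = 400, 3                          # input dimension and number of heads
Wq = nn.ModuleList(nn.Linear(d, d, bias=False) for _ in range(k))
Wk = nn.ModuleList(nn.Linear(d, d, bias=False) for _ in range(k))
reduce_heads = nn.Linear(k, 1)         # plays the role of W_A and b_A

h = torch.randn(1, 3, d)               # stand-in for the h'_t of step 2
heads = [torch.softmax(Wq[i](h) @ Wk[i](h).transpose(1, 2) / d ** 0.5, dim=-1)
         for i in range(k)]            # k attention matrices, each (1, n, n)
A_dprime = reduce_heads(torch.stack(heads, dim=-1)).squeeze(-1)  # A'': (1, n, n)
```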
Step 5: take the A′ and A″ obtained in steps 3 and 4 as the inputs of the two graph convolution modules. Each graph convolution module uses graph convolution networks of different depths as its sub-modules, with dense connections inside each sub-module: the output h^(l) of every graph convolution layer is spliced and used as the input of the next sub-module, and finally the outputs of all sub-modules are densely connected to obtain the module result H_out. In an L-layer graph convolution network, the initial input is the feature representation set h^(0) = {h_1^(0), …, h_n^(0)}; at layer l, node i receives the h_j^(l-1) as input and outputs h_i^(l). The calculation formulas are:
h_i^(l) = φ(Σ_{j=1}^n A_ij·W^(l)·h_j^(l-1) + b^(l)) (14)
output_i = W_o·[input_i ; GCN(input_i) ; … ; GCN(output_{i-1})] (15)
input_i = W_c·[input_{i-1} ; output_0 ; … ; output_N] (16)
H_out = W_f·[output_1 ; … ; output_M] (17)
where A is the input matrix of the module (A′ or A″), W^(l) represents the graph convolution network weight parameter matrix, b^(l) represents the graph convolution network bias parameter, N is the number of graph convolution layers of the previous sub-module, M is the number of sub-modules, and W_o, W_c and W_f are all linear transformation layer weight parameter matrices. In formula (17), a dropout operation is applied to each computed input_i (i ≥ 1), randomly discarding neurons. The two graph convolution modules of FIG. 1 thus generate H_A′ and H_A″ respectively; after splicing and linear dimensionality reduction these yield H_output, which serves as the input of the relation classification layer.
Step 6: from the H_output of step 5, obtain the feature representation h_sent of the sentence and the feature representations h_e1 and h_e2 of the two entities respectively; a feedforward neural network then produces the final relational feature representation h_relation, and relation prediction is finally performed with the normalized exponential function. The calculation formula is:
h_relation = FFNN([h_sent ; h_e1 ; h_e2]) (18)
where FFNN represents a feedforward neural network computation.
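A sketch of the classification layer of formula (18); how h_sent, h_e1 and h_e2 are pooled out of H_output is not fixed above, so mean pooling over span masks is an assumption:

```python
import torch
import torch.nn as nn

d, n_classes = 400, 19                      # dimensions assumed
ffnn = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, n_classes))

def classify(H, e1_mask, e2_mask):          # masks: (B, n) floats marking spans
    pool = lambda m: (H * m.unsqueeze(-1)).sum(1) / m.sum(1, keepdim=True)
    h_sent = H.mean(dim=1)                  # sentence representation h_sent
    h_rel = ffnn(torch.cat([h_sent, pool(e1_mask), pool(e2_mask)], dim=-1))
    return torch.softmax(h_rel, dim=-1)     # score of every relation class

H = torch.randn(1, 3, d)                    # stand-in for H_output
probs = classify(H, torch.tensor([[0., 1., 0.]]), torch.tensor([[0., 0., 1.]]))
pred = probs.argmax(dim=-1)                 # highest-scoring relation wins
```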
The foregoing illustrates and describes the principles, general features and advantages of the present invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which merely illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes and modifications fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (8)

1. A relation extraction method for an automatic knowledge graph construction system, characterized by comprising the following steps:
step S1, embedding each word in the text with a pre-trained word vector dictionary, converting the part-of-speech tagging information and named entity recognition information of each word into vector representations and splicing them with the word's own vector representation to obtain the vector x_i;
step S2, performing a bidirectional long short-term memory network operation on the vectors x_i and splicing the forward and backward operation results to obtain the vector h'_t;
step S3, constructing a syntactic dependency tree A from the syntactic dependency structure of the text, setting a learnable weight variable D, building a dependency adjacency matrix from A, one-hot encoding the matrix values and multiplying them bitwise with the weight variable D to obtain the weighted dependency matrix A′;
step S4, obtaining feature representation matrices Q and K of the text from the vectors h'_t, using a multi-head attention mechanism to obtain k attention matrices Ā^(1), …, Ā^(k) of the text, and obtaining the matrix A″ after linear dimensionality reduction;
step S5, taking the matrices A′ and A″ as inputs of graph convolution modules with different numbers of graph convolution layers and performing graph convolution operations to obtain the matrices H_A′ and H_A″ respectively, then obtaining the matrix H_output after linear dimensionality reduction;
step S6, obtaining from the matrix H_output the feature representation matrix h_sent of the sentence and the feature representation matrices h_e1 and h_e2 of the two entities, obtaining the relational feature representation matrix h_relation with a feedforward neural network, and finally performing relation prediction through a normalized exponential function to obtain the final classification result.
2. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that in step S1 the vector is computed as x_i = [w_i ; w_i^pos ; w_i^ner], where w_i is the word vector of the word itself, and w_i^pos and w_i^ner are the word vectors of the word's part-of-speech tagging information and named entity recognition information respectively, joined by a splicing operation.
3. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that in step S2 the hidden state vector h_t of the vector x_i in one direction at time t is calculated as follows:
I_t = σ(x_t·W_xi + h_{t-1}·W_hi + b_i)
F_t = σ(x_t·W_xf + h_{t-1}·W_hf + b_f)
O_t = σ(x_t·W_xo + h_{t-1}·W_ho + b_o)
C̃_t = tanh(x_t·W_xc + h_{t-1}·W_hc + b_c)
C_t = F_t ⊙ C_{t-1} + I_t ⊙ C̃_t
h_t = O_t ⊙ tanh(C_t)
where x_t is the input at time t, σ is the sigmoid activation function, tanh is the hyperbolic tangent activation function, W_xi, W_xf, W_xo and W_xc are respectively the weight parameter matrices of x_t at the input gate, forget gate, output gate and memory cell, W_hi, W_hf, W_ho and W_hc are respectively the weight parameter matrices of h_t at the input gate, forget gate, output gate and memory cell, b_i, b_f, b_o and b_c are the bias parameters of the input gate, forget gate, output gate and memory cell, I_t, F_t, O_t, C̃_t and C_t are respectively the outputs of the input gate, forget gate, output gate, candidate memory cell and memory cell at time t, and ⊙ is element-wise matrix multiplication;
the forward output →h_t and the backward output ←h_t are spliced to obtain the output h'_t = [→h_t ; ←h_t].
4. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that the weighted dependency matrix A′ in step S3 is calculated as
A′ = φ(onehot(A)·D)
φ(x) = max(x, 0);
where A is the original dependency tree, onehot is the one-hot encoding operation, φ is the ReLU activation function, and max takes the maximum value.
5. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that the attention matrices in step S4 are calculated as
Ā^(i) = softmax((Q·W_i^Q)·(K·W_i^K)^T / √d)
where k is the number of attention heads, Q and K are the feature representations of the text obtained through steps S1 and S2, W_i^Q and W_i^K are weight parameter matrices, d is the input dimension, and softmax is the normalized exponential function; the k attention matrices Ā^(1), …, Ā^(k) are spliced and then reduced in dimension through a linear layer to obtain A″:
A″ = W_A·[Ā^(1); …; Ā^(k)] + b_A
where W_A and b_A are the weight parameter matrix and bias parameter of the linear transformation layer.
6. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that the result of each graph convolution module in step S5 is calculated as
h_i^(l) = φ(Σ_{j=1}^n A_ij·W^(l)·h_j^(l-1) + b^(l))
output_i = W_o·[input_i ; GCN(input_i) ; … ; GCN(output_{i-1})]
input_i = W_c·[input_{i-1} ; output_0 ; … ; output_N]
H_out = W_f·[output_1 ; … ; output_M]
where A is the input matrix of the module (A′ or A″); in an L-layer graph convolution network whose initial input is the feature representation set h^(0) = {h_1^(0), …, h_n^(0)}, node i at layer l receives the h_j^(l-1) as input and outputs h_i^(l); W^(l) is the graph convolution network weight parameter matrix, b^(l) is the graph convolution network bias parameter, N is the number of graph convolution layers of the previous sub-module, M is the number of sub-modules, and W_o, W_c and W_f are all linear transformation layer weight parameter matrices.
7. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that the two graph convolution modules of step S5 generate H_A′ and H_A″ respectively, which are spliced and linearly reduced in dimension to obtain H_output.
8. The relation extraction method for a knowledge graph automatic construction system according to claim 1, characterized in that in step S6 the final relational feature is calculated as
h_relation = FFNN([h_sent ; h_e1 ; h_e2])
where FFNN represents a feedforward neural network computation.
CN202111133794.5A 2021-09-27 2021-09-27 Relation extraction method for knowledge graph automatic construction system Pending CN113901758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111133794.5A CN113901758A (en) 2021-09-27 2021-09-27 Relation extraction method for knowledge graph automatic construction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111133794.5A CN113901758A (en) 2021-09-27 2021-09-27 Relation extraction method for knowledge graph automatic construction system

Publications (1)

Publication Number Publication Date
CN113901758A (en) 2022-01-07

Family

ID=79029679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111133794.5A Pending CN113901758A (en) 2021-09-27 2021-09-27 Relation extraction method for knowledge graph automatic construction system

Country Status (1)

Country Link
CN (1) CN113901758A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114547298A (en) * 2022-02-14 2022-05-27 大连理工大学 Biomedical relation extraction method, device and medium based on combination of multi-head attention and graph convolution network and R-Drop mechanism
CN115774993A (en) * 2022-12-29 2023-03-10 广东南方网络信息科技有限公司 Conditional error identification method and device based on syntactic analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177394A (en) * 2020-01-03 2020-05-19 浙江大学 Knowledge map relation data classification method based on syntactic attention neural network
CN112163425A (en) * 2020-09-25 2021-01-01 大连民族大学 Text entity relation extraction method based on multi-feature information enhancement
CN112163426A (en) * 2020-09-30 2021-01-01 中国矿业大学 Relationship extraction method based on combination of attention mechanism and graph long-time memory neural network
CN113239186A (en) * 2021-02-26 2021-08-10 中国科学院电子学研究所苏州研究院 Graph convolution network relation extraction method based on multi-dependency relation representation mechanism
WO2021174774A1 (en) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relationship extraction method, computer device, and readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177394A (en) * 2020-01-03 2020-05-19 浙江大学 Knowledge map relation data classification method based on syntactic attention neural network
WO2021174774A1 (en) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relationship extraction method, computer device, and readable storage medium
CN112163425A (en) * 2020-09-25 2021-01-01 大连民族大学 Text entity relation extraction method based on multi-feature information enhancement
CN112163426A (en) * 2020-09-30 2021-01-01 中国矿业大学 Relationship extraction method based on combination of attention mechanism and graph long-time memory neural network
CN113239186A (en) * 2021-02-26 2021-08-10 中国科学院电子学研究所苏州研究院 Graph convolution network relation extraction method based on multi-dependency relation representation mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YIHAO DONG et al.: "Weighted-Dependency with Attention-Based Graph Convolutional Network for Relation Extraction", Neural Processing Letters, 9 September 2023 (2023-09-09) *
ZHIXIN LI et al.: "Improve relation extraction with dual attention-guided graph convolutional networks", Neural Computing and Applications, 18 June 2020 (2020-06-18) *
LIU Feng et al.: "Entity Relation Classification Based on Multi-head Attention and Bi-LSTM" (基于Multi-head Attention和Bi-LSTM的实体关系分类), Computer Systems & Applications, no. 06, 15 June 2019 (2019-06-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114547298A (en) * 2022-02-14 2022-05-27 大连理工大学 Biomedical relation extraction method, device and medium based on combination of multi-head attention and graph convolution network and R-Drop mechanism
CN114547298B (en) * 2022-02-14 2024-10-15 大连理工大学 Biomedical relation extraction method, device and medium based on combination of multi-head attention and graph convolution network and R-Drop mechanism
CN115774993A (en) * 2022-12-29 2023-03-10 广东南方网络信息科技有限公司 Conditional error identification method and device based on syntactic analysis
CN115774993B (en) * 2022-12-29 2023-09-08 广东南方网络信息科技有限公司 Condition type error identification method and device based on syntactic analysis

Similar Documents

Publication Publication Date Title
CN112163426B (en) Relationship extraction method based on combination of attention mechanism and graph long-time memory neural network
CN109284506B (en) User comment emotion analysis system and method based on attention convolution neural network
CN111985245B (en) Relationship extraction method and system based on attention cycle gating graph convolution network
US20220147836A1 (en) Method and device for text-enhanced knowledge graph joint representation learning
US6601049B1 (en) Self-adjusting multi-layer neural network architectures and methods therefor
CN109947912A (en) A kind of model method based on paragraph internal reasoning and combined problem answer matches
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN108229582A (en) Entity recognition dual training method is named in a kind of multitask towards medical domain
CN111460132B (en) Generation type conference abstract method based on graph convolution neural network
CN113239186A (en) Graph convolution network relation extraction method based on multi-dependency relation representation mechanism
CN112232087B (en) Specific aspect emotion analysis method of multi-granularity attention model based on Transformer
CN108170848B (en) Chinese mobile intelligent customer service-oriented conversation scene classification method
CN112784532B (en) Multi-head attention memory system for short text sentiment classification
CN111274375A (en) Multi-turn dialogue method and system based on bidirectional GRU network
CN112115687A (en) Problem generation method combining triples and entity types in knowledge base
Tjandra et al. Gated recurrent neural tensor network
CN113901758A (en) Relation extraction method for knowledge graph automatic construction system
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN112860904B (en) External knowledge-integrated biomedical relation extraction method
CN117131933A (en) Multi-mode knowledge graph establishing method and application
CN115374270A (en) Legal text abstract generation method based on graph neural network
CN116403231A (en) Multi-hop reading understanding method and system based on double-view contrast learning and graph pruning
CN115496072A (en) Relation extraction method based on comparison learning
CN116361438A (en) Question-answering method and system based on text-knowledge expansion graph collaborative reasoning network
CN113887836B (en) Descriptive event prediction method integrating event environment information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination