CN112364654A - Education-field-oriented entity and relation combined extraction method - Google Patents
- Publication number
- CN112364654A (application CN202011252896.4A)
- Authority
- CN
- China
- Prior art keywords
- label
- entity
- relation
- attention
- knowledge
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
Abstract
The invention discloses an entity and relation joint extraction method oriented to the education field, addressing the lack of application of existing methods in this field. The invention uses a pre-trained XLNet language model to obtain high-level feature embeddings, captures the contextual semantics of the text through a Mogrifier BiGRU neural network, and introduces a MultiHead Attention mechanism after the Mogrifier BiGRU to capture the more important parts of the text features, thereby effectively alleviating the interference caused by the large number of modifiers inside entities. The invention extracts entities and relations simultaneously in a joint manner, realizing the dependence between the entity and relation subtasks through a parameter-sharing encoding layer and thereby mitigating the error-propagation problem.
Description
Technical Field
The invention relates to an entity and relation joint extraction method for the education field, and belongs to the field of natural language processing.
Background
With the rapid development of online learning in the education field, the data volume of online courses grows exponentially, and how to efficiently and accurately extract useful entity and relation information from these data has become a research hotspot. Text mining and Natural Language Processing (NLP) have made great progress over the past decades, but information extraction in the education field still leaves considerable room for improvement. A representative information extraction task in online education is extracting specific types of course knowledge-point entities, and the relations between them, from the text of online courses. The extracted information serves various kinds of research: it is applicable to many NLP tasks (such as document classification and question-answering systems) and also plays an important role in personalized recommendation for online learning. As entity recognition and relation extraction are widely applied in knowledge discovery and data-mining analysis, the need for this technology will continue to grow.
Entity recognition and relation extraction methods mainly fall into dictionary-based, rule-based, machine-learning-based, and deep-learning-based approaches. In dictionary-based approaches, terms in a dictionary are simply matched against words in the target sequence to extract entities; while simple, the continually growing number of entities and the diversity of symbols in online course text make such extraction difficult. Rule-based approaches tend to perform well only when applied to one particular domain. Machine-learning-based approaches perform entity extraction with various algorithms and statistical models. However, both rule-based and machine-learning methods depend heavily on feature engineering, which is not only labor- and time-consuming but also requires a great deal of domain knowledge. Unlike these methods, deep learning does not require laborious manual feature design: a neural network automatically extracts the most representative features and achieves very good results.
In existing research on named entity recognition and relation extraction, most scholars divide the process into two independent tasks and solve them in a pipeline fashion, treating extraction as two subtasks executed in succession: Named Entity Recognition (NER) and Relation Extraction (RE). Specifically, the named entities in a sentence are extracted first, the extracted entities are then paired in pairwise combinations, and finally the semantic relation between each entity pair is identified. However, this type of approach has two major disadvantages. The first is error propagation: errors made by the named entity recognition module are passed on to the downstream relation extraction module and degrade its performance. The second is that the dependency between the two subtasks is ignored.
Disclosure of Invention
The purpose of the invention is as follows: to overcome the defects in the prior art, the invention provides an entity and relation joint extraction method oriented to the education field, which addresses the lack of application of existing methods in this field. High-level feature embeddings are obtained with a pre-trained XLNet language model and an attention mechanism, and entity recognition and relation classification are handled simultaneously by a joint model to mitigate error propagation.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:
an entity and relation combined extraction method oriented to the education field comprises the following steps:
(1) establishing a course knowledge point named entity corpus, wherein the course knowledge point named entity corpus consists of text data containing course knowledge points;
(2) carrying out distributed representation of the preprocessed text data containing the course knowledge points, taking sentences as input, and obtaining a text pre-training vector through an XLNet language model (a permutation language model);
(3) inputting the obtained text pre-training vector into a Mogrifier BiGRU neural network (a bidirectional gated recurrent unit network with Mogrifier-style mutual gating of input and hidden state) for text feature extraction;
(4) introducing a MultiHead Attention mechanism (a multi-head attention mechanism) after the Mogrifier BiGRU neural network to capture the more important parts of the text features; the important parts are those parts of the text features that can form knowledge entities;
(5) obtaining the course knowledge-point named entities and the relations between knowledge entities by combining a CRF (conditional random field) model.
Specifically, in step (1), a BIO labeling method (Begin/Inside/Outside sequence labeling) is first adopted to label the knowledge entities in the text data of the course knowledge-point named entity corpus: the text data is divided into P categories, each category being a label, with the p-th category represented as label p, p = 1, 2, …, P; the relations between knowledge entities are divided into Q relations, with the q-th relation represented as relation q, q = 1, 2, …, Q; the text data is divided into a training set and a test set. In the BIO labeling method, B marks the beginning of a knowledge entity, I marks the remaining parts of a knowledge entity, and O marks non-entity tokens.
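The BIO scheme described above can be sketched in a few lines. The token sequence, the entity span, and the label name `KNOW` below are illustrative assumptions, not taken from the patent:

```python
def bio_tags(tokens, spans):
    """Assign B/I/O tags given entity spans as (start, end_exclusive, label)."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B-" + label          # B marks the entity's first token
        for i in range(start + 1, end):     # I marks the remaining entity tokens
            tags[i] = "I-" + label
    return tags

tokens = ["back", "propagation", "algorithm", "trains", "networks"]
# one knowledge-point entity covering tokens 0..2
print(bio_tags(tokens, [(0, 3, "KNOW")]))
# ['B-KNOW', 'I-KNOW', 'I-KNOW', 'O', 'O']
```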
Specifically, in step (2), the sentence input to the XLNet language model is denoted S = [s1, s2, …, sN], and the text pre-training vector output by the XLNet language model is denoted X = [x1, x2, …, xN], where si is the i-th word of sentence S, xi is the pre-training vector of word si, and i = 1, 2, …, N.
Specifically, the Mogrifier BiGRU neural network differs from a conventional GRU network (gated recurrent unit network) in that a pre-interaction step strengthens the context-modeling capability of the whole model. The Mogrifier BiGRU comprises a forward GRU network and a backward GRU network; the input and hidden-layer output of the Mogrifier BiGRU are X = [x1, x2, …, xN] and H = [h1, h2, …, hN] respectively, those of the forward GRU are x_t^→ and h_t^→, and those of the backward GRU are x_t^← and h_t^←.
The superscripts t and t−1 denote time steps t and t−1. Before each GRU update, the input x_t and the previous hidden state interact bidirectionally over multiple rounds.
For the forward GRU network, the interaction (following the Mogrifier scheme) is:

x_t ← 2σ(R1 h_{t−1}^→) ⊙ x_t,  h_{t−1}^→ ← 2σ(R2 x_t) ⊙ h_{t−1}^→

For the backward GRU network, the interaction is symmetric:

x_t ← 2σ(R3 h_{t+1}^←) ⊙ x_t,  h_{t+1}^← ← 2σ(R4 x_t) ⊙ h_{t+1}^←

Wherein: σ is the logistic (sigmoid) function, ⊙ is element-wise multiplication, and R1, R2, R3, R4 are model parameters; to reduce the number of parameters, R1, R2, R3, R4 may each be designed as a product of low-rank matrices.
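A minimal numerical sketch of this mutual gating between input and hidden state, assuming the standard Mogrifier update 2σ(R·h) ⊙ x; the round count, dimensions, and random parameters below are illustrative, not taken from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h, R):
    """Alternately rescale x by h and h by x, one matrix per round."""
    for t, Rt in enumerate(R):
        if t % 2 == 0:
            x = 2.0 * sigmoid(Rt @ h) * x   # gate the input with the hidden state
        else:
            h = 2.0 * sigmoid(Rt @ x) * h   # gate the hidden state with the input
    return x, h

rng = np.random.default_rng(0)
d = 8
R = [0.1 * rng.normal(size=(d, d)) for _ in range(4)]   # stands in for R1..R4
x, h = rng.normal(size=d), rng.normal(size=d)
x2, h2 = mogrify(x, h, R)
print(x2.shape, h2.shape)   # the gated vectors keep their dimensions: (8,) (8,)
```

The gated x2 and h2 then enter the ordinary GRU cell in place of the raw input and previous hidden state.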
Specifically, in step (4), a MultiHead Attention mechanism is introduced after the Mogrifier BiGRU neural network to further capture the context semantics of each word si and highlight the importance of keywords in the sentence S by assigning attention weights; the MultiHead Attention mechanism serves as the attention layer. It differs from the conventionally used attention mechanism in that it generates several different attention scores in parallel and finally concatenates them into the final attention score, so that the important parts of the text features are captured better.
Specifically, the computation of the MultiHead Attention mechanism includes the following steps:
(41) map the output H = [h1, h2, …, hN] of the Mogrifier BiGRU neural network into the three vectors K, Q, V;
(42) compute the j-th attention head of the MultiHead Attention mechanism: head_j = Attention(Q W_j^Q, K W_j^K, V W_j^V), where Attention(Q, K, V) = softmax(Q K^T / √d_k) V; W_j^Q, W_j^K, W_j^V are three global parameter matrices, d_N denotes the input dimension of the MultiHead Attention mechanism, D denotes the total number of attention heads, and d_k = d_q = d_v = d_N / D;
(44) concatenate the D attention heads to obtain the multi-head attention B = Concat(head_1, …, head_D) W^o, where W^o is a weight matrix and the element b_ij in row i, column j of B denotes the weight of word s_i on the j-th attention head;
(45) combine the hidden state h_i of word s_i with the attention weights b_ij to generate the content vector c_i of word s_i;
(46) the important part of the text features captured by the MultiHead Attention mechanism introduced after the Mogrifier BiGRU neural network is C = [c1, c2, …, cN].
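Steps (41)–(46) can be sketched as follows; this is a simplified self-attention reading in which K, Q, V are all derived from H, and the dimensions and random weights are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(H, Wq, Wk, Wv, Wo, D):
    """H: (N, dN); D heads of size dk = dN // D; returns (N, dN)."""
    N, dN = H.shape
    dk = dN // D
    Q, K, V = H @ Wq, H @ Wk, H @ Wv          # step (41): map H into Q, K, V
    heads = []
    for j in range(D):
        q, k, v = (M[:, j * dk:(j + 1) * dk] for M in (Q, K, V))
        attn = softmax(q @ k.T / np.sqrt(dk))  # step (42): scaled dot-product
        heads.append(attn @ v)
    return np.concatenate(heads, axis=-1) @ Wo  # step (44): concat, then project

rng = np.random.default_rng(1)
N, dN, D = 5, 16, 4
Wq, Wk, Wv, Wo = (0.1 * rng.normal(size=(dN, dN)) for _ in range(4))
C = multi_head_attention(rng.normal(size=(N, dN)), Wq, Wk, Wv, Wo, D)
print(C.shape)   # (5, 16): one content vector c_i per word
```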
Specifically, in step (5), the CRF model serves as the label score layer: the CRF model is first used to calculate the label score of each word under each label, the Viterbi algorithm is then applied to find the label sequence with the highest total score, and the relations between course knowledge-point named entities and knowledge entities are finally obtained through the relation extraction layer.
More specifically, in step (5), the CRF model computes the label score of word s_i over the P labels as

e_i = V^(ner) f(U^(ner) h_i + b^(ner))

Wherein: the superscript (ner) denotes knowledge-entity label recognition; V^(ner) and U^(ner) are weight matrices and b^(ner) is a bias vector, with V^(ner) ∈ R^(P×l), U^(ner) ∈ R^(l×2d), b^(ner) ∈ R^l; l is the layer width of the CRF model and d is the number of hidden units of the Mogrifier BiGRU neural network; f(·) is a nonlinear activation function.
Labels are assigned to all words in sentence S, yielding a label sequence for S; each sentence S has R = P^N possible label sequences, and the label score of S under the r-th label sequence is

score(S, Y_r) = Σ_{i=1}^{N} ( e_{i,r} + A_{(i,r),(i+1,r)} )

Wherein: Y_r denotes the r-th label sequence, Y = [Y_1, Y_2, …, Y_R], r = 1, 2, …, R; e_{i,r} denotes the label score of word s_i under the label assigned to it by the r-th label sequence; score(S, Y_r) denotes the label score of sentence S under the r-th label sequence; A_{(i,r),(i+1,r)} denotes the score of transitioning from the label assigned to s_i to the label assigned to s_{i+1} under the r-th label sequence, with A the transition matrix, A ∈ R^((P+2)×(P+2)); because start and end labels are added when building the layer, the dimension of the transition matrix is 2 more than P.
The label scores of the label sequences are normalized to obtain the probability of each label sequence:

p(Y_r | S) = exp(score(S, Y_r)) / Σ_{r'=1}^{R} exp(score(S, Y_{r'}))
The label score layer is trained using a method that minimizes cross-entropy loss.
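Finding the highest-scoring label sequence among the exponentially many candidates is done with dynamic programming rather than enumeration. A minimal Viterbi sketch over per-word emission scores e and transition matrix A (the start/end labels mentioned above are omitted for brevity, and the toy scores are made up):

```python
import numpy as np

def viterbi(e, A):
    """e: (N, P) per-word label scores; A: (P, P) transition scores.
    Returns the label index sequence with the highest total score."""
    N, P = e.shape
    score = e[0].copy()
    back = np.zeros((N, P), dtype=int)
    for i in range(1, N):
        cand = score[:, None] + A + e[i][None, :]   # all prev-label -> label paths
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for i in range(N - 1, 0, -1):                   # backtrack the best path
        path.append(int(back[i, path[-1]]))
    return path[::-1]

e = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # favors labels 0, 1, 1
A = np.zeros((2, 2))                                 # neutral transitions
print(viterbi(e, A))   # [0, 1, 1]
```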
More specifically, in step (5), when the relation extraction layer extracts the relations between course knowledge-point named entities and knowledge entities, the relation score between word s_i and word s_j under a given relation q is first calculated:

S^(re)(m_j, m_i, q) = V^(re) f(U^(re) m_j + W^(re) m_i + b^(re))

Wherein: m_i = [c_i; g_i] and m_j = [c_j; g_j]; the superscript (re) denotes relation recognition; V^(re), U^(re), and W^(re) are weight matrices and b^(re) is a bias vector, with V^(re) ∈ R^l, U^(re) ∈ R^(l×(2a+d)), W^(re) ∈ R^(l×(2a+d)), b^(re) ∈ R^l; l is the layer width of the relation extraction layer, d is the number of hidden units of the Mogrifier BiGRU neural network, and a is the dimension of the label; f(·) is a nonlinear activation function.
The probability that word s_i and word s_j hold relation q is obtained by applying a sigmoid to the relation score:

p(q | s_j, s_i) = σ(S^(re)(m_j, m_i, q))
the relationship extraction layer is trained using a method of minimizing cross-entropy loss.
Specifically, the cross-entropy loss L_RE in training the relation extraction layer is calculated over all candidate word pairs and relations:

L_RE = − Σ_{i,j,q} [ y_{i,j,q} log p(q | s_j, s_i) + (1 − y_{i,j,q}) log(1 − p(q | s_j, s_i)) ]

wherein y_{i,j,q} is 1 when words s_i and s_j hold relation q in the annotation and 0 otherwise. The objective function is min(L_NER + L_RE), wherein L_NER is the cross-entropy loss in training the label score layer.
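A small sketch of the relation scoring and its sigmoid probability; tanh stands in for the unspecified activation f, and the dimensions and random weights are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relation_prob(m_j, m_i, V, U, W, b):
    """p(q | s_j, s_i) = sigmoid( V . f(U m_j + W m_i + b) ), with f = tanh."""
    return sigmoid(V @ np.tanh(U @ m_j + W @ m_i + b))

rng = np.random.default_rng(2)
l, dim = 6, 10                      # layer width l; dim stands in for 2a + d
V = rng.normal(size=l)              # one score row for a single relation q
U, W = rng.normal(size=(l, dim)), rng.normal(size=(l, dim))
b = rng.normal(size=l)
m_i, m_j = rng.normal(size=dim), rng.normal(size=dim)
p = relation_prob(m_j, m_i, V, U, W, b)
print(0.0 < p < 1.0)   # True: a valid probability for the candidate relation

# joint training then minimizes L_NER + L_RE over the shared encoder parameters
```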
Advantageous effects: compared with the prior art, the entity and relation joint extraction method for the education field has the following advantages. 1. The invention designs high-level feature embeddings with a pre-trained XLNet language model: instead of directly using fixed word vectors, the same word is dynamically embedded according to its context, which greatly improves the accuracy of converting text into low-dimensional embedding vectors at the word embedding layer, reduces the negative influence of polysemous words on model performance, and effectively captures both local and global information of a word. 2. The method introduces a MultiHead Attention mechanism after the Mogrifier BiGRU neural network to capture the more important parts of the text features, effectively alleviating the interference caused by the large number of modifiers inside entities. 3. The invention extracts entities and relations simultaneously in a joint manner, realizing the dependence between the entity and relation subtasks through a parameter-sharing encoding layer and thereby mitigating the error-propagation problem.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings.
Fig. 1 shows a combined extraction method of entities and relations in the education domain, which comprises the following steps:
Step one: establish a course knowledge-point named entity corpus, the corpus being composed of text data containing course knowledge points.
First, a BIO labeling method is adopted to label the knowledge entities in the text data of the course knowledge-point named entity corpus: the text data is divided into P categories, each category being a label, with the p-th category represented as label p, p = 1, 2, …, P; the relations between knowledge entities are divided into Q relations, with the q-th relation represented as relation q, q = 1, 2, …, Q; the text data is divided into a training set and a test set. In the BIO labeling method, B marks the beginning of a knowledge entity, I marks the remaining parts of a knowledge entity, and O marks non-entity tokens.
Step two: perform distributed representation of the preprocessed text data containing the course knowledge points, take sentences as input, and obtain a text pre-training vector through the XLNet language model.
A sentence input to the XLNet language model is denoted S = [s1, s2, …, sN], and the text pre-training vector output by the XLNet language model is denoted X = [x1, x2, …, xN], where si is the i-th word of sentence S, xi is the pre-training vector of word si, and i = 1, 2, …, N.
Step three: input the obtained text pre-training vector into the Mogrifier BiGRU neural network for text feature extraction.
The Mogrifier BiGRU neural network comprises a forward GRU network and a backward GRU network; the input and hidden-layer output of the Mogrifier BiGRU are X = [x1, x2, …, xN] and H = [h1, h2, …, hN] respectively, those of the forward GRU are x_t^→ and h_t^→, and those of the backward GRU are x_t^← and h_t^←.
The superscripts t and t−1 denote time steps t and t−1. Before each GRU update, the input x_t and the previous hidden state interact bidirectionally over multiple rounds.
For the forward GRU network, the interaction (following the Mogrifier scheme) is:

x_t ← 2σ(R1 h_{t−1}^→) ⊙ x_t,  h_{t−1}^→ ← 2σ(R2 x_t) ⊙ h_{t−1}^→

For the backward GRU network, the interaction is symmetric:

x_t ← 2σ(R3 h_{t+1}^←) ⊙ x_t,  h_{t+1}^← ← 2σ(R4 x_t) ⊙ h_{t+1}^←

Wherein: σ is the logistic (sigmoid) function, ⊙ is element-wise multiplication, and R1, R2, R3, R4 are model parameters.
Step four: introduce a MultiHead Attention mechanism after the Mogrifier BiGRU neural network to capture the more important parts of the text features; the important parts are those parts of the text features that can form knowledge entities.
A MultiHead Attention mechanism is introduced after the Mogrifier BiGRU neural network to further capture the context semantics of each word si and highlight the importance of keywords in the sentence S by assigning attention weights; the MultiHead Attention mechanism serves as the attention layer. Its computation includes the following steps:
(41) map the output H = [h1, h2, …, hN] of the Mogrifier BiGRU neural network into the three vectors K, Q, V;
(42) compute the j-th attention head of the MultiHead Attention mechanism: head_j = Attention(Q W_j^Q, K W_j^K, V W_j^V), where Attention(Q, K, V) = softmax(Q K^T / √d_k) V; W_j^Q, W_j^K, W_j^V are three global parameter matrices, d_N denotes the input dimension of the MultiHead Attention mechanism, D denotes the total number of attention heads, and d_k = d_q = d_v = d_N / D;
(44) concatenate the D attention heads to obtain the multi-head attention B = Concat(head_1, …, head_D) W^o, where W^o is a weight matrix and the element b_ij in row i, column j of B denotes the weight of word s_i on the j-th attention head;
(45) combine the hidden state h_i of word s_i with the attention weights b_ij to generate the content vector c_i of word s_i;
(46) the important part of the text features captured by the MultiHead Attention mechanism introduced after the Mogrifier BiGRU neural network is C = [c1, c2, …, cN].
Step five: obtain the relations between course knowledge-point named entities and knowledge entities by combining the CRF model.
The CRF model serves as the label score layer: the CRF model first calculates the label score of each word under each label, the Viterbi algorithm is then applied to find the label sequence with the highest total score, and the relations between course knowledge-point named entities and knowledge entities are obtained through the relation extraction layer.
The label score of word s_i over the P labels is computed as e_i = V^(ner) f(U^(ner) h_i + b^(ner)), wherein: the superscript (ner) denotes knowledge-entity label recognition; V^(ner) and U^(ner) are weight matrices and b^(ner) is a bias vector, with V^(ner) ∈ R^(P×l), U^(ner) ∈ R^(l×2d), b^(ner) ∈ R^l; l is the layer width of the CRF model and d is the number of hidden units of the Mogrifier BiGRU neural network; f(·) is a nonlinear activation function.
Labels are assigned to all words in sentence S, yielding a label sequence for S; each sentence S has R = P^N possible label sequences, and the label score of S under the r-th label sequence is

score(S, Y_r) = Σ_{i=1}^{N} ( e_{i,r} + A_{(i,r),(i+1,r)} )

Wherein: Y_r denotes the r-th label sequence, Y = [Y_1, Y_2, …, Y_R], r = 1, 2, …, R; e_{i,r} denotes the label score of word s_i under the label assigned to it by the r-th label sequence; score(S, Y_r) denotes the label score of sentence S under the r-th label sequence; A_{(i,r),(i+1,r)} denotes the score of transitioning from the label assigned to s_i to the label assigned to s_{i+1} under the r-th label sequence, with A the transition matrix, A ∈ R^((P+2)×(P+2)).
The label scores of the label sequences are normalized to obtain the probability of each label sequence:

p(Y_r | S) = exp(score(S, Y_r)) / Σ_{r'=1}^{R} exp(score(S, Y_{r'}))
The label score layer is trained using a method that minimizes cross-entropy loss.
When the relation extraction layer extracts the relations between course knowledge-point named entities and knowledge entities, the relation score between word s_i and word s_j under a given relation q is first calculated:

S^(re)(m_j, m_i, q) = V^(re) f(U^(re) m_j + W^(re) m_i + b^(re))

Wherein: m_i = [c_i; g_i] and m_j = [c_j; g_j]; the superscript (re) denotes relation recognition; V^(re), U^(re), and W^(re) are weight matrices and b^(re) is a bias vector, with V^(re) ∈ R^l, U^(re) ∈ R^(l×(2a+d)), W^(re) ∈ R^(l×(2a+d)), b^(re) ∈ R^l; l is the layer width of the relation extraction layer, d is the number of hidden units of the Mogrifier BiGRU neural network, and a is the dimension of the label; f(·) is a nonlinear activation function.
The probability that word s_i and word s_j hold relation q is obtained by applying a sigmoid to the relation score:

p(q | s_j, s_i) = σ(S^(re)(m_j, m_i, q))
the relationship extraction layer is trained using a method of minimizing cross-entropy loss.
The cross-entropy loss L_RE in training the relation extraction layer is calculated over all candidate word pairs and relations:

L_RE = − Σ_{i,j,q} [ y_{i,j,q} log p(q | s_j, s_i) + (1 − y_{i,j,q}) log(1 − p(q | s_j, s_i)) ]

wherein y_{i,j,q} is 1 when words s_i and s_j hold relation q in the annotation and 0 otherwise. The objective function is min(L_NER + L_RE), wherein L_NER is the cross-entropy loss in training the label score layer.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (10)
1. An entity and relation combined extraction method oriented to the education field is characterized in that: the method comprises the following steps:
(1) establishing a course knowledge point named entity corpus, wherein the course knowledge point named entity corpus consists of text data containing course knowledge points;
(2) carrying out distributed representation on the preprocessed text data containing the course knowledge points, taking sentences as input, and obtaining a text pre-training vector through an XLNET language model;
(3) inputting the obtained text pre-training vector into a Mogrifier BiGRU neural network for text feature extraction;
(4) introducing a MultiHead Attention mechanism after the Mogrifier BiGRU neural network to capture the more important parts of the text features; the important parts are those parts of the text features that can form knowledge entities;
(5) and obtaining the relationship between the course knowledge point named entity and the knowledge entity by combining the CRF model.
2. The method of claim 1, characterized in that: in step (1), knowledge-entity labeling is performed on the text data in the course knowledge-point named entity corpus with a BIO labeling method: the text data is divided into P categories, each category being a label, with the p-th category represented as label p, p = 1, 2, …, P; the relations between knowledge entities are divided into Q relations, with the q-th relation represented as relation q, q = 1, 2, …, Q; the text data is divided into a training set and a test set; in the BIO labeling method, B marks the beginning of a knowledge entity, I marks the remaining parts of a knowledge entity, and O marks non-entity tokens.
3. The method of claim 1, characterized in that: in step (2), a sentence input to the XLNet language model is denoted S = [s1, s2, …, sN], and the text pre-training vector output by the XLNet language model is denoted X = [x1, x2, …, xN], where si is the i-th word of sentence S, xi is the pre-training vector of word si, and i = 1, 2, …, N.
4. The method of claim 3, characterized in that: in step (3), the Mogrifier BiGRU neural network comprises a forward GRU network and a backward GRU network; the input and hidden-layer output of the Mogrifier BiGRU are X = [x1, x2, …, xN] and H = [h1, h2, …, hN] respectively, those of the forward GRU are x_t^→ and h_t^→, and those of the backward GRU are x_t^← and h_t^←.
The superscripts t and t−1 denote time steps t and t−1. Before each GRU update, the input x_t and the previous hidden state interact bidirectionally over multiple rounds.
For the forward GRU network, the interaction (following the Mogrifier scheme) is:
x_t ← 2σ(R1 h_{t−1}^→) ⊙ x_t,  h_{t−1}^→ ← 2σ(R2 x_t) ⊙ h_{t−1}^→
For the backward GRU network, the interaction is symmetric:
x_t ← 2σ(R3 h_{t+1}^←) ⊙ x_t,  h_{t+1}^← ← 2σ(R4 x_t) ⊙ h_{t+1}^←
Wherein: σ is the logistic (sigmoid) function, ⊙ is element-wise multiplication, and R1, R2, R3, R4 are model parameters.
5. The method of claim 4, characterized in that: in step (4), a MultiHead Attention mechanism is introduced after the Mogrifier BiGRU neural network to further capture the context semantics of each word si and highlight the importance of keywords in the sentence S by assigning attention weights; the MultiHead Attention mechanism serves as the attention layer.
6. The method of claim 5, characterized in that: the computation of the MultiHead Attention mechanism includes the following steps:
(41) map the output H = [h1, h2, …, hN] of the Mogrifier BiGRU neural network into the three vectors K, Q, V;
(42) compute the j-th attention head of the MultiHead Attention mechanism: head_j = Attention(Q W_j^Q, K W_j^K, V W_j^V), where Attention(Q, K, V) = softmax(Q K^T / √d_k) V; W_j^Q, W_j^K, W_j^V are three global parameter matrices, d_N denotes the input dimension of the MultiHead Attention mechanism, D denotes the total number of attention heads, and d_k = d_q = d_v = d_N / D;
(44) concatenate the D attention heads to obtain the multi-head attention B = Concat(head_1, …, head_D) W^o, where W^o is a weight matrix and the element b_ij in row i, column j of B denotes the weight of word s_i on the j-th attention head;
(45) combine the hidden state h_i of word s_i with the attention weights b_ij to generate the content vector c_i of word s_i;
(46) the important part of the text features captured by the MultiHead Attention mechanism introduced after the Mogrifier BiGRU neural network is C = [c1, c2, …, cN].
7. The method of claim 5, characterized in that: in step (5), the CRF model serves as the label score layer: the CRF model calculates the label score of each word under each label, the Viterbi algorithm is applied to find the label sequence with the highest total score, and the relations between course knowledge-point named entities and knowledge entities are then obtained through the relation extraction layer.
8. The method of claim 7, characterized in that: in step (5), the CRF model computes the label score of word s_i over the P labels as
e_i = V^(ner) f(U^(ner) h_i + b^(ner))
wherein: the superscript (ner) denotes knowledge-entity label recognition; V^(ner) and U^(ner) are weight matrices and b^(ner) is a bias vector, with V^(ner) ∈ R^(P×l), U^(ner) ∈ R^(l×2d), b^(ner) ∈ R^l; l is the layer width of the CRF model and d is the number of hidden units of the Mogrifier BiGRU neural network; f(·) is a nonlinear activation function;
labels are assigned to all words in sentence S, yielding a label sequence for S; each sentence S has R = P^N possible label sequences, and the label score of S under the r-th label sequence is
score(S, Y_r) = Σ_{i=1}^{N} ( e_{i,r} + A_{(i,r),(i+1,r)} )
wherein: Y_r denotes the r-th label sequence, Y = [Y_1, Y_2, …, Y_R], r = 1, 2, …, R; e_{i,r} denotes the label score of word s_i under the label assigned to it by the r-th label sequence; score(S, Y_r) denotes the label score of sentence S under the r-th label sequence; A_{(i,r),(i+1,r)} denotes the score of transitioning from the label assigned to s_i to the label assigned to s_{i+1} under the r-th label sequence, with A the transition matrix, A ∈ R^((P+2)×(P+2));
the label scores of the label sequences are normalized to obtain the probability of each label sequence:
p(Y_r | S) = exp(score(S, Y_r)) / Σ_{r'=1}^{R} exp(score(S, Y_{r'}))
The label score layer is trained using a method that minimizes cross-entropy loss.
9. The method of claim 8, characterized in that: in step (5), when the relation extraction layer extracts the relations between course knowledge-point named entities and knowledge entities, the relation score between word s_i and word s_j under a given relation q is first calculated:
S^(re)(m_j, m_i, q) = V^(re) f(U^(re) m_j + W^(re) m_i + b^(re))
wherein: m_i = [c_i; g_i] and m_j = [c_j; g_j]; the superscript (re) denotes relation recognition; V^(re), U^(re), and W^(re) are weight matrices and b^(re) is a bias vector, with V^(re) ∈ R^l, U^(re) ∈ R^(l×(2a+d)), W^(re) ∈ R^(l×(2a+d)), b^(re) ∈ R^l; l is the layer width of the relation extraction layer, d is the number of hidden units of the Mogrifier BiGRU neural network, and a is the dimension of the label; f(·) is a nonlinear activation function;
the probability that word s_i and word s_j hold relation q is p(q | s_j, s_i) = σ(S^(re)(m_j, m_i, q)).
the relationship extraction layer is trained using a method of minimizing cross-entropy loss.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011252896.4A CN112364654A (en) | 2020-11-11 | 2020-11-11 | Education-field-oriented entity and relation combined extraction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112364654A true CN112364654A (en) | 2021-02-12 |
Family
ID=74515944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011252896.4A Pending CN112364654A (en) | 2020-11-11 | 2020-11-11 | Education-field-oriented entity and relation combined extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364654A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113553385A (en) * | 2021-07-08 | 2021-10-26 | 北京计算机技术及应用研究所 | Relation extraction method of legal elements in judicial documents |
CN113553385B (en) * | 2021-07-08 | 2023-08-25 | 北京计算机技术及应用研究所 | Relation extraction method for legal elements in judicial document |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | Facial expression recognition with grid-wise attention and visual transformer | |
CN112115238B (en) | Question-answering method and system based on BERT and knowledge base | |
Qiu et al. | DGeoSegmenter: A dictionary-based Chinese word segmenter for the geoscience domain | |
CN107943784B (en) | Relationship extraction method based on generation of countermeasure network | |
CN111160037B (en) | Fine-grained emotion analysis method supporting cross-language migration | |
CN111581401B (en) | Local citation recommendation system and method based on depth correlation matching | |
Chen et al. | A semantics-assisted video captioning model trained with scheduled sampling | |
CN109508459B (en) | Method for extracting theme and key information from news | |
CN111858896B (en) | Knowledge base question-answering method based on deep learning | |
Li et al. | UD_BBC: Named entity recognition in social network combined BERT-BiLSTM-CRF with active learning | |
Han et al. | A survey of transformer-based multimodal pre-trained modals | |
Xu et al. | Enhanced attentive convolutional neural networks for sentence pair modeling | |
Puscasiu et al. | Automated image captioning | |
CN112905736A (en) | Unsupervised text emotion analysis method based on quantum theory | |
CN111242059A (en) | Method for generating unsupervised image description model based on recursive memory network | |
CN113360667B (en) | Biomedical trigger word detection and named entity identification method based on multi-task learning | |
CN113297375B (en) | Document classification method, system, device and storage medium based on label | |
Elleuch et al. | The Effectiveness of Transfer Learning for Arabic Handwriting Recognition using Deep CNN. | |
CN112101014B (en) | Chinese chemical industry document word segmentation method based on mixed feature fusion | |
CN112364654A (en) | Education-field-oriented entity and relation combined extraction method | |
CN116680407A (en) | Knowledge graph construction method and device | |
Goel et al. | Injecting prior knowledge into image caption generation | |
CN116127954A (en) | Dictionary-based new work specialized Chinese knowledge concept extraction method | |
CN115455144A (en) | Data enhancement method of completion type space filling type for small sample intention recognition | |
CN115409028A (en) | Knowledge and data driven multi-granularity Chinese text sentiment analysis method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||