CN112214610A - Entity relation joint extraction method based on span and knowledge enhancement - Google Patents

Entity relation joint extraction method based on span and knowledge enhancement Download PDF

Info

Publication number
CN112214610A
CN112214610A (application CN202011021524.0A)
Authority
CN
China
Prior art keywords
entity
span
relationship
graph
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011021524.0A
Other languages
Chinese (zh)
Other versions
CN112214610B (en)
Inventor
张骁雄
刘姗姗
丁鲲
张雨豪
张慧
刘茗
蒋国权
漆桂林
周晓磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202011021524.0A priority Critical patent/CN112214610B/en
Publication of CN112214610A publication Critical patent/CN112214610A/en
Application granted granted Critical
Publication of CN112214610B publication Critical patent/CN112214610B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS — G06 COMPUTING; CALCULATING OR COUNTING — G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/367 Ontology (creation of semantic tools for unstructured textual data)
    • G06F16/35 Clustering; classification (information retrieval of unstructured textual data)
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06F40/205 Parsing (natural language analysis)
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N3/045 Combinations of networks (neural network architectures)
    • G06N3/08 Learning methods (neural networks)
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Operations Research (AREA)
  • Algebra (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an entity relationship joint extraction method based on span and knowledge enhancement, belonging to the technical field of information extraction and natural language processing. First, a sample data set is constructed and labeled. Then entity recognition and relation classification are carried out: for the labeled data, a pre-trained language model maps words from a high-dimensional discrete space to low-dimensional continuous space vectors; a span-based model performs span identification, filtering, and relation classification; a graph-based model converts the relation classification into a graph classification and introduces syntactic dependency relations to assist relation judgment and classification; finally, the outputs of the span-based model and the graph-based model are jointly trained to identify the entities contained in the data and the relations among them. By introducing syntactic information such as dependency relations into an end-to-end neural network model, the invention effectively identifies overlapping relations and improves the accuracy of entity relationship joint extraction.

Description

Entity relation joint extraction method based on span and knowledge enhancement
Technical Field
The invention belongs to the technical field of information extraction and natural language processing, and particularly relates to an entity relationship joint extraction method based on span and knowledge enhancement.
Background
Extracting entities and their interrelationships plays a crucial role in understanding text. Specifically, named entity recognition identifies entities with specific meanings in a text and determines their types (person names, place names, organization names, proper nouns, and the like), while relation classification determines the type of relationship existing between a given entity pair. Both are particularly critical in determining the structure of a text for downstream tasks such as knowledge graph construction and knowledge-based question answering.
The traditional entity relationship extraction method is a pipeline process: named entity recognition and relation classification are treated as two independent subtasks. Given a piece of text, the entities in it are identified first, and the relationship types between the identified entities are then judged. Although the pipeline method is easy to implement, error propagation easily occurs: if an error is made during named entity recognition, the subsequent relation classification is affected. To address this, some recent research proposes joint entity-relationship extraction methods that fully mine the latent dependencies between entities and their relations, so that named entity recognition and relation classification can reinforce each other. Although joint extraction effectively alleviates the error-propagation problem of the pipeline method, it places high demands on data set annotation, requiring a large amount of high-quality labeled data to train the model; yet annotating data in a particular domain is time-consuming and difficult. Meanwhile, existing entity relationship extraction methods based on end-to-end neural networks cannot fully mine syntactic and semantic information within sentences, and data labeled with tagging schemes such as BIO/BILOU neglects phenomena such as overlapping relations and multiple labels, which degrades the effect of entity relationship extraction.
Disclosure of Invention
The technical problem is as follows: aiming at the poor extraction performance of existing entity relationship extraction methods, the invention provides an entity relationship joint extraction method based on span and knowledge enhancement, which introduces syntactic information such as dependency relations into an end-to-end neural network model, identifies overlapping relations, and further improves entity relationship extraction accuracy.
The technical scheme is as follows: the invention discloses a span and knowledge enhancement based entity relationship joint extraction method, which comprises the following steps:
s1: building a data set
Collecting data of a specific field, cleaning the collected data and constructing a data set of the field;
s2: annotating data
Randomly selecting a plurality of data in the data set, manually marking, and automatically marking the data which are not manually marked in the data set by using a regular template;
s3: entity identification and relationship classification
For the labeled data, mapping words in a high-dimensional discrete space to low-dimensional continuous space vectors by using a pre-trained language model to obtain embedded encodings;
performing span identification, filtering and relationship classification by using a span-based model;
converting the relationship classification into a graph classification by using a graph-based model, and introducing a syntactic dependency relationship so as to assist the relationship judgment classification;
and performing joint training on the output result of the span-based model and the output result of the graph-based model, and identifying entities contained in the data and relationships among the entities.
Further, in step S2, when the data is manually labeled, the entity location information, the entity type, and the relationship between the entities of the data are labeled.
Further, in step S2, when the regular templates are used to label the data automatically, the entity types and the relationships between entities are preset; the regular templates are written for the domain of the data set using the knowledge of domain experts, and the preset entity types and inter-entity relationships are labeled in the data by means of template matching.
Further, in step S3, the pre-trained language model adopts a BERT model, which obtains vector representations of words by effectively encoding context information.
Further, in step S3, the span-based model includes an entity classifier, a span filter, and a relationship classifier; the entity classifier judges and classifies entities, the span filter filters out spans that are not entities, and the relationship classifier then judges the entity relationship type for classification.
Further, the method for classifying entities with the entity classifier comprises the following steps:
inputting a candidate span (t_i, t_{i+1}, ..., t_{i+k}), embedded and encoded by the pre-trained network model, into the entity classifier, and obtaining the vector representation f(t_i, t_{i+1}, ..., t_{i+k}) of the entity through one max-pooling operation; splicing it with the special classification vector t_cls obtained by encoding with the BERT model and the vector t_width encoding the span width, to obtain the vector representation of the final entity:

e(s_i) = [f(t_i, t_{i+1}, ..., t_{i+k}); t_cls; t_width]

where i and k both denote indices; the spliced result e(s_i) is then input into a fully connected layer and activated with softmax to obtain the probability distribution of the entity type:

P(e_i | s_i) = softmax(W_i · e(s_i) + b_i)

where e_i denotes an entity type, W_i is a weight, b_i is a bias, and s_i denotes the i-th span; the entity type is judged and classified through this probability distribution.
Further, the method for filtering spans with the span filter is as follows: in the entity-type probability distribution produced by the entity classifier, if the "none" type has the highest probability, the span is identified as "none", judged not to be an entity, and filtered out.
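The entity-classifier and span-filter steps above can be sketched numerically in plain Python; the toy embeddings, weight matrix, and type inventory below are invented for illustration and stand in for the trained BERT vectors and learned parameters.

```python
# Minimal sketch of span classification: max-pool the span's token vectors,
# splice with t_cls and t_width, apply a softmax-activated linear layer,
# and let the span filter discard spans classified as "none".
import math

def max_pool(vectors):
    """Element-wise max over the token vectors of a span: f(t_i..t_{i+k})."""
    return [max(col) for col in zip(*vectors)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify_span(span_tokens, t_cls, t_width, W, b, types):
    e_s = max_pool(span_tokens) + t_cls + t_width          # [f; t_cls; t_width]
    logits = [sum(w * x for w, x in zip(row, e_s)) + bi    # fully connected layer
              for row, bi in zip(W, b)]
    probs = softmax(logits)
    return types[probs.index(max(probs))], probs

# Toy setup: 2-d token vectors, 2-d CLS, 1-d width embedding -> 5-d input;
# three types, where "none" marks spans the span filter discards.
types = ["none", "equipment", "person"]
W = [[0.0] * 5,
     [1.0, 1.0, 0.2, 0.2, 0.1],
     [-1.0, -1.0, 0.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]
span = [[0.9, 0.1], [0.8, 0.3]]        # embeddings of the span's tokens
label, probs = classify_span(span, t_cls=[0.5, 0.5], t_width=[0.2],
                             W=W, b=b, types=types)
# The span filter keeps this span only if label != "none".
```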
Further, the method for judging the relationship type with the relationship classifier comprises the following steps:
splicing the span-encoded vector representations e(s_i) and e(s_j) obtained by the entity classifier with the encoded vector representation c(s_i, s_j), obtained by embedding and encoding the context between the two spans, to obtain the relation representation; since the relation between an entity pair is directional, every entity pair has two opposite relation representations:

r_{i,j} = [e(s_i); c(s_i, s_j); e(s_j)]
r_{j,i} = [e(s_j); c(s_j, s_i); e(s_i)]

where r_{i,j} and r_{j,i} respectively denote the relations between the i-th entity and the j-th entity, and i and j denote indices;

inputting the relation representation into a fully connected layer and activating it with a sigmoid function to obtain the probability distribution of the relation type:

P(r_{i,j}) = σ(W_{i,j} · r_{i,j} + b_{i,j})
P(r_{j,i}) = σ(W_{j,i} · r_{j,i} + b_{j,i})

where W_{i,j} and W_{j,i} denote weights, b_{i,j} and b_{j,i} denote biases, and σ(·) denotes the sigmoid function; the relation type between the entity pair is judged and classified through the obtained probability distribution of the relation type.
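The directed relation scoring above admits a small numeric sketch; the vectors, weights, and relation names are illustrative placeholders, not trained values.

```python
# Sketch of directed relation classification: concatenate
# [e(s_i); c(s_i,s_j); e(s_j)] and apply a sigmoid-activated linear layer,
# once per direction of the entity pair.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relation_probs(e_i, ctx, e_j, W, b):
    r = e_i + ctx + e_j                                   # relation representation
    return [sigmoid(sum(w * x for w, x in zip(row, r)) + bi)
            for row, bi in zip(W, b)]                     # one probability per type

e_head, e_tail, ctx = [1.0, 0.0], [0.0, 1.0], [0.5]       # e(s_i), e(s_j), c(s_i,s_j)
W = [[2.0, 0.0, 1.0, 0.0, 2.0],                           # toy "deployed" row
     [-2.0, 0.0, 0.0, 0.0, -2.0]]                         # toy "located" row
b = [-1.0, 0.0]
p_ij = relation_probs(e_head, ctx, e_tail, W, b)          # r_{i,j}: head -> tail
p_ji = relation_probs(e_tail, ctx, e_head, W, b)          # r_{j,i}: tail -> head
# Because head and tail are swapped, the two directed scores differ.
```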
Further, the method for judging and classifying the entity relationship with the graph-based model comprises the following steps:
obtaining the dependency parse tree of the sentence with the HanLP natural language processing tool, converting it into an adjacency matrix, and obtaining the input graph G_i of the graph-based model; then inputting the graph G_i into the graph isomorphism network (GIN) implemented with CogDL, which obtains the vector representation of the whole graph by repeatedly and iteratively learning the features of neighboring nodes:

h_v^{(k)} = MLP^{(k)}((1 + ε^{(k)}) · h_v^{(k-1)} + Σ_{u∈N(v)} h_u^{(k-1)})
h_{G_i} = READOUT({h_v^{(K)} : v ∈ G_i})

the vector representation h_{G_i} of the graph is input into a fully connected layer and activated with softmax to obtain the probability distribution of the graph classification:

P(g | G_i) = softmax(W_g · h_{G_i} + b_g)

where W_g denotes the weight and b_g denotes the bias; the entity relationship is judged and classified with the probability distribution of the graph classification.
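A minimal sketch of the graph branch follows, assuming a toy dependency parse and hand-picked node features in place of the HanLP + CogDL pipeline named in the text; the MLP inside each GIN layer is omitted so the neighbor aggregation stays visible.

```python
# A dependency parse is represented as an adjacency list, node features are
# aggregated GIN-style ((1+eps)*self + sum of neighbors), and a sum readout
# produces the whole-graph vector h_G used for classification.
def gin_layer(features, adj, eps=0.0):
    out = {}
    for v, h_v in features.items():
        agg = [x * (1.0 + eps) for x in h_v]              # (1 + eps) * h_v
        for u in adj.get(v, []):                          # + sum over neighbors
            agg = [a + x for a, x in zip(agg, features[u])]
        out[v] = agg                                      # MLP omitted in this sketch
    return out

def readout(features):
    """Sum-pool the node vectors into the graph vector h_G."""
    return [sum(col) for col in zip(*features.values())]

# Toy undirected dependency edges for a three-word sentence.
adj = {"deployed": ["troops", "missiles"],
       "troops": ["deployed"],
       "missiles": ["deployed"]}
feats = {"troops": [1.0, 0.0], "deployed": [0.0, 1.0], "missiles": [1.0, 1.0]}
h_graph = readout(gin_layer(feats, adj))
```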
Further, the method for jointly training the output of the span-based model and the output of the graph-classification model, and identifying the entities contained in the data and the relationship types between them, comprises the following steps:

obtaining the entity-recognition loss γ_e of the span-based model with the cross-entropy loss function:

γ_e = -(1/N) Σ_{n=1}^{N} Σ_{c_e=1}^{M_e} y_{c_e} log(p_{c_e})

where M_e is the number of entity types; y_{c_e} is an indicator variable taking the value 0 or 1, equal to 1 if the class is the same as the sample class and 0 otherwise; p_{c_e} is the predicted probability that the observed entity span belongs to entity class c_e; N denotes the total number of samples in the data set; and e is the identifier of the entity task;

obtaining the relation-classification loss γ_r of the span-based model with the BCEWithLogits loss function:

γ_r = -(1/N) Σ_{n=1}^{N} [y_r log(σ(p_r)) + (1 - y_r) log(1 - σ(p_r))]

where y_r is an indicator variable denoting whether the predicted relation type is the same as the sample type, and r is the identifier of the relation task;

obtaining the graph-classification loss γ_g of the graph-based model with the cross-entropy loss function:

γ_g = -(1/N) Σ_{n=1}^{N} Σ_{c_g=1}^{M_g} y_{c_g} log(p_{c_g})

where M_g is the number of relation types; y_{c_g} is an indicator variable; p_{c_g} is the predicted probability that the observed graph belongs to class c_g; and g is the relation identifier in the graph classification;

joint training is performed with the following formula to obtain the joint loss γ:

γ = γ_e + γ_r + f(·) · γ_g

where f(·) is a linear function, taken as f(x) = x / N, with x denoting the number of input samples and N denoting the total number of samples in the data set.
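Once the three losses are computed, the joint objective reduces to simple arithmetic; a sketch with placeholder loss values:

```python
# Joint loss gamma = gamma_e + gamma_r + f(.) * gamma_g with f(x) = x / N,
# so the graph-classification loss is weighted by the fraction of samples seen.
def joint_loss(gamma_e, gamma_r, gamma_g, x, N):
    """Combine the three losses; x is the number of input samples, N the total."""
    return gamma_e + gamma_r + (x / N) * gamma_g

loss = joint_loss(gamma_e=0.8, gamma_r=0.4, gamma_g=0.6, x=50, N=100)
# 0.8 + 0.4 + 0.5 * 0.6 = 1.5
```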
Advantageous effects: compared with the prior art, the invention has the following advantages:
The invention discloses an entity relationship joint extraction method based on span and knowledge enhancement for solving joint entity-relationship extraction in a specific field. The method is composed of a span-based model and a graph-based model: the span-based model performs entity recognition and relation classification using contextual representations of the text, while the graph-based model performs a graph-classification task using the syntax tree obtained by dependency parsing, so as to effectively judge the relation type. The model of the invention introduces syntactic information such as dependency relations into the end-to-end neural network model, thereby effectively identifying overlapping relations and improving the accuracy of entity relationship joint extraction.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is an exemplary diagram of a manual annotation in an embodiment of the present invention;
FIG. 3 is a flowchart of a process for automatic tagging in an embodiment of the present invention;
FIG. 4 is an exemplary diagram of a canonical template in an embodiment of the invention;
FIG. 5 is a model diagram of entity relationship joint extraction according to the present invention.
Detailed Description
The invention is further described with reference to the following examples and the accompanying drawings.
Referring to fig. 1, the entity relationship joint extraction method based on span and knowledge enhancement of the present invention includes:
s1: building a data set
In the embodiment of the present invention, crawler software or a crawling program is used to crawl news texts from a portal website; in other embodiments, the data set may also be data accumulated by an enterprise or data collected in other ways. After enough data are collected, the collected data are cleaned and data that do not meet the requirements are removed, completing the construction of the data set.
For example, in one embodiment of the present invention, for news in the military field, the military news pages of a certain portal website were crawled and 840,000 military news articles were collected; articles irrelevant to the military field were filtered out using military-domain keywords, finally yielding 85,000 articles, from which a data set of 85,000 articles was constructed.
S2: annotating data
Data annotation includes manual annotation and automatic annotation. Manual annotation can make full use of expert experience, so its accuracy is relatively high; however, because the data set is large, annotation cannot be completed entirely by hand, so automatic annotation is needed to improve annotation efficiency.
In the embodiment of the invention, a number of data items in the data set are randomly selected for manual annotation, and the remaining data are annotated automatically. When data are manually annotated, the position information, entity types, and entity relationship types must be marked. Entity types and relationship types are preset before annotation; for example, for the military-field data set, the preset entity types include: equipment, person, organization, place name, military activity, job title, and combat-readiness engineering; the preset relationship types include: deployed, held, owned, and located. 338 articles were randomly extracted from the military-field data set, and experts in the military field were invited to annotate them. The experts manually annotated the extracted data according to the preset entity and relationship types, assigned a specific number to each entity appearing in an article, and marked entity positions according to where each entity starts and ends in the article. FIG. 2 shows an example of data annotated manually.
For data that is not manually labeled, in the embodiment of the present invention, labeling is performed by using a regular template, and a flow of labeling by using a regular template is shown in fig. 3:
(1) Defining entity and relationship types. In one embodiment of the invention, because entities and relations in the military field are highly complex, the type taxonomy was discussed and formulated with domain experts closely engaged with the specialty, based on the common content of the data set, so that current mainstream military entity and relation types can be summarized accurately and the extracted relation triples can be added to the construction of a military knowledge graph.
(2) Randomly extracting 100 military news texts from the data set, manually writing corresponding regular expressions for the relations and entities in each text, then testing the effect of the regular expressions on the 338 manually annotated military news texts, and supplementing missing regular expressions according to the recall value. Note that in other embodiments, a different amount of data may be extracted for writing regular expressions.
(3) Iterating: returning to step (2) and repeating it until the precision and recall of the regular extraction reach the thresholds. The whole process then ends, and the finalized regular expressions are used to extract the corresponding entities and relations from the data set and to label the data.
In the implementation of the invention, 119 relational regular expressions were designed in total; an example of a written regular template is shown in FIG. 4. The matching results of the regular templates on the annotated data set are analyzed, and a regex-extracted relation is matched to a manually annotated relation statement according to two criteria: the type predefined by the relational regular expression is the same as the manually annotated type, or the head and tail entities of the manually annotated relation sentence appear in the sentence extracted by the regular expression.
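A regular-template match of the kind described can be sketched with Python's re module; the pattern, entity shapes, and relation type below are invented examples, not one of the 119 actual templates.

```python
# Toy template: "<head> deployed <tail>" -> relation "deployed", recording
# the matched head/tail strings and their character spans for labeling.
import re

PATTERN = re.compile(r"(?P<head>\w+(?: \w+)?) deployed (?P<tail>[\w-]+(?: \w+)?)")

def auto_label(sentence, relation="deployed"):
    m = PATTERN.search(sentence)
    if not m:
        return None                                   # template does not match
    return {"head": m.group("head"), "relation": relation,
            "tail": m.group("tail"),
            "head_span": m.span("head"), "tail_span": m.span("tail")}

triple = auto_label("The unit deployed F-15 fighters near the base")
```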
After data labeling is finished, the manually annotated data and the automatically annotated data are mixed and shuffled, and then entity recognition and relationship classification are carried out as described below.
S3: entity identification and relationship classification
For the labeled data, a pre-trained language model maps words from a high-dimensional discrete space to low-dimensional continuous space vectors to obtain embedded encodings; a span-based model performs span identification, filtering, and relation classification; a graph-based model converts the relation classification into a graph classification and introduces syntactic dependency relations to assist relation classification; the outputs of the span-based model and the graph-based model are jointly trained to identify the entities contained in the data and the relations among them.
In the embodiment of the invention, the pre-trained language model adopts the BERT model released by Google and trained for Chinese, which can map words from a high-dimensional discrete space to low-dimensional continuous space vectors and obtain embedded encodings. The BERT model is a multi-layer bidirectional Transformer structure, and vector representations of words can be obtained by effectively encoding context information. For example, given a sentence containing n words, inputting it into the BERT-based embedded encoding module yields a word vector sequence {t_cls, t_1, t_2, ..., t_n} of length n+1; the BERT model adds a special classification vector t_cls, covering the information of the whole sentence, at the head of the sequence.
The span-based model comprises an entity classifier, a span filter and a relation classifier, the entity classifier is used for carrying out entity classification on the output of the BERT model, the span filter is used for filtering out non-entity spans, and then the relation classifier is used for judging and classifying entity relations.
After the span-based model obtains the BERT-based text vector representation, spans are obtained with an optimized negative-sampling scheme, and spans not in the labeled entity list are defined as negative samples. For example, for a sentence whose characters gloss as (U.S., nation, F, -, 1, 5, war, fight, machine), entities that may be detected include (U.S.), (U.S. F), and (F-15 fighter), among others. Unlike the prior art, the span-based model of the invention does not perform beam search over entity and relation hypotheses; instead, a maximum N_e is set, i.e., at most N_e entities are chosen among all possible entities, and samples not labeled as positive examples in the training set are marked as negative examples. Unlike existing span-based models, the invention proposes a new way to select negative examples: first, a set S of military entities is created, containing as many of the entities in the data set as possible (the labeled data plus the results of entity regular extraction); sentences are segmented with jieba (word-segmentation software), all possible entities are obtained from the segments, and the part of speech corresponding to each segmentation result is obtained. For example, from "I am at Beijing Tiananmen", the candidates "I", "Beijing", "Tianan", and "Tiananmen" can be obtained according to part of speech. Filtering is first done by part of speech, keeping only nouns; similarity is then computed between each noun and the entities in the entity set S, and the highest similarity value is taken as the score of that segmentation result; finally, candidates are sorted so that higher similarity means higher priority as a negative example. If N_e is not reached after filling with the segmentation results, random spans are selected, and the random spans may select entities with lengths of 2 to 10.
For example, in the military corpus studied in the embodiment of the present invention, entity lengths fall substantially in this range, and entities that better match the characteristics of military entities but are not labeled can be selected as negative examples, making the training effect of the model better.
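The similarity-ranked negative sampling can be sketched as follows; the character-overlap similarity used here is a stand-in assumption, since the text does not specify which similarity measure the implementation uses, and the entity set and candidates are invented examples.

```python
# Candidate noun segments are scored by their highest similarity to a
# known-entity set S, then sorted so the most entity-like unlabeled spans
# become negative examples first.
def char_overlap(a, b):
    """Jaccard overlap of character sets, standing in for the real similarity."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rank_negatives(candidates, entity_set, n_e):
    scored = [(max(char_overlap(c, e) for e in entity_set), c) for c in candidates]
    scored.sort(reverse=True)              # higher similarity -> higher priority
    return [c for _, c in scored[:n_e]]    # keep at most N_e negatives

S = {"F-15 fighter", "Tiananmen", "defense ministry"}
candidates = ["F-16 fighter", "weather", "ministry building"]
negatives = rank_negatives(candidates, S, n_e=2)
```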
After the span-based model has selected the possible entities, the vector representation of each entity is computed. The vector representation of an entity consists of three parts: the vector representations of the tokens contained in the entity (see FIG. 5, i.e., the entity's words mapped to their corresponding ids in the pre-trained model's dictionary), the width embedding (see FIG. 5), and the special mark CLS (see FIG. 5).
Therefore, the method for classifying entities with the entity classifier is as follows:
a candidate span (t_i, t_{i+1}, ..., t_{i+k}), embedded and encoded by the pre-trained network model, is input into the entity classifier, and the vector representation f(t_i, t_{i+1}, ..., t_{i+k}) of the entity is obtained through one max-pooling operation. The width embedding is an embedding matrix whose features are learned in training: an entity of width k+1 contains k+1 tokens, so the width embedding of the entity is indexed by k+1, and the vector of width k+1 obtained by indexing into the width matrix is t_width, i.e., the vector encoding the span width. The special mark symbol CLS is generated by the BERT model and covers the global information of the input sentence; the BERT model encodes it to obtain the special classification vector t_cls. The vector representation f(t_i, t_{i+1}, ..., t_{i+k}) of the entity is spliced with the special classification vector t_cls and the width-encoding vector t_width to obtain the vector representation of the final entity:

e(s_i) = [f(t_i, t_{i+1}, ..., t_{i+k}); t_cls; t_width]

where i and k both denote indices. The spliced result e(s_i) is input into a fully connected layer and activated with softmax; the entity types include a "none" type, and the probability distribution of the entity type is obtained:

P(e_i | s_i) = softmax(W_i · e(s_i) + b_i)

where e_i denotes an entity type, W_i is a weight, b_i is a bias, and s_i denotes the i-th span; the entity type is judged through this probability distribution.
The span filter filters spans according to the entity-type probability distribution produced by the entity classifier, removing non-entity spans: during filtering, if the "none" type has the highest probability in the distribution, the span is identified as "none", i.e., judged not to be an entity, and is therefore filtered out.
The relationship classifier performs entity relation classification, constructing and classifying relations for all possible entity pairs. First, at most N_r entities are randomly selected from the candidate entities to form the relation set. For an entity pair (s_1, s_2), the relation vector representation consists of two parts. One part is the head- and tail-entity vector representations obtained in the span identification stage; these span encodings x_{s_1} and x_{s_2} are produced by the entity classifier. The other part is textual features: besides entity features, relation extraction can also draw on the text itself. The invention does not use CLS as the text feature; instead, the text between the two entities is max-pooled, preserving the context information between the entity pair, and embedded encoding yields the text-feature vector c_{1,2}. If there is no text between the two entities, c_{1,2} is set to 0. Since the relation of an entity pair is often asymmetric and the head and tail entities of a relation cannot be swapped, each entity pair yields two opposite relation representations:

r_{i,j} = x_{s_i} ∘ c_{i,j} ∘ x_{s_j}
r_{j,i} = x_{s_j} ∘ c_{j,i} ∘ x_{s_i}

where r_{i,j} and r_{j,i} denote the relation between the i-th and j-th entities in each direction, and i and j are indices.
The relation representation is input into a fully connected layer and activated with the sigmoid function σ(·) to obtain the probability distribution over relation types:

ŷ_{i,j} = σ(W_{i,j} · r_{i,j} + b_{i,j})
ŷ_{j,i} = σ(W_{j,i} · r_{j,i} + b_{j,i})

where W_{i,j} and W_{j,i} are weights and b_{i,j} and b_{j,i} are biases; the relation type between the entity pair is determined from the resulting probability distribution.
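The asymmetric relation representation and the per-type sigmoid classifier can be sketched as below; the vectors, weights, and helper names are illustrative assumptions, not values from the invention:

```python
# Sketch of building r_ij / r_ji and scoring relation types with sigmoid.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relation_repr(head, tail, context):
    """r: head-entity vector ∘ max-pooled context ∘ tail-entity vector."""
    return head + context + tail  # list concatenation = vector concatenation

def relation_probs(r, W, b):
    """One sigmoid per relation type (multi-label scoring, not softmax)."""
    return [sigmoid(sum(w * x for w, x in zip(row, r)) + bi)
            for row, bi in zip(W, b)]

# The two directions reuse the same entity vectors with swapped positions,
# so r_ij != r_ji in general, matching the asymmetry of relations.
e1, e2, ctx = [1.0], [2.0], [0.5]
r_ij = relation_repr(e1, e2, ctx)  # head = entity 1
r_ji = relation_repr(e2, e1, ctx)  # head = entity 2
```

Because the classifier is a bank of independent sigmoids, an entity pair can in principle score highly on several relation types at once.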
The graph-based model converts relation classification into a graph classification problem and introduces syntactic dependency analysis to assist relation classification, effectively mitigating the inability of end-to-end neural network models to exploit syntactic information.
The graph-based model comprises a dependency parse tree, a graph neural network, and a graph classifier; it assists relation classification as follows. For any input sentence, the HanLP natural language processing tool is used to obtain the sentence's dependency parse tree, which is converted into an adjacency matrix to produce the input graph G_i of the graph-based model. More specifically, for the words at each node of the tree, the word vectors obtained from the BERT model are summed to form the node label; the dependency relation types between words serve as edge labels; and the relation type of the whole sentence serves as the graph label. The input graph G_i is then fed into a Graph Isomorphism Network (GIN) model implemented with the CogDL toolkit, and the features of neighboring nodes are learned over multiple iterations to obtain the representation vector h_{G_i} of the whole graph. The graph representation h_{G_i} is input into a fully connected layer and activated with softmax to obtain the probability distribution of the graph classification:

ŷ_g = softmax(W_g · h_{G_i} + b_g)

where W_g is the weight and b_g is the bias; the relation is determined from the probability distribution of the graph classification.
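The conversion of a dependency parse into the adjacency matrix of the input graph G_i can be sketched as below. The real pipeline obtains the parse from HanLP and feeds the graph to a CogDL GIN model; here the parse is assumed to be given as a list of 1-based head indices (0 marking the root), a common dependency-parse convention:

```python
# Sketch: dependency parse -> adjacency matrix for the input graph G_i.
def dep_to_adjacency(heads):
    """heads[i] is the 1-based head of token i+1; 0 marks the root.
    Returns a symmetric adjacency matrix (edges treated as undirected)."""
    n = len(heads)
    adj = [[0] * n for _ in range(n)]
    for i, h in enumerate(heads):
        if h > 0:  # the root token has no incoming dependency edge
            adj[i][h - 1] = 1
            adj[h - 1][i] = 1
    return adj
```

Node labels (summed BERT word vectors) and edge labels (dependency types) would be attached alongside this matrix before the graph is handed to the GIN model.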
The output of the span-based model and the output of the graph-based model are jointly trained, and the entities contained in the data and the relation types between them are identified, as follows:
The entity identification loss γ_e of the span-based model is obtained with the cross-entropy loss function:

γ_e = -(1/N) Σ_{c_e=1}^{M_e} y_{c_e} log(p_{c_e})

where M_e is the number of entity types; y_{c_e} is an indicator variable taking the value 0 or 1, equal to 1 if the class matches the sample class and 0 otherwise; p_{c_e} is the predicted probability that the observed entity span belongs to entity class c_e; N is the total number of samples in the dataset; and e is the identifier of the entity.
The relation classification loss γ_r of the span-based model is obtained with the BCEWithLogits loss function:

γ_r = -(1/N) Σ_{n=1}^{N} [ y_r log(ŷ_r) + (1 - y_r) log(1 - ŷ_r) ]

where y_r is an indicator variable denoting whether the predicted relation class matches the sample class; N is the total number of samples in the dataset; and r is the identifier of the relation.
The graph classification loss γ_g of the graph-based model is obtained with the cross-entropy loss function:

γ_g = -(1/N) Σ_{c_g=1}^{M_g} y_{c_g} log(p_{c_g})

where M_g is the number of relation types; y_{c_g} is an indicator variable; p_{c_g} is the predicted probability that the observed graph belongs to class c_g; and g is the relation identifier in graph classification.
Joint training is performed with the following formula to obtain the joint loss γ:

γ = γ_e + γ_r + f(·)·γ_g

where f(·) is a linear function. In a preferred embodiment of the invention, the linear function takes f(x) = x/N, where x denotes the number of input samples and N the total number of samples in the dataset.
Through joint training, entities contained in the sentences and relationship types among the entities are identified.
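The loss aggregation described above can be sketched with minimal stand-ins; the single-sample `cross_entropy` and `bce_with_logits` below are simplified versions of the framework losses (e.g. PyTorch's CrossEntropyLoss and BCEWithLogitsLoss), not the invention's exact code:

```python
# Sketch of the joint objective γ = γ_e + γ_r + f(·)·γ_g with f(x) = x/N.
import math

def cross_entropy(probs, target):
    """-log p_target; used for entity (γ_e) and graph (γ_g) classification."""
    return -math.log(probs[target])

def bce_with_logits(logit, y):
    """Numerically stable binary cross entropy on a raw logit,
    used for the relation loss γ_r."""
    return max(logit, 0) - logit * y + math.log(1 + math.exp(-abs(logit)))

def joint_loss(gamma_e, gamma_r, gamma_g, x, n):
    """γ = γ_e + γ_r + (x / N) · γ_g, the linear-function combination."""
    return gamma_e + gamma_r + (x / n) * gamma_g
```

The linear weight x/N grows with the number of samples seen, so the graph classification term contributes more as training progresses over the dataset.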
Based on the method of the present invention, a specific application example is given.
First, military news webpages of a representative website are crawled, obtaining 840,000 military news articles. Articles irrelevant to the military field or containing no military relations are filtered out using military-domain keywords, finally yielding 85,000 articles and completing the construction of the dataset. Then 338 articles are randomly extracted and domain experts are invited to label them manually. Articles without manual labels are labeled automatically using regular-expression templates; 119 relation regular expressions were designed in total to annotate the dataset. Finally, the dataset is randomly divided into a training set and a test set at a ratio of 10:1. The model parameters used in the invention are set as in Table 1.
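The regular-template auto-labeling step might look like the following sketch; the single pattern, the relation name, and the English example sentence are invented for illustration (the invention uses 119 hand-crafted relation regexes over Chinese text):

```python
# Hypothetical illustration of regex-template relation labeling:
# a pattern captures a head entity, a relation cue, and a tail entity.
import re

TEMPLATE = re.compile(r"(?P<head>\w+) is equipped with (?P<tail>\w+)")

def auto_label(sentence, relation="equipped_with"):
    """Return (head, relation, tail) triples matched by the template."""
    return [(m.group("head"), relation, m.group("tail"))
            for m in TEMPLATE.finditer(sentence)]
```

Each of the 119 templates would pair one such pattern with its target relation type, and sentences matching no template remain unlabeled.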
TABLE 1 parameter settings in the model
To demonstrate the superiority of the proposed model, its results are compared with existing models. Table 2 lists the evaluation results of the different models, compared on three metrics: precision, recall, and F1 score.
TABLE 2 evaluation results of different models
In Table 2, the result in row 1 is obtained without the graph-based model, and rows 2 to 4 show the results of hybrid models. Comparing the model of the invention with different GNN (Graph Neural Network) variants shows that the models perform differently. Although the SortPool model performs well on the graph classification task, it yields no improvement in the F1 score of the relation prediction task compared with the single model. Likewise, SpERT+PATCHY-SAN performs only moderately in both graph classification and relation extraction. The observation that the proposed model achieves the highest F1 scores in graph classification, entity identification, and relation classification shows that introducing specific external knowledge through the graph-based model can improve performance.
TABLE 3 comparison of results of different joint extraction methods
To jointly train the span-based model and the graph-based model, the entity identification loss γ_e and the relation classification loss γ_r obtained from the span-based model must be aggregated with the graph classification loss γ_g obtained from the graph-based model. Table 3 shows the extraction results for three different combination methods. The results show that, besides multiplication, addition and a linear function can also be jointly trained accurately. Meanwhile, with the linear function f(x) = x/N, the model obtains F1 scores of 76.60 and 58.57 in entity identification and relation classification respectively, higher than the other two combination methods.
The span- and knowledge-enhancement-based entity relation joint extraction method provided by the invention solves the problem of joint entity relation extraction in a specific domain. The method consists of a span-based model and a graph-based model: the span-based model performs entity identification and relation classification using contextual representations of the text, while the graph-based model performs a graph classification task using the syntax tree obtained from syntactic dependency analysis, so as to effectively determine the relation type. The model of the invention introduces syntactic information such as dependency relations into the end-to-end neural network model, thereby effectively identifying overlapping relations and improving the accuracy of joint entity relation extraction.
The above examples are only preferred embodiments of the present invention, it should be noted that: it will be apparent to those skilled in the art that various modifications and equivalents can be made without departing from the spirit of the invention, and it is intended that all such modifications and equivalents fall within the scope of the invention as defined in the claims.

Claims (10)

1. A span and knowledge enhancement based entity relation joint extraction method is characterized by comprising the following steps:
s1: building a data set
Collecting data of a specific field, cleaning the collected data and constructing a data set of the field;
s2: annotating data
Randomly selecting a plurality of data in the data set, manually marking, and automatically marking the data which are not manually marked in the data set by using a regular template;
s3: entity identification and relationship classification
For the labeled data, a pre-trained language model is used to map words from a high-dimensional discrete space to vectors in a low-dimensional continuous space, performing embedding encoding;
performing span identification, filtering and relationship classification by using a span-based model;
converting the relationship classification into a graph classification by using a graph-based model, and introducing a syntactic dependency relationship so as to assist the relationship judgment classification;
and performing joint training on the output result of the span-based model and the output result of the graph-based model, and identifying entities contained in the data and relationships among the entities.
2. The method for extracting entity relationships based on span and knowledge enhancement as claimed in claim 1, wherein in step S2, when the data is labeled manually, the entity location information, the entity type and the relationships between the entities of the data are labeled.
3. The entity relationship joint extraction method based on span and knowledge enhancement as claimed in claim 1, wherein in step S2, when the regular template is used to label the data automatically, the entity types and the relationships between entities are preset; according to the domain to which the dataset belongs, regular templates are compiled using knowledge written by domain experts, and the preset entity types and inter-entity relationships are marked in the data by means of template matching.
4. The method for entity relationship joint extraction based on span and knowledge enhancement as claimed in claim 1, wherein in step S3, the pre-trained language model employs a BERT model to obtain the vector representation of the word by effectively encoding context information.
5. The method for extracting entity relationship based on span and knowledge enhancement as claimed in claim 1, wherein in step S3, the span-based model includes an entity classifier, a span filter and a relationship classifier, the entity classifier is used to classify the entity by judgment, the span filter is used to filter out non-entity spans, and then the relationship classifier is used to classify the entity relationship type by judgment.
6. The method of claim 5, wherein the entity classifier is used to classify the entities according to the following steps:
the embeddings t_i, t_{i+1}, ..., t_{i+k} of a candidate span encoded by the pre-trained network model are input into the entity classifier; a vector representation f(t_i, t_{i+1}, ..., t_{i+k}) of the entity is obtained through one max-pooling operation and concatenated with the special classification vector t_cls obtained by BERT encoding and the span-width encoding vector t_width to obtain the final entity representation:

x_i = f(t_i, t_{i+1}, ..., t_{i+k}) ∘ t_cls ∘ t_width

where i and k are indices; the concatenated result x_i is input into a fully connected layer and the probability distribution of the entity type is obtained through softmax activation:

ŷ_e = softmax(W_i · x_i + b_i)

where e_i denotes an entity type, W_i denotes a weight, b_i denotes a bias, and s_i denotes the i-th span; the entity type is judged and classified through the probability distribution.
7. The entity relationship joint extraction method based on span and knowledge enhancement as claimed in claim 5 or 6, wherein the method for filtering the span by using the span filter is as follows: and in the probability distribution of the entity type obtained based on the entity classifier, if the probability value of the 'none' type is the highest, identifying the span as the 'none' type, judging that the span is not the entity, and filtering the span.
8. The entity relationship joint extraction method based on span and knowledge enhancement as claimed in claim 7, wherein the method for judging the relationship type by using the relationship classifier comprises:
the span encoding vectors x_{s_i} and x_{s_j} obtained from the entity classifier are concatenated with the encoded vector c_{i,j} of the context between the two spans, obtained by embedded encoding, to form the relation representation; since the relation between an entity pair is directional, each entity pair has two opposite relation representations:

r_{i,j} = x_{s_i} ∘ c_{i,j} ∘ x_{s_j}
r_{j,i} = x_{s_j} ∘ c_{j,i} ∘ x_{s_i}

where r_{i,j} and r_{j,i} denote the relation between the i-th and j-th entities in each direction, and i and j are indices;

the relation representation is input into a fully connected layer and activated with the sigmoid function to obtain the probability distribution of the relation type:

ŷ_{i,j} = σ(W_{i,j} · r_{i,j} + b_{i,j})
ŷ_{j,i} = σ(W_{j,i} · r_{j,i} + b_{j,i})

where W_{i,j} and W_{j,i} denote weights, b_{i,j} and b_{j,i} denote biases, and σ(·) denotes the sigmoid function; the relation type between the entity pair is judged and classified through the obtained probability distribution.
9. The method for extracting entity relationship based on span and knowledge enhancement as claimed in claim 1, wherein the method for judging and classifying entity relationship with the aid of graph-based model comprises:
obtaining the dependency parse tree of the sentence with the HanLP natural language processing tool, converting the parse tree into an adjacency matrix, and obtaining the input graph G_i of the graph-based model; then inputting the graph G_i into the GIN graph neural network model implemented with CogDL, and obtaining the representation vector h_{G_i} of the whole graph by repeatedly and iteratively learning the features of neighboring nodes; inputting h_{G_i} into a fully connected layer and activating with softmax to obtain the probability distribution of the graph classification:

ŷ_g = softmax(W_g · h_{G_i} + b_g)

where W_g denotes the weight and b_g denotes the bias; the entity relation is judged and classified using the probability distribution of the graph classification.
10. The method for extracting entity relationship based on span and knowledge enhancement as claimed in claim 1, wherein the method for performing joint training on the output result based on the span model and the output result based on the graph classification model and identifying the entity included in the data and the relationship type between the entities comprises:
the entity identification loss γ_e of the span-based model is obtained with the cross-entropy loss function:

γ_e = -(1/N) Σ_{c_e=1}^{M_e} y_{c_e} log(p_{c_e})

where M_e is the number of entity types; y_{c_e} is an indicator variable taking 0 or 1, equal to 1 if the class matches the sample class and 0 otherwise; p_{c_e} is the predicted probability that the observed entity span belongs to entity class c_e; N is the total number of samples in the dataset; and e is the identifier of the entity;

the relation classification loss γ_r of the span-based model is obtained with the BCEWithLogits loss function:

γ_r = -(1/N) Σ_{n=1}^{N} [ y_r log(ŷ_r) + (1 - y_r) log(1 - ŷ_r) ]

where y_r is an indicator variable denoting whether the predicted relation class matches the sample class, and r is the identifier of the relation;

the graph classification loss γ_g of the graph-based model is obtained with the cross-entropy loss function:

γ_g = -(1/N) Σ_{c_g=1}^{M_g} y_{c_g} log(p_{c_g})

where M_g is the number of relation types; y_{c_g} is an indicator variable; p_{c_g} is the predicted probability that the observed graph belongs to class c_g; and g is the relation identifier in graph classification;

joint training is performed with the following formula to obtain the joint loss γ:

γ = γ_e + γ_r + f(·)·γ_g

where f(·) is a linear function, the linear function taking f(x) = x/N, where x denotes the number of input samples and N denotes the total number of samples in the dataset.
CN202011021524.0A 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement Active CN112214610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011021524.0A CN112214610B (en) 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011021524.0A CN112214610B (en) 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement

Publications (2)

Publication Number Publication Date
CN112214610A true CN112214610A (en) 2021-01-12
CN112214610B CN112214610B (en) 2023-09-08

Family

ID=74052289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011021524.0A Active CN112214610B (en) 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement

Country Status (1)

Country Link
CN (1) CN112214610B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989835A (en) * 2021-04-21 2021-06-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Extraction method of complex medical entities
CN113051356A (en) * 2021-04-21 2021-06-29 深圳壹账通智能科技有限公司 Open relationship extraction method and device, electronic equipment and storage medium
CN113094513A (en) * 2021-04-08 2021-07-09 北京工商大学 Span representation-based end-to-end menu information extraction method and system
CN113204615A (en) * 2021-04-29 2021-08-03 北京百度网讯科技有限公司 Entity extraction method, device, equipment and storage medium
CN113240443A (en) * 2021-05-28 2021-08-10 国网江苏省电力有限公司营销服务中心 Entity attribute pair extraction method and system for power customer service question answering
CN113411549A (en) * 2021-06-11 2021-09-17 上海兴容信息技术有限公司 Method for judging whether business of target store is normal or not
CN113536795A (en) * 2021-07-05 2021-10-22 杭州远传新业科技有限公司 Method, system, electronic device and storage medium for entity relation extraction
CN113627185A (en) * 2021-07-29 2021-11-09 重庆邮电大学 Entity identification method for liver cancer pathological text naming
CN113779260A (en) * 2021-08-12 2021-12-10 华东师范大学 Domain map entity and relationship combined extraction method and system based on pre-training model
CN113791791A (en) * 2021-09-01 2021-12-14 中国船舶重工集团公司第七一六研究所 Business logic code-free development method based on natural language understanding and conversion
CN114611497A (en) * 2022-05-10 2022-06-10 北京世纪好未来教育科技有限公司 Training method of language diagnosis model, language diagnosis method, device and equipment
CN114881038A (en) * 2022-07-12 2022-08-09 之江实验室 Chinese entity and relation extraction method and device based on span and attention mechanism
CN115599902A (en) * 2022-12-15 2023-01-13 西南石油大学(Cn) Oil-gas encyclopedia question-answering method and system based on knowledge graph
US20230153533A1 (en) * 2021-11-12 2023-05-18 Adobe Inc. Pre-training techniques for entity extraction in low resource domains
CN117131198A (en) * 2023-10-27 2023-11-28 中南大学 Knowledge enhancement entity relationship joint extraction method and device for medical teaching library
CN117744657A (en) * 2023-12-26 2024-03-22 广东外语外贸大学 Medicine adverse event detection method and system based on neural network model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019839B (en) * 2018-01-03 2021-11-05 中国科学院计算技术研究所 Medical knowledge graph construction method and system based on neural network and remote supervision
US10706045B1 (en) * 2019-02-11 2020-07-07 Innovaccer Inc. Natural language querying of a data lake using contextualized knowledge bases
CN110597998A (en) * 2019-07-19 2019-12-20 中国人民解放军国防科技大学 Military scenario entity relationship extraction method and device combined with syntactic analysis
CN111339774B (en) * 2020-02-07 2022-11-29 腾讯科技(深圳)有限公司 Text entity relation extraction method and model training method

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094513A (en) * 2021-04-08 2021-07-09 北京工商大学 Span representation-based end-to-end menu information extraction method and system
CN113094513B (en) * 2021-04-08 2023-08-15 北京工商大学 Span representation-based end-to-end menu information extraction method and system
CN112989835B (en) * 2021-04-21 2021-10-08 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Extraction method of complex medical entities
CN113051356A (en) * 2021-04-21 2021-06-29 深圳壹账通智能科技有限公司 Open relationship extraction method and device, electronic equipment and storage medium
CN112989835A (en) * 2021-04-21 2021-06-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Extraction method of complex medical entities
CN113204615A (en) * 2021-04-29 2021-08-03 北京百度网讯科技有限公司 Entity extraction method, device, equipment and storage medium
CN113204615B (en) * 2021-04-29 2023-11-24 北京百度网讯科技有限公司 Entity extraction method, device, equipment and storage medium
CN113240443B (en) * 2021-05-28 2024-02-06 国网江苏省电力有限公司营销服务中心 Entity attribute pair extraction method and system for power customer service question and answer
CN113240443A (en) * 2021-05-28 2021-08-10 国网江苏省电力有限公司营销服务中心 Entity attribute pair extraction method and system for power customer service question answering
CN113411549A (en) * 2021-06-11 2021-09-17 上海兴容信息技术有限公司 Method for judging whether business of target store is normal or not
CN113411549B (en) * 2021-06-11 2022-09-06 上海兴容信息技术有限公司 Method for judging whether business of target store is normal or not
CN113536795A (en) * 2021-07-05 2021-10-22 杭州远传新业科技有限公司 Method, system, electronic device and storage medium for entity relation extraction
CN113536795B (en) * 2021-07-05 2022-02-15 杭州远传新业科技有限公司 Method, system, electronic device and storage medium for entity relation extraction
CN113627185A (en) * 2021-07-29 2021-11-09 重庆邮电大学 Entity identification method for liver cancer pathological text naming
CN113779260A (en) * 2021-08-12 2021-12-10 华东师范大学 Domain map entity and relationship combined extraction method and system based on pre-training model
CN113791791B (en) * 2021-09-01 2023-07-25 中国船舶重工集团公司第七一六研究所 Business logic code-free development method based on natural language understanding and conversion
CN113791791A (en) * 2021-09-01 2021-12-14 中国船舶重工集团公司第七一六研究所 Business logic code-free development method based on natural language understanding and conversion
US20230153533A1 (en) * 2021-11-12 2023-05-18 Adobe Inc. Pre-training techniques for entity extraction in low resource domains
CN114611497B (en) * 2022-05-10 2022-08-16 北京世纪好未来教育科技有限公司 Training method of language diagnosis model, language diagnosis method, device and equipment
CN114611497A (en) * 2022-05-10 2022-06-10 北京世纪好未来教育科技有限公司 Training method of language diagnosis model, language diagnosis method, device and equipment
CN114881038A (en) * 2022-07-12 2022-08-09 之江实验室 Chinese entity and relation extraction method and device based on span and attention mechanism
CN115599902A (en) * 2022-12-15 2023-01-13 西南石油大学(Cn) Oil-gas encyclopedia question-answering method and system based on knowledge graph
CN117131198A (en) * 2023-10-27 2023-11-28 中南大学 Knowledge enhancement entity relationship joint extraction method and device for medical teaching library
CN117131198B (en) * 2023-10-27 2024-01-16 中南大学 Knowledge enhancement entity relationship joint extraction method and device for medical teaching library
CN117744657A (en) * 2023-12-26 2024-03-22 广东外语外贸大学 Medicine adverse event detection method and system based on neural network model

Also Published As

Publication number Publication date
CN112214610B (en) 2023-09-08

Similar Documents

Publication Publication Date Title
CN112214610B (en) Entity relationship joint extraction method based on span and knowledge enhancement
CN110597735B (en) Software defect prediction method for open-source software defect feature deep learning
CN106649260B (en) Product characteristic structure tree construction method based on comment text mining
CN111639171B (en) Knowledge graph question-answering method and device
CN107133220B (en) Geographic science field named entity identification method
CN107766324B (en) Text consistency analysis method based on deep neural network
CN107729468B (en) answer extraction method and system based on deep learning
CN107463658B (en) Text classification method and device
CN107315738B (en) A kind of innovation degree appraisal procedure of text information
CN109189767B (en) Data processing method and device, electronic equipment and storage medium
CN103955451A (en) Method for judging emotional tendentiousness of short text
CN107679110A (en) The method and device of knowledge mapping is improved with reference to text classification and picture attribute extraction
CN112256939A (en) Text entity relation extraction method for chemical field
CN110377690B (en) Information acquisition method and system based on remote relationship extraction
CN111159342A (en) Park text comment emotion scoring method based on machine learning
CN111984790B (en) Entity relation extraction method
CN108304382A (en) Mass analysis method based on manufacturing process text data digging and system
CN112257441A (en) Named entity identification enhancement method based on counterfactual generation
CN115952292B (en) Multi-label classification method, apparatus and computer readable medium
CN111274494B (en) Composite label recommendation method combining deep learning and collaborative filtering technology
CN115146062A (en) Intelligent event analysis method and system fusing expert recommendation and text clustering
CN115659947A (en) Multi-item selection answering method and system based on machine reading understanding and text summarization
CN106815209B (en) Uygur agricultural technical term identification method
CN110245234A (en) A kind of multi-source data sample correlating method based on ontology and semantic similarity
CN114547232A (en) Nested entity identification method and system with low labeling cost

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant