CN112214610B - Entity relationship joint extraction method based on span and knowledge enhancement - Google Patents

Info

Publication number: CN112214610B
Application number: CN202011021524.0A
Authority: CN (China)
Prior art keywords: entity, span, relationship, graph, classification
Legal status: Active (granted)
Other versions: CN112214610A
Other languages: Chinese (zh)
Inventors: 张骁雄, 刘姗姗, 丁鲲, 张雨豪, 张慧, 刘茗, 蒋国权, 漆桂林, 周晓磊
Current assignee: National University of Defense Technology
Original assignee: National University of Defense Technology
Events: application filed by National University of Defense Technology; priority to CN202011021524.0A; publication of CN112214610A; application granted; publication of CN112214610B

Classifications

    • G06F16/367: Information retrieval of unstructured textual data; creation of semantic tools; ontology
    • G06F16/35: Information retrieval of unstructured textual data; clustering; classification
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06F40/205: Natural language analysis; parsing
    • G06F40/284: Recognition of textual entities; lexical analysis, e.g. tokenisation or collocates
    • G06F40/289: Recognition of textual entities; phrasal analysis, e.g. finite state techniques or chunking
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/08: Neural networks; learning methods
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change


Abstract

The invention discloses an entity relationship joint extraction method based on span and knowledge enhancement, belonging to the technical fields of information extraction and natural language processing. First, a sample data set is constructed and annotated. Entity recognition and relationship classification are then carried out: for the annotated data, a pre-trained language model maps words from a high-dimensional discrete space to low-dimensional continuous space vectors; a span-based model performs span identification, filtering and relationship classification; a graph-based model converts the relationship classification into graph classification and introduces syntactic dependency relations to assist relationship classification; and the output of the span-based model and the output of the graph-based model are jointly trained to identify the entities contained in the data and the relationships among them. By introducing dependency relations and other syntactic information into an end-to-end neural network model, the invention effectively identifies overlapping relations and improves the accuracy of entity relationship joint extraction.

Description

Entity relationship joint extraction method based on span and knowledge enhancement
Technical Field
The invention belongs to the technical field of information extraction and natural language processing, and particularly relates to a span and knowledge enhancement-based entity relationship joint extraction method.
Background
Extracting entities and their inherent relationships plays a vital role in understanding text. Specifically, named entity recognition and relationship classification are critical for judging text structure in downstream tasks such as knowledge graph construction and knowledge-based question answering. Named entity recognition refers to recognizing entities with specific meanings in text and judging their types (person names, place names, organization names, proper nouns, etc.); relationship classification refers to judging the type of relationship existing between a given pair of entities.
The traditional entity relation extraction method is a pipeline: named entity recognition and relation classification are treated as two independent subtasks, so that, given a text, the entities in it are first identified and the relation types among the identified entities are then judged. Although the pipeline method is easy to implement, it is prone to error propagation: errors made during named entity recognition degrade the subsequent relation classification. To address this problem, recent research has proposed joint entity relation extraction methods, which fully mine the latent dependencies between entities and relations so that the two tasks of named entity recognition and relation classification reinforce each other. Although joint extraction effectively alleviates the error-propagation problem of the pipeline method, it places high demands on data-set annotation, and a large amount of high-quality labeled data is required to train the model. However, labeling data in a particular field is time-consuming and difficult. Meanwhile, existing end-to-end neural entity relation extraction methods cannot fully mine the syntactic and semantic information within sentences, and data sets annotated under tagging schemes such as BIO/BILOU ignore phenomena such as overlapping relations and multiple labels, which also degrades the entity relation extraction results.
Disclosure of Invention
Technical problems: aiming at the poor extraction performance of existing entity relation extraction methods, the invention provides an entity relationship joint extraction method based on span and knowledge enhancement, which introduces syntactic information such as dependency relations into an end-to-end neural network model to identify overlapping relations, thereby improving the accuracy of entity relation extraction.
The technical scheme is as follows: the invention relates to a span and knowledge enhancement-based entity relationship joint extraction method, which comprises the following steps:
s1: constructing a dataset
Collecting data of a specific field, cleaning the collected data, and constructing a data set of the field;
s2: labeling data
Randomly selecting a plurality of data in the data set, manually marking the data, and automatically marking the data which are not manually marked in the data set by using a regular template;
s3: entity identification and relationship classification
Mapping words in a high-dimensional discrete space to low-dimensional continuous space vectors by using a pre-training language model for the marked data, and embedding codes;
performing span identification, filtering and relationship classification by using a span-based model;
converting the relationship classification into graph classification by using a graph-based model, and introducing syntactic dependency relationship so as to assist the relationship judgment classification;
and performing joint training on the output result of the span-based model and the output result of the graph-based model, and identifying entities and relationships among the entities contained in the data.
Further, in step S2, when the data are manually annotated, the entity position information, the entity types, and the relationships among the entities are marked.
Further, in step S2, when regular templates are used to automatically label the data, entity types and entity relationships are preset; according to the field to which the data set belongs, the regular templates are written with the knowledge of domain experts, and the preset entity types and relationships in the data are labeled by template matching.
Further, in step S3, the pre-training language model uses a BERT model to obtain a vector representation of the word through efficient encoding of the context information.
Further, in step S3, the span-based model includes an entity classifier, a span filter and a relationship classifier: the entity classifier determines and classifies entities, the span filter filters out non-entity spans, and the relationship classifier determines and classifies the relationship types between entities.
Further, the method for classifying the entities by using the entity classifier comprises the following steps:
candidate span { t } to be embedded with code by pre-trained network model i ,t i+1 ,...,t i+k Inputting into an entity classifier, and performing primary maximum pooling to obtain a vector representation f (t) i ,t i+1 ,...,t i+k ) And is encoded with a BERT model to obtain a special classification vector t cls Vector t encoding span width width Splicing to obtain vector representation of the final entity:
wherein i and k each represent a sequence number, and then splicing the spliced resultsInputting into a full connection layer and activating by softmax to obtain probability distribution of entity types:
wherein ,ei Representing entity type, W i Weight, b i For biasing, s i Representing the ith span, and judging and classifying the entity types through the probability distribution.
Further, the span filtering method using the span filter is as follows: if the "none" type has the highest probability value in the entity-type distribution produced by the entity classifier, the span is identified as the "none" type, judged not to be an entity, and filtered out.
Further, the method for judging the relationship type by using the relationship classifier comprises the following steps:
representing span-coded vectors obtained by entity classifier and />Coding vector representation +_obtained by embedded coding with context between two spans>The relation expression is obtained by splicing, and as the relation among the entity pairs is opposite, two opposite relation expressions exist among all the entity pairs, namely:
wherein ,ri,j and rj,i Respectively representing the relation between the ith entity and the jth entity, wherein i and j represent serial numbers;
inputting the relation expression into a full connection layer, and activating through a sigmoid function to obtain probability distribution of relation types:
wherein ,Wi,j and Wj,i Representing weights, b i,j and bj,i And (3) representing bias, judging and classifying the relationship types among the entity pairs through the obtained probability distribution of the relationship types, wherein sigma (·) represents a function.
Further, the method for judging and classifying the entity relationship by using the model assistance based on the graph comprises the following steps:
obtaining a dependency analysis tree of sentences by utilizing a HanLP natural language processing tool, and converting the dependency analysis tree into an adjacent matrix to obtain an input graph G based on a graph model i The method comprises the steps of carrying out a first treatment on the surface of the Then will input the graph G i Input into a graph convolution neural network model GIN realized by CogDL, and obtain vector representation of the whole graph through multiple iterative learning of the characteristics of neighbor nodes
Representing vectors of a graphThe probability distribution of the graph classification is obtained by inputting into a fully connected layer and activating by softmax:
wherein ,representing weights +.>And representing the bias, and judging and classifying the entity relationship by using probability distribution of graph classification.
Further, the method for jointly training the output of the span-based model and the output of the graph-based model and identifying the entities contained in the data and the relation types among them is as follows:
entity recognition loss gamma for span-based models using cross entropy loss functions e
wherein ,Me Is the number of entity types;to indicate a variable, the value is 0 or 1, if the class and sample class are the same as 1, otherwise 0; />For observing that the entity span belongs to the entity class c e Is used for predicting the probability of (1); n represents the total number of samples in the dataset, e is the identity of the entity;
obtaining a relationship classification loss gamma of a span-based model using a BCEWithLogits loss function r
wherein ,yr For indicating variables, representing whether the predicted relationship category is the same as the sample category, r is the identity of the relationship; obtaining graph classification loss gamma for graph-based models using cross entropy loss functions g
wherein ,Mg Is the number of relationship types;is an indicator variable; />To observe that the graph belongs to the category c g G is a relation identifier in the graph classification;
the joint training is performed by using the following formula to obtain the joint loss gamma:
γ=γ er +f(·)γ g
wherein f (·) is a linear function, and the linear function f (·) is takenWhere x represents the number of samples entered and N represents the sum of samples in the dataset.
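The loss combination above can be checked numerically. The sketch below assumes the linear schedule f(x) = x/N suggested by the description; the per-sample probabilities are toy values, not model outputs.

```python
import math

# Numeric check of the joint loss γ = γ_e + γ_r + f(·)·γ_g, assuming the
# linear schedule f(x) = x/N suggested by the description. The per-sample
# probabilities below are toy values, not model outputs.

def cross_entropy(y_onehot, p):
    return -sum(y * math.log(q) for y, q in zip(y_onehot, p))

def bce(y, p):                                  # binary cross-entropy on one label
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def joint_loss(g_e, g_r, g_g, x, N):
    return g_e + g_r + (x / N) * g_g            # f(x) = x/N weights the graph term

g_e = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])    # entity loss, one sample
g_r = bce(1, 0.9)                                   # relation loss, one sample
g_g = cross_entropy([1, 0], [0.7, 0.3])             # graph loss, one sample
loss_early = joint_loss(g_e, g_r, g_g, x=100, N=1000)
loss_late = joint_loss(g_e, g_r, g_g, x=1000, N=1000)
```

Under this schedule the graph-classification term is phased in as training proceeds, so the joint loss grows with x for the same per-task losses.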
The beneficial effects are that: compared with the prior art, the invention has the following advantages:
the invention discloses a span and knowledge enhancement-based entity relationship joint extraction method, which is used for solving the problem of entity relationship joint extraction in a specific field. The method comprises a span-based model and a graph-based model, wherein the span-based model can utilize context representation in text to perform entity identification and relationship classification, and the graph-based model utilizes a syntax tree obtained by syntactic dependency analysis to perform graph classification tasks so as to effectively judge relationship types. The model of the invention can introduce the syntax information such as the dependency relationship and the like into the end-to-end neural network model, thereby effectively identifying the overlapping relationship and improving the entity relationship joint extraction accuracy.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is an exemplary diagram of an artificial annotation in an embodiment of the invention;
FIG. 3 is a flow chart of a process for automatic labeling in an embodiment of the invention;
FIG. 4 is an exemplary diagram of a canonical template in an embodiment of the invention;
FIG. 5 is a diagram of a model of entity-relationship joint extraction according to the present invention.
Detailed Description
The invention is further illustrated by the following examples and the accompanying drawings.
Referring to fig. 1, the span and knowledge enhancement-based entity relationship joint extraction method of the present invention includes:
s1: constructing a dataset
Constructing the data set means building a data set for the field of interest, and enough data must be collected before construction. In this embodiment, crawler software or a crawling program is used to crawl news texts from a web portal; in other embodiments, data accumulated by enterprises or collected in other ways can also be used. After enough data are collected, they are cleaned, and data that do not meet the requirements are discarded, completing the construction of the data set.
For example, in one embodiment of the invention, for news in the military field, a military news page of a web portal was crawled, collecting 840 000 military news articles in total; articles irrelevant to the military field or containing no military relations were filtered out using military-field keywords, leaving 85 000 articles, so that a data set containing 85 000 articles was constructed.
S2: labeling data
Data annotation comprises manual annotation and automatic annotation. Manual annotation makes full use of expert experience and is relatively accurate, but because the data set is large, annotation cannot be completed manually alone; automatic annotation is needed to improve annotation efficiency.
In this embodiment, some data in the data set are randomly selected for manual annotation, and the remaining data are annotated automatically. During manual annotation, the entity position information, entity types and entity relationship types must be marked. The entity types and entity relationship types are preset before annotation; for example, for the military-field data set, the preset entity types include equipment, person, organization, place name, military operation, job title and combat-readiness project, and the preset relationship types include deployment, hold, own and located. From the military data set, 338 articles were randomly extracted, and experts in the military field were invited to annotate them. The experts annotated the extracted data manually according to the preset entity and relationship types, assigned each entity appearing in an article a specific number, and marked entity positions by the start and end positions of the entity in the article. Fig. 2 shows an annotation example for manual annotation.
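A minimal sketch of what one annotated record might look like, with numbered entities, start/end positions and relation triples; the field names and the example sentence are illustrative assumptions, not the embodiment's actual annotation schema (cf. fig. 2):

```python
# Hypothetical annotation record for one sentence; field names and example
# values are illustrative assumptions, not the patent's actual format.
annotation = {
    "text": "The unit deploys F-15 fighters at the airbase.",
    "entities": [
        {"id": 0, "type": "organization", "start": 0, "end": 8},    # "The unit"
        {"id": 1, "type": "equipment",    "start": 17, "end": 30},  # "F-15 fighters"
        {"id": 2, "type": "place name",   "start": 38, "end": 45},  # "airbase"
    ],
    "relations": [
        {"head": 0, "type": "deploy",  "tail": 1},
        {"head": 1, "type": "located", "tail": 2},
    ],
}

# Recover entity surface forms from the start/end offsets.
spans = [annotation["text"][e["start"]:e["end"]] for e in annotation["entities"]]
```

Storing entities by offset rather than by tag sequence is what lets span-based annotation represent overlapping entities that BIO-style tagging cannot.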
For data that are not manually annotated, this embodiment labels them with regular templates; the labeling workflow using regular expressions is shown in fig. 3:
(1) Because entities and relations in the military field are complex, the entity and relation types were formulated in discussion with domain experts whose specialty fits the field, based on the common content of the data set. In this way the currently mainstream military entity and relation types can be summarized more accurately, and the extracted relation triples can be added to the construction of a military knowledge graph.
(2) 100 military news texts are randomly extracted from the data set, and corresponding regular expressions are written manually for the relations and entities in each text. The regular expressions are then tested on the 338 manually annotated military news texts, and missing regular expressions are supplemented according to the recall value. Note that in other embodiments, other amounts of data may be extracted for writing regular expressions.
(3) Return to step (2) and repeat it until the precision and recall of the regular extraction reach a threshold. The workflow then ends, and the perfected regular expressions are used to extract the corresponding entities and relations from the data set and annotate the data.
In this embodiment, 119 relation regular expressions were designed; fig. 4 shows an example of a written regular template. Analyzing the matching results of the regular templates on the annotated data set, whether a relation rule successfully extracts a manually annotated relation statement is determined by two criteria: the predefined type of the relation regular expression is the same as the manually annotated type, or the head and tail entities of the manually annotated relation statement appear in the statement extracted by the regular expression.
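As a minimal illustration of such a relation regular template, the hypothetical pattern below extracts a "deploy" triple from an English sentence; the embodiment's 119 expressions target Chinese military news and are not reproduced here.

```python
import re

# Hypothetical relation regular template: extract (organization, deploy,
# equipment) triples from sentences of the form "<X> deploys <Y>". The
# pattern, relation name and sentence are illustrative assumptions.
DEPLOY_PATTERN = re.compile(
    r"(?P<head>[A-Z][\w\s]*?)\s+deploys\s+(?P<tail>[\w\-]+(?:\s[\w\-]+)?)"
)

def extract_deploy(sentence):
    """Return (head, relation, tail) triples matched by the template."""
    return [(m.group("head").strip(), "deploy", m.group("tail").strip())
            for m in DEPLOY_PATTERN.finditer(sentence)]

triples = extract_deploy("The Navy deploys F-15 fighters near the coast.")
```

Named groups give the head and tail entity mentions directly, which is what lets the template-matching step emit position-annotated triples.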
After the data are annotated, the manually annotated and automatically annotated data are mixed and shuffled, and then used for entity recognition and relationship classification.
S3: entity identification and relationship classification
Mapping words in a high-dimensional discrete space to low-dimensional continuous space vectors by using a pre-training language model for the marked data, and embedding codes; performing span identification, filtering and relationship classification by using a span-based model; converting the relationship classification into graph classification by using a graph-based model, and introducing syntactic dependency relationship so as to assist the relationship classification; and performing joint training on the output result of the span-based model and the output result of the graph-based model, and identifying entities and relationships among the entities contained in the data.
In this embodiment, the pre-trained language model is the Chinese BERT model released by Google, which maps words from a high-dimensional discrete space to low-dimensional continuous space vectors and produces embedded encodings. The BERT model is a multi-layer bidirectional Transformer structure that efficiently encodes context information to obtain vector representations of words. For example, after a sentence containing n words is input into the BERT-based embedding module, a word-vector sequence {t_cls, t_1, t_2, ..., t_n} of length n+1 is obtained; the BERT model prepends to the sequence a special classification vector t_cls that covers the information of the whole sentence.
The span-based model comprises an entity classifier, a span filter and a relation classifier, wherein the entity classifier is used for carrying out entity classification on the output of the BERT model, the span filter is used for filtering the span of non-entity, and then the relation classifier is used for judging and classifying the entity relation.
After obtaining the BERT-based text vector representations, the span-based model obtains spans through an optimized negative-sampling scheme, defining spans not in the annotated entity list as negative samples. For example, for the sentence "US F-15 fighter", the spans that may be detected include "US", "US F", "F-15 fighter", and so forth. Unlike prior work, the span-based model of the invention does not perform beam search over entity and relation hypotheses; instead, it sets a maximum N_e, i.e. at most N_e entities are chosen among all possible entities, and samples not labeled as positive examples in the training set are marked as negative examples. Unlike existing span-based models, the invention proposes a new way of selecting negative examples. First, a military entity set S is built, containing as many entities of the data set as possible (the annotated data plus the results of regular entity extraction). The text is then segmented with the jieba word-segmentation tool, which yields all possible candidate words together with their parts of speech; for example, segmenting a sentence such as "I love Beijing Tiananmen" yields candidates such as "I", "Beijing" and "Tiananmen". The candidates are first filtered by part of speech, keeping only nouns; the similarity between each noun and the entities in S is then computed, and the highest similarity value is taken as the score of that segmentation result. Finally, negative examples are selected in order of decreasing similarity; if N_e negatives cannot be reached, the remaining segmentation results are used to fill the quota, and in the random span-selection step, spans of length 2-10 are selected.
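The negative-example selection above can be sketched as follows. jieba segmentation and the embodiment's similarity measure are stubbed out: candidates arrive pre-segmented with parts of speech, and character-level Jaccard overlap stands in for the similarity computation (both illustrative assumptions).

```python
# Sketch of the negative-span selection: keep only unlabeled nouns, score
# each against the entity set S, and take the n_e highest-scoring ones.
# Character-level Jaccard overlap is an illustrative stand-in for the
# embodiment's similarity measure; jieba segmentation is stubbed out.

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_negative_spans(candidates, entity_set, gold_entities, n_e):
    """Keep the n_e unlabeled noun candidates most similar to entity set S."""
    scored = []
    for word, pos in candidates:
        if pos != "n" or word in gold_entities:    # only nouns not labeled positive
            continue
        score = max((jaccard(word, e) for e in entity_set), default=0.0)
        scored.append((score, word))
    scored.sort(key=lambda t: t[0], reverse=True)  # higher similarity first
    return [w for _, w in scored[:n_e]]

negatives = select_negative_spans(
    [("F-16", "n"), ("deploys", "v"), ("airbase", "n"), ("F-15", "n")],
    entity_set={"F-15 fighter", "airfield"},        # military entity set S (toy)
    gold_entities={"F-15"},                         # annotated positives
    n_e=2,
)
```

Ranking by similarity to known entities yields hard negatives that resemble real military entities, which is the stated motivation for this sampling scheme.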
In the military corpus studied in this embodiment, entity lengths basically fall within this range, and selecting unannotated spans that better match the characteristics of military entities as negative examples gives the model a better training effect.
After selecting the possible entities, the span-based model processes their vector representations. The vector representation of an entity consists of three parts: the vector representation of the tokens the entity contains (see fig. 5, i.e. mapping the entity's words to their ids in the dictionary of the pre-trained model), a width embedding (see fig. 5), and the special token CLS (see fig. 5).
Therefore, the method for classifying the entities by using the entity classifier comprises the following steps:
candidate span { t } to be embedded with code by pre-trained network model i ,t i+1 ,...,t i+k Inputting into an entity classifier, and performing primary maximum pooling to obtain a vector representation f (t) i ,t i+1 ,...,t i+k ). Width embedding is an embedding matrix learned in training (the matrix contains the features of words), namely that the width of an entity is k+1, which means that k+1 tokens are contained in the entity, and then the width of the entity is embedded as a vector expression t with k+1 as a subscript and obtained by indexing in the width matrix width I.e. vectors encoding the span width. The special mark symbol CLS is generated by a BERT model, covers the global information of the input sentence, and the BERT model codes to obtain a special classification vector t cls . Representing the vector of the entity by f (t i ,t i+1 ,...,t i+k ) Special classification vector t encoded with BERT model cls Vector t encoding span width width Splicing to obtain vector representation of the final entity:
wherein i and k each represent a sequence number, and then splicing the spliced resultsInput into a fully connected layer and activated by softmax, resulting in types of entities, including no type "none", and resulting in probability distribution of entity types:
wherein ,ei Representing entity type, W i Weight, b i For biasing, s i Representing the ith span, and judging the entity type through the probability distribution.
During filtering, if the "none" type has the highest probability value in the distribution, the span is identified as the "none" type, i.e. judged not to be an entity, and is filtered out.
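A toy forward pass of the entity classifier and the "none" filter, with hand-picked weights in place of trained parameters:

```python
import math

# Toy forward pass of the span entity classifier: max-pool the span's token
# vectors, concatenate the CLS vector and the width embedding, apply a linear
# layer + softmax, and filter out spans whose most probable type is "none".
# All vectors and weights are hand-picked toy values, not trained parameters.

TYPES = ["none", "equipment", "person"]

def max_pool(vectors):                        # f(t_i, ..., t_{i+k})
    return [max(col) for col in zip(*vectors)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def classify_span(token_vecs, cls_vec, width_embed, W, b):
    x = max_pool(token_vecs) + cls_vec + width_embed     # concatenation x_{s_i}
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi
              for row, bi in zip(W, b)]
    probs = softmax(logits)
    return TYPES[probs.index(max(probs))], probs

token_vecs = [[1.0, 0.0], [0.5, 1.0]]         # span of two embedded tokens
cls_vec, width_embed = [0.2, 0.1], [0.3]
W = [[0.0] * 5, [1.0, 1.0, 0.0, 0.0, 0.0], [-1.0, -1.0, 0.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]
label, probs = classify_span(token_vecs, cls_vec, width_embed, W, b)
keep = label != "none"                        # the span filter
```

Max-pooling makes the span representation independent of span length, while the width embedding reintroduces the length signal explicitly.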
The relationship classifier is used for classifying entity relationships, and all possible entity pairs are constructed and classified. First randomly selecting a maximum of N from possible entities r The relationship sets are composed for the entities. For a pair(s) 1 ,s 2 ) The relation vector representation of the constituted entity is composed of two parts, one part is the head-tail entity vector representation obtained by the span identification part, and the span coding representation can be obtained by the entity classifier and />The other part is a text feature. In addition to the physical features, the relation extraction also relies on text features. In the invention, CLS is not selected as text feature, but text between two entities is maximally pooled, context information between entity pairs is reserved, and coding of text feature is obtained by embedding codingCode vector representation +.>If there is no text between the two entities +.>Will be set to 0. Since the relationship of entity pairs tends to be asymmetric, the head-to-tail entities of the relationship cannot be reversed, so that each entity pair will be represented by two opposite relationships:
where r_{i,j} and r_{j,i} respectively denote the two directed relation representations between the i-th entity and the j-th entity, and i and j are indices.
The relation representation is input into a fully connected layer and activated by a sigmoid function to obtain the probability distribution of relation types:

P(r_{i,j}) = σ(W_{i,j} · r_{i,j} + b_{i,j}),  P(r_{j,i}) = σ(W_{j,i} · r_{j,i} + b_{j,i})
where W_{i,j} and W_{j,i} denote weights, b_{i,j} and b_{j,i} denote biases, and σ(·) denotes the sigmoid function; the relation type between the entity pair is determined from the resulting probability distribution.
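The two directed relation representations can be sketched as below. This is an illustrative reduction with toy dimensions; the helper names and the per-type sigmoid scoring layer are assumptions, not the invention's exact implementation.

```python
# Relation representation sketch: [head span; max-pooled context; tail span],
# with the context vector set to zeros when no text lies between the entities.
import math

def max_pool(vectors, dim):
    if not vectors:                    # no text between the two entities -> 0 vector
        return [0.0] * dim
    return [max(v[d] for v in vectors) for d in range(dim)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relation_repr(head_vec, tail_vec, context_tokens):
    ctx = max_pool(context_tokens, len(head_vec))
    return head_vec + ctx + tail_vec   # r_{i,j}; swap head/tail for r_{j,i}

def classify_relation(rel_vec, W, b):
    """One sigmoid score per relation type (BCE-style multi-label scoring)."""
    return [sigmoid(sum(w * x for w, x in zip(row, rel_vec)) + b_r)
            for row, b_r in zip(W, b)]

head = [0.2, -0.1]
tail = [0.3, 0.4]
ctx_tokens = [[0.1, 0.9], [0.5, -0.2]]
r_ij = relation_repr(head, tail, ctx_tokens)
r_ji = relation_repr(tail, head, ctx_tokens)
print(r_ij != r_ji)                    # the two directions differ, as required
```

Because concatenation is order-sensitive, r_{i,j} and r_{j,i} are distinct vectors, which is what lets the classifier distinguish head from tail.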
The graph-based model converts relation classification into a graph classification problem and introduces syntactic dependency analysis to assist relation classification, effectively alleviating the inability of end-to-end neural network models to mine syntactic information.
The method of judging and classifying relations with the graph-based model is as follows: for any input sentence, a dependency parse tree is obtained with the HanLP natural language processing tool and converted into an adjacency matrix, giving the input graph G_i of the graph-based model. More specifically, the word vectors produced by the BERT model for each node of the tree are summed to serve as node labels, the dependency relation types between words serve as edge labels, and the relation type of the whole sentence serves as the graph label. The input graph G_i is then fed into GIN (Graph Isomorphism Network), a graph convolutional neural network model implemented with the CogDL toolkit, which learns neighbor-node features over multiple iterations to obtain the representation vector h_{G_i} of the whole graph.
The graph vector h_{G_i} is input into a fully connected layer and activated by softmax to obtain the probability distribution of the graph classification:

P(g_i) = softmax(W_g · h_{G_i} + b_g)
where W_g denotes a weight and b_g a bias; the relation is judged and classified through the probability distribution of the graph classification.
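The graph classification step can be sketched with a minimal GIN-style aggregation. This is a simplification for illustration only: the per-layer MLP is reduced to identity, the dimensions are toy values, and nothing here reflects CogDL's actual API.

```python
# GIN-style sketch: each iteration updates a node as (1 + eps) * h_v plus the sum
# of its neighbors' features; a sum readout gives the graph vector h_G, which a
# softmax layer scores into relation types.
import math

def gin_layer(adj, feats, eps=0.0):
    n, dim = len(feats), len(feats[0])
    out = []
    for v in range(n):
        agg = [(1 + eps) * feats[v][d] for d in range(dim)]
        for u in range(n):
            if adj[v][u]:                  # u is a neighbor in the dependency graph
                for d in range(dim):
                    agg[d] += feats[u][d]
        out.append(agg)
    return out

def graph_readout(feats):
    """Sum readout over all nodes -> graph-level representation h_G."""
    return [sum(f[d] for f in feats) for d in range(len(feats[0]))]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# toy 3-node graph (adjacency derived from a dependency parse tree)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
for _ in range(2):                          # two GIN iterations
    feats = gin_layer(adj, feats)
h_g = graph_readout(feats)
W = [[0.5, -0.2], [0.1, 0.3]]               # toy weights for 2 relation types
probs = softmax([sum(w * x for w, x in zip(row, h_g)) for row in W])
print(len(probs))
```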
The method for carrying out joint training on the output result of the span-based model and the output result of the graph-based model and identifying the entities and the relationship types among the entities included in the data comprises the following steps:
The entity recognition loss γ_e of the span-based model is obtained with a cross-entropy loss function:

γ_e = -(1/N) Σ_{n=1}^{N} Σ_{c_e=1}^{M_e} y_{c_e} log(p_{c_e})
where M_e is the number of entity types; y_{c_e} is an indicator variable taking value 0 or 1, equal to 1 if the predicted class matches the sample class and 0 otherwise; p_{c_e} is the predicted probability that the observed entity span belongs to entity class c_e; and e is the identity of the entity.
The relation classification loss γ_r of the span-based model is obtained with a BCEWithLogits loss function:

γ_r = -(1/N) Σ_{n=1}^{N} [y_r log(p_r) + (1 - y_r) log(1 - p_r)]
where y_r is an indicator variable denoting whether the predicted relation class is the same as the sample class; p_r is the predicted probability of the relation; N denotes the total number of samples in the dataset; and r is the identity of the relation.
The graph classification loss γ_g of the graph-based model is obtained with a cross-entropy loss function:

γ_g = -(1/N) Σ_{n=1}^{N} Σ_{c_g=1}^{M_g} y_{c_g} log(p_{c_g})
where M_g is the number of relation types; y_{c_g} is an indicator variable; p_{c_g} is the predicted probability that the observed graph belongs to class c_g; and g is the relation identifier in the graph classification.
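The three losses can be illustrated numerically. The probability values below are invented for demonstration; the cross-entropy and BCE-with-logits forms are standard definitions, used here as a sketch of what the training objective computes per sample.

```python
# Toy per-sample losses: cross-entropy for entity recognition and graph
# classification, BCE-with-logits for relation classification.
import math

def cross_entropy(y_onehot, probs):
    """-sum over classes of y * log(p); only the true class contributes."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, probs) if y)

def bce_with_logits(y, logit):
    """Sigmoid applied to the raw logit, then binary cross-entropy."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

gamma_e = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])  # entity loss, true class 1
gamma_g = cross_entropy([1, 0], [0.6, 0.4])          # graph-classification loss
gamma_r = bce_with_logits(1, 2.0)                    # relation loss, positive pair
print(round(gamma_e, 4), round(gamma_r, 4))
```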
the joint training is performed by using the following formula to obtain the joint loss gamma:
γ = γ_e + γ_r + f(·)·γ_g
where f(·) is a linear function. In a preferred embodiment of the invention, the linear function is taken as f(x) = x/N, where x is the number of input samples and N is the total number of samples in the dataset.
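The joint loss combination is a one-liner; the numbers below are illustrative only:

```python
# Joint loss sketch: gamma = gamma_e + gamma_r + f(x) * gamma_g,
# with the preferred linear weighting f(x) = x / N (input samples over dataset size).
def joint_loss(gamma_e, gamma_r, gamma_g, x, N):
    f = x / N
    return gamma_e + gamma_r + f * gamma_g

loss = joint_loss(gamma_e=0.8, gamma_r=0.5, gamma_g=0.4, x=32, N=320)
print(round(loss, 4))
```

With a batch of 32 samples out of 320, the graph loss is down-weighted by 0.1, so the span-based losses dominate early aggregation.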
The entities contained in the sentences and the relation types among them are identified through this joint training.
Based on the method of the invention, a specific application example is given.
First, military news web pages of representative websites were crawled, yielding 840,000 military news articles. Articles irrelevant to the military field or containing no military relations were filtered out using military-domain keywords, leaving 85,000 articles and completing the dataset construction. Then 338 articles were randomly sampled, and domain experts were invited to annotate them manually. Articles without manual annotation were labeled automatically with regular-expression templates: 119 relation regular expressions were designed to auto-label the dataset. Finally, the dataset was randomly divided into a training set and a test set at a ratio of 10:1. The parameter settings of the model are shown in Table 1.
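Regular-template auto-labeling can be sketched as follows. The pattern, sentence, and relation name are invented for illustration; the method's real 119 templates are domain-specific and not reproduced here.

```python
# Hypothetical relation template: "<HEAD> deployed <TAIL>" -> relation "deploys".
# Named groups extract the head and tail entities; the template implies the relation.
import re

pattern = re.compile(r"(?P<head>[A-Z][\w ]+?) deployed (?P<tail>[A-Z][\w-]+)")

def auto_label(sentence):
    m = pattern.search(sentence)
    if not m:
        return None
    return {"head": m.group("head"), "relation": "deploys", "tail": m.group("tail")}

triple = auto_label("The Navy deployed F-35 fighters last week.")
print(triple)
```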
Table 1. Parameter settings of the model
To demonstrate the superiority of the model, its results are compared with existing models. Table 2 lists the evaluation results of the different models on three metrics: precision, recall, and F1 score.
Table 2. Evaluation results of different models
In Table 2, line 1 is the result without the graph-based model, and lines 2 to 4 are the results of hybrid models. Comparing the proposed model with different GNN (Graph Neural Network) variants shows that the models perform differently. Although the SortPool model performs well on graph classification tasks, it brings no improvement in the F1 score of the relation prediction task compared with the single model. Similarly, spERT+PATCH+SAN performs only moderately in both graph classification and relation extraction. The proposed model achieves the highest F1 scores in graph classification, entity recognition, and relation classification, showing that introducing specific external knowledge through the graph-based model can improve performance.
Table 3. Comparison of results of different joint extraction methods
To jointly train the span-based model and the graph-based model, the entity recognition loss γ_e and the relation classification loss γ_r of the span-based model must be aggregated with the graph classification loss γ_g of the graph-based model. Table 3 gives the extraction results of three different joint methods. The results show that, besides multiplication, addition and a linear function can also be trained accurately. With the linear function f(x) = x/N, the model obtains F1 scores of 76.60 and 58.57 in entity recognition and relation classification respectively, higher than the other two joint methods.
The invention provides a span- and knowledge-enhancement-based entity relation joint extraction method to solve the problem of entity relation joint extraction in a specific field. The method comprises a span-based model and a graph-based model: the span-based model uses contextual representations of the text for entity recognition and relation classification, while the graph-based model uses the syntax tree obtained by syntactic dependency analysis for a graph classification task, effectively judging relation types. The model can introduce syntactic information such as dependency relations into an end-to-end neural network model, thereby effectively identifying overlapping relations and improving the accuracy of entity relation joint extraction.
The above examples are only preferred embodiments of the present invention. It should be noted that several modifications and equivalents may be made by those skilled in the art without departing from the principles of the invention, and such modifications and equivalents fall within the scope of the invention.

Claims (6)

1. A military field entity relationship joint extraction method based on span and knowledge enhancement is characterized by comprising the following steps:
s1: constructing a dataset
Collecting data of a specific field, cleaning the collected data, and constructing a data set of the field;
s2: labeling data
Randomly selecting a plurality of data in the data set, manually marking the data, and automatically marking the data which are not manually marked in the data set by using a regular template;
s3: entity identification and relationship classification
Mapping words in a high-dimensional discrete space to low-dimensional continuous space vectors by using a pre-training language model for the marked data, and embedding codes;
performing span identification, filtering and relationship classification by using a span-based model;
converting the relationship classification into graph classification by using a graph-based model, and introducing syntactic dependency relationship so as to assist the relationship judgment classification;
performing joint training on the output result of the span-based model and the output result of the graph-based model, and identifying entities and relationships among the entities contained in the data;
the span-based model comprises an entity classifier, a span filter and a relation classifier, wherein the entity classifier is used for judging and classifying the entity, the span filter is used for filtering the span of the non-entity, and the relation classifier is used for judging the entity relation type for classifying;
the method for judging and classifying the entities by using the entity classifier comprises the following steps:
inputting a candidate span {t_i, t_{i+1}, ..., t_{i+k}} embedded and encoded by the pre-trained network model into the entity classifier, performing a max-pooling operation to obtain the vector representation f(t_i, t_{i+1}, ..., t_{i+k}), and concatenating it with the special classification vector t_cls encoded by the BERT model and the vector t_width encoding the span width to obtain the vector representation of the final entity:

e(s_i) = [f(t_i, t_{i+1}, ..., t_{i+k}); t_cls; t_width]
wherein i and k are indices; the concatenated result e(s_i) is then input into a fully connected layer and activated by softmax to obtain the probability distribution of entity types:

P(e_i) = softmax(W_i · e(s_i) + b_i)
wherein e_i denotes the entity type, W_i the weight, b_i the bias, and s_i the i-th span; the entity type is judged and classified through the probability distribution;
the method for converting the relationship classification into the graph classification by using the graph-based model and introducing the syntactic dependency relationship so as to assist the relationship judgment classification comprises the following steps:
obtaining a dependency parse tree of the sentence with the HanLP natural language processing tool and converting it into an adjacency matrix to obtain the input graph G_i of the graph-based model; then inputting the graph G_i into the graph convolutional neural network model GIN implemented with CogDL, and obtaining the vector representation h_{G_i} of the whole graph by learning neighbor-node features over multiple iterations;
inputting the graph vector h_{G_i} into a fully connected layer and activating with softmax to obtain the probability distribution of the graph classification:

P(g_i) = softmax(W_g · h_{G_i} + b_g)
wherein W_g denotes a weight and b_g a bias; the entity relation is judged and classified using the probability distribution of the graph classification;
the method for carrying out joint training on the output result of the span-based model and the output result of the graph-based model and identifying the entities and the relations among the entities contained in the data comprises the following steps:
obtaining the entity recognition loss γ_e of the span-based model with a cross-entropy loss function:

γ_e = -(1/N) Σ_{n=1}^{N} Σ_{c_e=1}^{M_e} y_{c_e} log(p_{c_e})
wherein M_e is the number of entity types; y_{c_e} is an indicator variable taking value 0 or 1, equal to 1 if the predicted class matches the sample class and 0 otherwise; p_{c_e} is the predicted probability that the observed entity span belongs to entity class c_e; N denotes the total number of samples in the dataset; and e is the identity of the entity;
obtaining the relation classification loss γ_r of the span-based model with a BCEWithLogits loss function:

γ_r = -(1/N) Σ_{n=1}^{N} [y_r log(p_r) + (1 - y_r) log(1 - p_r)]
wherein y_r is an indicator variable denoting whether the predicted relation class is the same as the sample class, p_r is the predicted probability of the relation, r is the identity of the relation, r_{i,j} denotes the relation between the i-th entity and the j-th entity, and i and j are indices;
obtaining the graph classification loss γ_g of the graph-based model with a cross-entropy loss function:

γ_g = -(1/N) Σ_{n=1}^{N} Σ_{c_g=1}^{M_g} y_{c_g} log(p_{c_g})
wherein M_g is the number of relation types; y_{c_g} is an indicator variable; p_{c_g} is the predicted probability that the observed graph belongs to class c_g; and g is the relation identifier in the graph classification;
the joint training is performed by using the following formula to obtain the joint loss gamma:
γ = γ_e + γ_r + f(·)·γ_g
wherein f(·) is a linear function, taken as f(x) = x/N, where x denotes the number of input samples and N denotes the total number of samples in the dataset.
2. The method for jointly extracting entity relations in the military field based on span and knowledge enhancement according to claim 1, wherein in step S2, when the data is manually marked, the entity position information, the entity type and the relation among the entities of the data are marked.
3. The method for jointly extracting entity relations in the military field based on span and knowledge enhancement according to claim 1, wherein in step S2, when the data is automatically labeled with regular templates, the entity types and the relations between entities are preset; according to the field to which the dataset belongs, the regular templates are written using knowledge provided by domain experts, and the preset entity types and inter-entity relations in the data are labeled by template matching.
4. The method for jointly extracting entity relationships in the military field based on span and knowledge enhancement according to claim 1, wherein in step S3, a pre-training language model adopts a BERT model, and a vector representation of words is obtained through efficient coding of context information.
5. The method for jointly extracting entity relations in the military field based on span and knowledge enhancement according to claim 1, wherein the method for filtering spans with the span filter is as follows: if the "none" type has the highest probability value in the entity-type probability distribution obtained by the entity classifier, the span is identified as the "none" type, judged not to be an entity, and filtered out.
6. The method for jointly extracting the entity relationships in the military field based on span and knowledge enhancement according to claim 5, wherein the method for judging the relationship types by using the relationship classifier is as follows:
representing span-coded vectors obtained by entity classifier and />Coding vector representation +_obtained by embedded coding with context between two spans>The relation expression is obtained by splicing, and as the relation among the entity pairs is opposite, two opposite relation expressions exist among all the entity pairs, namely:
wherein r_{i,j} and r_{j,i} respectively denote the relation representations between the i-th entity and the j-th entity, and i and j are indices;
inputting the relation representation into a fully connected layer and activating with a sigmoid function to obtain the probability distribution of relation types:

P(r_{i,j}) = σ(W_{i,j} · r_{i,j} + b_{i,j}),  P(r_{j,i}) = σ(W_{j,i} · r_{j,i} + b_{j,i})
wherein W_{i,j} and W_{j,i} denote weights, b_{i,j} and b_{j,i} denote biases, and σ(·) denotes the sigmoid function; the relation type between the entity pair is judged and classified through the obtained probability distribution of relation types.
CN202011021524.0A 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement Active CN112214610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011021524.0A CN112214610B (en) 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement


Publications (2)

Publication Number Publication Date
CN112214610A CN112214610A (en) 2021-01-12
CN112214610B true CN112214610B (en) 2023-09-08

Family

ID=74052289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011021524.0A Active CN112214610B (en) 2020-09-25 2020-09-25 Entity relationship joint extraction method based on span and knowledge enhancement

Country Status (1)

Country Link
CN (1) CN112214610B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094513B (en) * 2021-04-08 2023-08-15 北京工商大学 Span representation-based end-to-end menu information extraction method and system
CN113051356B (en) * 2021-04-21 2023-05-30 深圳壹账通智能科技有限公司 Open relation extraction method and device, electronic equipment and storage medium
CN112989835B (en) * 2021-04-21 2021-10-08 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Extraction method of complex medical entities
CN113204615B (en) * 2021-04-29 2023-11-24 北京百度网讯科技有限公司 Entity extraction method, device, equipment and storage medium
CN113240443B (en) * 2021-05-28 2024-02-06 国网江苏省电力有限公司营销服务中心 Entity attribute pair extraction method and system for power customer service question and answer
CN113411549B (en) * 2021-06-11 2022-09-06 上海兴容信息技术有限公司 Method for judging whether business of target store is normal or not
CN113536795B (en) * 2021-07-05 2022-02-15 杭州远传新业科技有限公司 Method, system, electronic device and storage medium for entity relation extraction
CN113627185A (en) * 2021-07-29 2021-11-09 重庆邮电大学 Entity identification method for liver cancer pathological text naming
CN113779260B (en) * 2021-08-12 2023-07-18 华东师范大学 Pre-training model-based domain map entity and relationship joint extraction method and system
CN113791791B (en) * 2021-09-01 2023-07-25 中国船舶重工集团公司第七一六研究所 Business logic code-free development method based on natural language understanding and conversion
CN114611497B (en) * 2022-05-10 2022-08-16 北京世纪好未来教育科技有限公司 Training method of language diagnosis model, language diagnosis method, device and equipment
CN114881038B (en) * 2022-07-12 2022-11-11 之江实验室 Chinese entity and relation extraction method and device based on span and attention mechanism
CN115599902B (en) * 2022-12-15 2023-03-31 西南石油大学 Oil-gas encyclopedia question-answering method and system based on knowledge graph
CN117131198B (en) * 2023-10-27 2024-01-16 中南大学 Knowledge enhancement entity relationship joint extraction method and device for medical teaching library

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019839A (en) * 2018-01-03 2019-07-16 中国科学院计算技术研究所 Medical knowledge map construction method and system based on neural network and remote supervisory
CN110597998A (en) * 2019-07-19 2019-12-20 中国人民解放军国防科技大学 Military scenario entity relationship extraction method and device combined with syntactic analysis
CN111339774A (en) * 2020-02-07 2020-06-26 腾讯科技(深圳)有限公司 Text entity relation extraction method and model training method
US10706045B1 (en) * 2019-02-11 2020-07-07 Innovaccer Inc. Natural language querying of a data lake using contextualized knowledge bases




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant