CN113254675B - Knowledge graph construction method based on self-adaptive few-sample relation extraction - Google Patents

Knowledge graph construction method based on self-adaptive few-sample relation extraction

Info

Publication number
CN113254675B
CN113254675B (application CN202110808184.4A)
Authority
CN
China
Prior art keywords
relation
adaptive
relationship
entities
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110808184.4A
Other languages
Chinese (zh)
Other versions
CN113254675A (en)
Inventor
孙喜民
周晶
毕立伟
李晓明
王帅
孙博
郑斌
刘丹
常江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid E Commerce Co Ltd
State Grid E Commerce Technology Co Ltd
Original Assignee
State Grid E Commerce Co Ltd
State Grid E Commerce Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid E Commerce Co Ltd, State Grid E Commerce Technology Co Ltd filed Critical State Grid E Commerce Co Ltd
Priority to CN202110808184.4A priority Critical patent/CN113254675B/en
Publication of CN113254675A publication Critical patent/CN113254675A/en
Application granted granted Critical
Publication of CN113254675B publication Critical patent/CN113254675B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/35 Clustering; Classification
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06F40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/08 Learning methods
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a knowledge graph construction method based on self-adaptive few-sample relation extraction, in which the relations among entities are extracted with an adaptive relation extraction model. The model is constructed as follows: S100: encode the training-set instances with a text encoder to generate context semantics; S200: input the support set into a parameter generator to generate the initialization softmax parameters; S300: input the context semantics generated in step S100 into an adaptive graph neural network and update the instances with it; S400: perform classification prediction on the updated instances with a softmax classifier and obtain the relation type. The method needs no large amount of manually labeled data when acquiring relations, avoids the time and money consumed by large-scale manual labeling, and can complete a domain-specific relation extraction task with only a small amount of labeled data from that domain.

Description

Knowledge graph construction method based on self-adaptive few-sample relation extraction
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a knowledge graph construction method based on self-adaptive few-sample relation extraction.
Background
A knowledge graph is a set of graphs that depict the development process and structure of knowledge; it describes knowledge resources and their carriers with visualization techniques, and mines, analyzes, constructs, draws, and displays knowledge and the interrelations among knowledge resources. In the prior art, a general-domain knowledge graph is built from original unstructured text, mainly through the following steps: (1) entity extraction, namely automatically identifying entities from the unstructured text; (2) relation extraction, namely identifying the relations between the entities; (3) entity linking, namely performing logical attribution and redundancy elimination on the extracted entity and relation data; (4) knowledge reasoning, namely automatically inferring missing relation values from fact triples and completing the knowledge graph.
Steps (1) and (2) both involve information extraction, an important component of natural language processing; particularly in today's information society, extracting useful information from massive data is significant. Information extraction can be divided into entity extraction, relation extraction, event extraction, and so on. A relation extraction task is generally posed as: given a text and two entities mentioned in it, determine whether a relation exists between the entities and, if so, which one. Relation extraction is not only an important link in knowledge graph construction but is also widely used in technologies such as automatic question answering, automatic summarization, and sentiment analysis.
Traditional supervised learning performs well on relation extraction tasks, but in practical applications a supervised relation extraction method requires sufficient, fully labeled training data; labeling that data consumes large amounts of manpower and material resources, and the resulting models are difficult to migrate to other domains. It is therefore necessary to research how to improve relation extraction performance with little or even no annotated data.
To address the data requirement of supervised learning, one solution is distant supervision. Its basic idea is to rely on an existing knowledge base and collect, as training corpus, texts that contain entity pairs from that knowledge base; Mintz proposed the assumption that if an entity pair in the knowledge base has a certain relation, then all data containing that entity pair express that relation. However, distant supervision has the drawback that the generated data contain a large amount of noise, and it cannot essentially solve the long-tail problem of sample distribution. Another line of work asks how to make full use of a small number of labeled samples for training so that the model has better generalization ability, i.e., few-sample learning.
At present there are two main approaches to few-sample relation extraction: metric learning and meta-learning. Metric learning learns a metric function from prior knowledge; with this function, inputs are mapped into a subspace in which similar and dissimilar data pairs can be easily separated, which is typically used for classification problems. Meta-learning mainly optimizes the strategy for finding optimal parameters in the hypothesis space, for example finding a suitable initial model parameter, or learning an optimizer that directly outputs parameter updates.
Graph neural networks are an emerging field of recent years: they extend traditional neural networks to non-Euclidean spaces, perform graph operations on graph structures, and offer a degree of interpretability. A graph neural network uses the structural information between categories as a channel for information propagation and can extract the relations between samples well. It mimics the association and discrimination mechanisms of the human brain in cognition and acquires additional auxiliary information about a new task, thereby alleviating the problem of insufficient sample data. A graph neural network captures the differences between categories well, which facilitates category classification.
Disclosure of Invention
The invention introduces graph neural networks into few-sample relation extraction and provides a knowledge graph construction method based on self-adaptive few-sample relation extraction. The method avoids the time and money consumed by large-scale manual labeling, can quickly complete a domain-specific relation extraction task with a small amount of labeled data from that domain, and generalizes well to unseen domains.
Considering that a model forgets the old task when migrating from an old task to a new one, and that training on a new task normally requires a large number of labeled samples, the invention applies the graph neural network to the multi-task problem: exploiting the fact that information in a graph neural network can propagate and aggregate among nodes, it achieves fast and accurate classification from only a small number of sample instances, without a large number of labeled training samples.
The method for constructing a knowledge graph based on self-adaptive few-sample relation extraction provided by the embodiment of the invention comprises the following steps (a minimal pipeline sketch follows the list):
automatically extracting entities from the acquired unstructured text;
extracting the relation between the entities by taking the original unstructured text and the identified entities as the input of a relation model;
performing entity linking based on the extracted entities and relationship data;
and automatically deducing the missing of the relation value according to the fact triple, and completing the knowledge graph.
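As a rough illustration of how these four steps chain together, the sketch below wires trivial stand-ins into one pipeline; every function body here is a hypothetical placeholder rather than the patent's actual models, and steps three and four are sketched separately further down.

```python
from itertools import permutations

def extract_entities(text: str) -> list:
    """Step 1 stand-in: treat capitalized tokens as entities."""
    return [w.strip(".,") for w in text.split() if w.istitle()]

def extract_relation(text: str, head: str, tail: str):
    """Step 2 stand-in for the adaptive few-sample relation extraction model."""
    return "related_to" if head in text and tail in text else None

def build_knowledge_graph(documents: list) -> set:
    triples = set()
    for text in documents:
        for head, tail in permutations(extract_entities(text), 2):
            rel = extract_relation(text, head, tail)
            if rel:
                triples.add((head, rel, tail))
    # steps 3 (entity linking) and 4 (TransE completion) would run here
    return triples

print(build_knowledge_graph(["Alice founded Acme in Paris."]))
```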
The relation model is constructed as follows:
Given a training set containing $M$ classes with $N$ instances under each class, where each instance comprises a sentence together with the head entity and tail entity of that sentence: randomly extract $M_1$ classes from the training set and randomly extract $K$ instances from each of them to construct a support set $\mathcal{S}$; from the remaining $N-K$ samples of each class, randomly sample $L$ instances to construct a query set $\mathcal{Q}$;
s100: encoding the training set instance by using a text encoder to generate context semantics;
s200: inputting the support set into a parameter generator to generate an initialization softmax parameter;
s300: inputting the context semantics generated in the step S100 into an adaptive graph neural network, and updating the instance by using the adaptive graph neural network; the adaptive graph neural network is constructed as follows:
s310: constructing a point diagram, wherein nodes represent a feature vector of an example, and edges describe the similarity relation between the examples;
s320: constructing a distribution graph, wherein nodes represent the distribution of an example, and edges describe the similarity relation between the distribution and the distribution; the distribution refers to a vector formed by similarity relation between one example and all other examples;
s330: taking context relation semantics of the support set and the query set as feature vectors, initializing nodes of the point diagram, and initializing corresponding edges of the point diagram by using similarity among the nodes;
s340: initializing nodes of the distribution diagram by using the similar relation vectors of each instance in the support set and the query set, and initializing corresponding edges of the distribution diagram by using the similar relation between the nodes;
The similarity relation vector $v^{d,0}_{i} = \Vert_{j=1}^{T}\,\delta(y_i, y_j)$ is the $i$-th node of the distribution graph, where $\Vert$ denotes the concatenation operation, $y_i$ and $y_j$ denote the relation category labels of instance $i$ and instance $j$ respectively, and $\delta(y_i, y_j) = 1$ if $y_i = y_j$, otherwise $\delta(y_i, y_j) = 0$;
S350: aggregating the similarity relations between the nodes in the point graph together with each node of the previous layer's distribution graph to obtain the updated distribution-graph nodes, and updating the edges of the distribution graph;
S360: aggregating the similarity relations between each node in the updated distribution graph together with the corresponding node of the previous layer's point graph to obtain the updated point-graph nodes, and updating the edges of the point graph;
s400: and carrying out classification prediction on the updated examples by using a softmax classifier, and acquiring the relationship type.
Further, in step S100, the sentences in the instances and the positions of the head and tail entities are encoded.
Further, encoding the sentences and the positions of the head and tail entities in the instances further comprises:
s110: mapping each word in the example sentence into a word vector;
s120: based on the word vectors, coding each word and the relative position of two entities of the sentence where the word is located respectively, and connecting the obtained coding vectors to obtain the position codes of the words;
s130: and inputting the examples and the position codes of the words in the examples into a text encoder to generate the context semantics of each example.
Further, step S200 further includes:
s210: dividing the support set instances according to the relation category;
s220: generating the weight and the bias corresponding to each relationship category by using the example under each relationship category;
s230: the weights and biases corresponding to all relation categories form a weight vector and a bias vector, i.e., the initialized softmax parameters.
Further, in sub-step S330, the similarity relation between point-graph nodes is $e^{p,0}_{ij} = f_{e^p}\big(v^{p,0}_{i}, v^{p,0}_{j}\big)$, where $v^{p,0}_{i}$ and $v^{p,0}_{j}$ denote the initialized node $i$ and node $j$, and $f_{e^p}$ denotes a two-layer convolution-regularization-ReLU network followed by a sigmoid activation layer;
in sub-step S340, the edges are described by the similarity relation between distribution-graph nodes, $e^{d,0}_{ij} = f_{e^d}\big(v^{d,0}_{i}, v^{d,0}_{j}\big)$, where $f_{e^d}$ likewise comprises a two-layer convolution-regularization-ReLU network and a sigmoid activation layer; $f_{e^p}$ and $f_{e^d}$ are both existing neural networks.
The invention has the following characteristics and beneficial effects:
the invention not only improves the accuracy of relation extraction under specific tasks, but also improves the generalization performance of tasks which do not appear. A large amount of manual marking data is not needed when the relation is obtained, time and money consumption caused by a large amount of manual marking is avoided, and the relation extraction task of the specific field can be completed through a small amount of label data of the specific field.
The invention not only shows and considers the relationship between the examples, but also pays attention to the relationship between the example distribution and the example distribution, thereby better depicting the boundaries of different relationships and improving the discriminability of relationship representation under specific tasks. Meanwhile, because the input space of the natural language is shared among all NLP tasks, the adaptive method based on the meta-learning may generalize unseen tasks, i.e., relationship classes that do not appear in the training set may also be extracted.
Drawings
FIG. 1 is a detailed flow chart of relationship extraction in the embodiment.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the specific embodiments described are merely some, not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art from the described embodiments without inventive effort fall within the scope of protection of the invention.
An application scenario of the knowledge graph construction method involves a vertical-domain knowledge graph construction apparatus and a server: the apparatus obtains several types of unstructured text from the server and processes the unstructured text with the vertical-domain knowledge graph construction method, thereby constructing a vertical-domain knowledge graph.
The execution subject of the method for constructing the knowledge graph can be a device for constructing the knowledge graph, and the device for constructing the knowledge graph can be realized by any software and/or hardware.
The embodiment of the invention discloses a knowledge graph construction method based on self-adaptive few-sample relation extraction, which comprises the following specific steps of:
and step one, extracting entities, namely automatically identifying the entities from the unstructured text.
In this step, named entities are automatically recognized from the original unstructured text. This embodiment adopts the LSTM-CRF technique: each word of the unstructured text is represented as a word embedding, the word embeddings are fed into the LSTM model, and a prediction score is output for each word. The scores predicted by the LSTM layer are then fed into the CRF layer, which selects the tag sequence with the highest prediction score as the best answer.
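The CRF decoding step just described can be sketched as follows; the LSTM is assumed to have already produced per-word emission scores, and the tag set, random scores, and transition matrix below are illustrative placeholders rather than the embodiment's trained values.

```python
import numpy as np

TAGS = ["O", "B-ENT", "I-ENT"]          # illustrative tag set

def viterbi_decode(emissions: np.ndarray, transitions: np.ndarray) -> list:
    """Pick the tag sequence with the highest emission + transition score.

    emissions: (seq_len, n_tags) scores from the LSTM layer.
    transitions: (n_tags, n_tags) CRF transition scores.
    """
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()                       # best score per end tag
    backptr = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):               # follow back-pointers
        best.append(int(backptr[t][best[-1]]))
    return [TAGS[i] for i in reversed(best)]

rng = np.random.default_rng(0)
print(viterbi_decode(rng.normal(size=(6, 3)), rng.normal(size=(3, 3))))
```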
And step two, extracting the relationship, namely identifying the relationship among the entities.
The step is the key innovation of the knowledge graph construction method. Inputting the original unstructured text and the two entities identified in the step one into a trained adaptive relationship extraction model based on the distribution level relationship, and selecting the relationship category with the highest score from the classification scores output by the model as the relationship between the two entities. The detailed procedure of this step will be provided later.
And step three, entity linking, namely performing logic attribution and redundancy elimination on the extracted entities and relationship data.
After the entities and the relations between them have been acquired from the original unstructured text, entity linking performs logical attribution and redundant-error filtering on the entity and entity-relation data. The entity representations updated by the adaptive relation extraction model are used to calculate the similarity between any two entities.
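A minimal sketch of this linking step, assuming entity vectors have already been produced by the model; the cosine measure and the 0.9 threshold are illustrative assumptions (the claims only specify a similarity greater than a set threshold).

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def link_entities(names: list, vectors: list, threshold: float = 0.9) -> dict:
    """Map each entity name to the canonical name of its merged group."""
    reps, canonical = [], {}                 # indices of kept representatives
    for i, name in enumerate(names):
        for j in reps:
            if cosine(vectors[i], vectors[j]) > threshold:
                canonical[name] = names[j]   # merge into the earlier entity
                break
        else:
            reps.append(i)
            canonical[name] = name
    return canonical

vecs = [np.array([1.0, 0.1]), np.array([0.99, 0.12]), np.array([0.0, 1.0])]
print(link_entities(["Acme Co", "Acme Corp", "Paris"], vecs))
```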
And fourthly, knowledge reasoning, namely automatically inferring missing relation values from fact triples and completing the knowledge graph.
In this step, missing facts are automatically inferred from the existing fact triples, missing relation values in the knowledge graph are handled, further knowledge discovery is completed, and the knowledge graph is completed. This embodiment adopts the translation-based inference model TransE, which regards the relation in each triple instance (head, relation, tail) as a translation from the head entity to the tail entity satisfying h + r = t, where h denotes the head-entity vector, r the relation vector, and t the tail-entity vector. If a head-tail entity pair in the knowledge graph does not appear in any existing triple, the relation vector is calculated as t - h, and the relation between the head and tail entities is thereby obtained to supplement the knowledge graph.
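A sketch of the completion rule just described: the missing relation is estimated as r ≈ t − h and matched to the nearest known relation vector. The toy embeddings below are placeholders; in practice h, r, and t come from training TransE on the existing triples.

```python
import numpy as np

def infer_relation(h: np.ndarray, t: np.ndarray, relation_vectors: dict) -> str:
    """Return the known relation whose vector is closest to t - h."""
    r_est = t - h
    return min(relation_vectors,
               key=lambda name: np.linalg.norm(relation_vectors[name] - r_est))

relations = {"founded_by": np.array([1.0, 0.0]),
             "located_in": np.array([0.0, 1.0])}
h, t = np.array([0.2, 0.1]), np.array([0.25, 1.1])    # unlinked entity pair
print(infer_relation(h, t, relations))                # -> located_in
```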
Fig. 1 shows a detailed flow of relationship extraction in the embodiment, which includes the following specific processes:
the method comprises the steps of receiving original unstructured text, namely a relational data set, wherein the relational data set adopts a data set FewRel1.0, and the relational data set is formed by combining data according to relational categories. Extraction from relational data sets by relational categoryMThe individual relationship class data form a training set
Figure 594715DEST_PATH_IMAGE020
The remaining relationship category data constitutes a test set
Figure 403534DEST_PATH_IMAGE021
. Training set
Figure 980008DEST_PATH_IMAGE022
IncludedMA category, each category havingNAn instance, each instance
Figure 47322DEST_PATH_IMAGE023
Figure 861694DEST_PATH_IMAGE024
Is shown asiIn one example of the above-described method,
Figure 605528DEST_PATH_IMAGE025
the representation of a sentence is represented by,
Figure 403719DEST_PATH_IMAGE026
representing sentences
Figure 743565DEST_PATH_IMAGE027
The head entity of (a) is,
Figure 943602DEST_PATH_IMAGE028
representing sentences
Figure 359802DEST_PATH_IMAGE029
The tail entity of (1). To simulate the test-time scenario during the training period, from the training set
Figure 379711DEST_PATH_IMAGE030
In the random extraction
Figure 788827DEST_PATH_IMAGE031
Each class is randomly extracted from each classN1 instance constructs a support set, the first in the support setsEach element is marked as
Figure 843370DEST_PATH_IMAGE032
Figure 663428DEST_PATH_IMAGE033
As an example
Figure 796731DEST_PATH_IMAGE034
Corresponding relationship category labels. Remaining from each categoryN-NRandom sampling of 1 sampleN2 instances construct a query set
Figure 868592DEST_PATH_IMAGE035
Querying the first in the setqEach element is marked as
Figure 653009DEST_PATH_IMAGE036
Figure 988175DEST_PATH_IMAGE037
Is composed of
Figure 982676DEST_PATH_IMAGE038
Corresponding relation category markAnd (6) a label.
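The episode construction just described reduces to the usual N-way K-shot sampling routine; the sketch below follows the M1/N1/N2 notation above, with `dataset` assumed to map each relation category to its instance list.

```python
import random

def sample_episode(dataset: dict, m1: int, n1: int, n2: int):
    """Build one (support, query) episode from a class-keyed dataset."""
    support, query = [], []
    for label in random.sample(sorted(dataset), m1):      # pick M1 classes
        drawn = random.sample(dataset[label], n1 + n2)    # no overlap
        support += [(x, label) for x in drawn[:n1]]       # N1 shots per class
        query += [(x, label) for x in drawn[n1:]]         # N2 queries per class
    return support, query

toy = {r: [f"{r}_sentence_{i}" for i in range(10)] for r in ["r1", "r2", "r3"]}
support_set, query_set = sample_episode(toy, m1=2, n1=3, n2=2)
print(len(support_set), len(query_set))                   # 6 4
```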
Firstly, a text encoder is used to encode the instances in the training set and generate context semantics.
The encoding in this step covers the sentences in the instances and the entity positions within the sentences, and combines the sentence encoding and the position encoding non-linearly. The specific method is as follows:
In this embodiment, for each instance $x_i = (s_i, h_i, t_i)$, where $x_i$ denotes the $i$-th instance, word2vec maps each word $w_k$ of the instance sentence $s_i$ to a word vector $\mathbf{w}_k \in \mathbb{R}^{d_w}$, where $d_w$ is the word-vector dimension, $w_k$ denotes the $k$-th word of sentence $s_i$, and $k$ takes $1, 2, \ldots, K$ in turn, $K$ being the number of words in sentence $s_i$. Each word $w_k$ of $s_i$ is encoded by its relative positions to the two entities of the sentence (the head entity and the tail entity), and the two resulting vectors are connected to obtain the position encoding $\mathbf{p}_k \in \mathbb{R}^{2 d_p}$, where $d_p$ is the dimension of a relative-position vector, so the connection of the two relative-position vectors has dimension $2 d_p$. Here, the relative position between $w_k$ and a sentence entity refers to the number of words separating $w_k$ from that entity in sentence $s_i$.
The contextual semantic representation generated by taking instance $x_i$ as the input of the text encoder is denoted $\mathbf{c}_i$. In this embodiment, a Transformer model is used as the text encoder.
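The relative-position encoding can be sketched as below; the embedding table would be learned jointly with the encoder, and the maximum offset, dimensions, and random initialization here are illustrative assumptions.

```python
import numpy as np

MAX_OFFSET, D_P = 40, 5                      # illustrative sizes
rng = np.random.default_rng(0)
pos_table = rng.normal(size=(2 * MAX_OFFSET + 1, D_P))   # learned in practice

def position_encoding(sent_len: int, head_idx: int, tail_idx: int) -> np.ndarray:
    """Return a (sent_len, 2 * D_P) matrix: per word, the concatenation of its
    relative-position embeddings w.r.t. the head and tail entities."""
    rows = []
    for k in range(sent_len):
        offsets = (np.clip(k - head_idx, -MAX_OFFSET, MAX_OFFSET),
                   np.clip(k - tail_idx, -MAX_OFFSET, MAX_OFFSET))
        rows.append(np.concatenate([pos_table[o + MAX_OFFSET] for o in offsets]))
    return np.stack(rows)   # concatenated with the word vectors before encoding

print(position_encoding(sent_len=7, head_idx=1, tail_idx=5).shape)  # (7, 10)
```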
Secondly, the support set $\mathcal{S}$ is input into the parameter generator, which generates the initialization softmax parameters of the classifier under the current task.
This step further comprises the sub-steps:
(1) The support set is divided into its $M_1$ categories, and the instance set of each category is denoted $\mathcal{S}_n$, where $n$ represents the category label, i.e., $\mathcal{S}_n$ is the set of instances of the $n$-th class.
(2) For the instances under each category, a non-linear mapping and weighted summation yields the representation $\mathbf{c}_n$ of each category: the text-encoder output for each instance $x \in \mathcal{S}_n$ is passed through a network $g$, and the outputs of all instances of the $n$-th class are weighted, summed, and averaged using a weight vector and a bias vector. The network $g$ is specifically a two-layer multilayer perceptron with a tanh activation layer, and from $\mathbf{c}_n$ the weight $\mathbf{w}_n$ and bias $b_n$ of the linear layer in softmax are obtained. The weight vectors and bias values of the $M_1$ classes are recorded respectively as $\mathbf{W} = [\mathbf{w}_1, \ldots, \mathbf{w}_{M_1}]$ and $\mathbf{b} = [b_1, \ldots, b_{M_1}]$.
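A compact sketch of this parameter generator under stated assumptions: `encode` stands in for the text encoder, `mlp_tanh` for the two-layer MLP with tanh, and the convention that the first `dim` entries of the class representation become the softmax weight while the last entry becomes its bias is an illustrative choice, not fixed by the patent.

```python
import numpy as np

def generate_softmax_params(support_by_class: dict, encode, mlp_tanh, dim: int):
    """Return (W, b): per-class softmax weights (M1, dim) and biases (M1,)."""
    weights, biases = [], []
    for label in sorted(support_by_class):
        outs = np.stack([mlp_tanh(encode(x)) for x in support_by_class[label]])
        c_n = outs.mean(axis=0)           # averaged class representation
        weights.append(c_n[:dim])         # weight of the softmax linear layer
        biases.append(c_n[dim])           # bias of the softmax linear layer
    return np.stack(weights), np.array(biases)

dim = 8
rng = np.random.default_rng(0)
proj = rng.normal(size=(dim + 1, dim))
encode = lambda x: rng.normal(size=dim)        # placeholder text encoder
mlp_tanh = lambda h: np.tanh(proj @ h)         # placeholder two-layer MLP
W, b = generate_softmax_params({"r1": ["s1", "s2"], "r2": ["s3"]},
                               encode, mlp_tanh, dim)
print(W.shape, b.shape)                        # (2, 8) (2,)
```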
Thirdly, the self-adaptive graph neural network based on distribution-level relations takes the output of the first step as input and is fine-tuned to obtain the optimal parameters under the specific task; with these parameters, the distribution-level-relation graph model classifies the current task well.
The self-adaptive graph neural network based on the distribution level relation is constructed as follows:
(1) Construct the point graph $G^{p}_{l} = (V^{p}_{l}, E^{p}_{l})$, where $G^{p}_{l}$ denotes the $l$-th generation point graph, $V^{p}_{l} = \{v^{p,l}_{i}\}$ denotes the node set, each node representing the feature vector of instance $i$, and $E^{p}_{l} = \{e^{p,l}_{ij}\}$ denotes the edge set, each edge describing the similarity relation between instance $i$ and instance $j$.
(2) Construct the distribution graph $G^{d}_{l} = (V^{d}_{l}, E^{d}_{l})$, where $G^{d}_{l}$ denotes the $l$-th generation distribution graph and $V^{d}_{l} = \{v^{d,l}_{i}\}$ denotes the node set. Each node $v^{d,l}_{i}$ represents the distribution of instance $i$: a multi-dimensional vector whose $j$-th dimension is the similarity relation $e^{p,l}_{ij}$ between node $i$ and node $j$ in the point graph; computing the similarity of node $i$ to all nodes of the point graph yields the distribution of instance $i$. $E^{d}_{l} = \{e^{d,l}_{ij}\}$ denotes the edge set, each edge describing the similarity relation between the distributions of instance $i$ and instance $j$.
(3) Initializing a dot diagram:
for the initialization of the point diagram, extracting the context semantics corresponding to the instances in the support set and the query set, and initializing the nodes of the first generation point diagram by using the context semantics
Figure 752550DEST_PATH_IMAGE077
Then, the similar relation between the nodes is used for describing the edges
Figure 754004DEST_PATH_IMAGE078
Figure 484063DEST_PATH_IMAGE079
Is a two-layer convolution-regularization-RELU network and sigmoid active layer.
(4) Initializing a distribution diagram:
the purpose of the distribution map is to integrate the relationships between nodes to obtain relationships between distributions, so that each node of the distribution map is a feature vector of similar relationships of dimension M1 x N1jExample of the line description iAnd examplesjThe similarity relationship between them.
The nodes of the first generation profile are initialized as follows:
Figure 808734DEST_PATH_IMAGE080
(1)
in the formula (1), i represents that cascade operation,
Figure 333256DEST_PATH_IMAGE006
and
Figure 494110DEST_PATH_IMAGE007
respectively show examplesiAnd examplesjA relationship category label of, if
Figure 708502DEST_PATH_IMAGE008
Then, then
Figure 669505DEST_PATH_IMAGE009
Otherwise
Figure 317655DEST_PATH_IMAGE010
Describing edges using similarity relationships between nodes of a profile
Figure 123937DEST_PATH_IMAGE082
Figure 423200DEST_PATH_IMAGE017
Is a two-layer convolution-regularization-RELU network and sigmoid active layer.
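Putting sub-steps (3) and (4) together, the two graphs can be initialized as sketched below; `edge_net_p` and `edge_net_d` stand in for the two convolution-regularization-ReLU + sigmoid similarity networks, replaced here by a simple Gaussian kernel for runnability.

```python
import numpy as np

def init_graphs(features: np.ndarray, labels: list, edge_net_p, edge_net_d):
    """features: (T, d) encoder outputs; labels: length-T relation labels."""
    T = len(labels)
    v_p = features.copy()                                    # point-graph nodes
    v_d = np.array([[1.0 if labels[i] == labels[j] else 0.0  # formula (1)
                     for j in range(T)] for i in range(T)])  # distribution nodes
    e_p = np.array([[edge_net_p(v_p[i], v_p[j]) for j in range(T)]
                    for i in range(T)])                      # point-graph edges
    e_d = np.array([[edge_net_d(v_d[i], v_d[j]) for j in range(T)]
                    for i in range(T)])                      # distribution edges
    return v_p, e_p, v_d, e_d

rng = np.random.default_rng(1)
sim = lambda u, v: float(np.exp(-np.sum((u - v) ** 2)))      # kernel stand-in
v_p, e_p, v_d, e_d = init_graphs(rng.normal(size=(6, 4)),
                                 [0, 0, 1, 1, 2, 2], sim, sim)
print(e_p.shape, v_d.shape)                                  # (6, 6) (6, 6)
```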
(5) And aggregation and updating of the dot diagram to the distribution diagram.
For the profile of the l-th layer, the nodes are calculated as follows:
Figure 555104DEST_PATH_IMAGE083
(2)
which aggregates the relationship between each node in the point diagram
Figure 424971DEST_PATH_IMAGE075
And information of the node in the distribution map of the previous layer
Figure 300523DEST_PATH_IMAGE084
Figure 690178DEST_PATH_IMAGE085
The point diagram to distribution diagram propagation process is represented, and the point diagram to distribution diagram is a one-layer multi-layer perceptron network.
The edges in the profile are updated in a similar manner to the point map,
Figure 992984DEST_PATH_IMAGE086
(6) Aggregation and update from the updated distribution graph to the point graph.
For the $l$-th layer, the node information of the next-generation point graph is deduced from the distribution graph; the calculation process is as follows:
$$v^{p,l}_{i} = \mathrm{D2P}\Big(\big[\textstyle\sum_{j=1}^{T} e^{d,l}_{ij}\, v^{p,l-1}_{j},\; v^{p,l-1}_{i}\big]\Big) \qquad (3)$$
which aggregates the relations $e^{d,l}_{ij}$ between each pair of nodes in the distribution graph and the information $v^{p,l-1}_{i}$ of the node in the previous layer's point graph; $\mathrm{D2P}$ denotes the distribution-graph-to-point-graph propagation process and is a single fully connected layer with a ReLU activation layer. $T$ denotes the total number of instances in the support set and the query set.
At the $l$-th layer, given the node representations of any two nodes of the layer-$(l-1)$ (i.e., previous-layer) point graph and the edge information $e^{p,l-1}_{ij}$, the edges are updated as follows:
$$e^{p,l}_{ij} = \frac{f_{e^p}\big(v^{p,l}_{i}, v^{p,l}_{j}\big)\, e^{p,l-1}_{ij}}{\sum_{j} f_{e^p}\big(v^{p,l}_{i}, v^{p,l}_{j}\big)\, e^{p,l-1}_{ij}}$$
Note that normalization processing is performed here.
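One generation of this message passing, covering formulas (2) and (3) and the normalized edge update, can be sketched as follows; `p2d`, `d2p`, and the similarity stand-ins below are illustrative placeholders for the small networks named in the text, and the exact concatenation layout is an assumption.

```python
import numpy as np

def propagate(v_p, e_p, v_d, p2d, d2p, edge_net_p, edge_net_d):
    T = v_p.shape[0]
    # (5) point graph -> distribution graph, formula (2)
    v_d_new = np.stack([p2d(np.concatenate([e_p[i], v_d[i]])) for i in range(T)])
    e_d_new = np.array([[edge_net_d(v_d_new[i], v_d_new[j])
                         for j in range(T)] for i in range(T)])
    # (6) distribution graph -> point graph, formula (3)
    agg = e_d_new @ v_p                              # edge-weighted neighbors
    v_p_new = np.stack([d2p(np.concatenate([agg[i], v_p[i]])) for i in range(T)])
    # edge update with row normalization, as noted above
    raw = np.array([[edge_net_p(v_p_new[i], v_p_new[j]) * e_p[i, j]
                     for j in range(T)] for i in range(T)])
    e_p_new = raw / raw.sum(axis=1, keepdims=True)
    return v_p_new, e_p_new, v_d_new, e_d_new

T, d = 6, 4
rng = np.random.default_rng(2)
v_p, v_d = rng.normal(size=(T, d)), rng.normal(size=(T, T))
e_p = np.abs(rng.normal(size=(T, T)))
sim = lambda u, v: float(1.0 / (1.0 + np.sum((u - v) ** 2)))
M_pd, M_dp = rng.normal(size=(T, 2 * T)), rng.normal(size=(d, 2 * d))
p2d = lambda x: np.tanh(M_pd @ x)                   # placeholder MLP
d2p = lambda x: np.maximum(M_dp @ x, 0.0)           # placeholder FC + ReLU
out = propagate(v_p, e_p, v_d, p2d, d2p, sim, sim)
print([a.shape for a in out])        # [(6, 4), (6, 6), (6, 6), (6, 6)]
```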
And fourthly, classification prediction is performed with the classifier parameters for the current classification task obtained in the second step, applied to the updated relation representation of each instance obtained in the third step; the prediction result is the extracted relation type.
For a test sample $x_q$, $\hat{y}_q = \operatorname{softmax}\big(\mathbf{W}\cdot \mathrm{GNN}(x_q) + \mathbf{b}\big)$, where $\mathrm{GNN}$ denotes the distribution-level-relation graph neural network of the third step and $(\mathbf{W}, \mathbf{b})$ are the classifier parameters under the current task.
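The final prediction then reduces to a softmax over the generated parameters applied to the query instance's updated representation, as in the brief sketch below (shapes follow the parameter-generator sketch above; the concrete numbers are placeholders).

```python
import numpy as np

def predict_relation(v_q: np.ndarray, W: np.ndarray, b: np.ndarray,
                     classes: list) -> str:
    logits = W @ v_q + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over the M1 classes
    return classes[int(probs.argmax())]

rng = np.random.default_rng(3)
W, b = rng.normal(size=(3, 8)), rng.normal(size=3)
print(predict_relation(rng.normal(size=8), W, b, ["r1", "r2", "r3"]))
```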
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (4)

1. A method for constructing a knowledge graph based on self-adaptive few-sample relation extraction, comprising: automatically extracting entities and the relations between the entities from acquired unstructured text, performing entity linking based on the extracted entity and relation data, and completing the knowledge graph; the method is characterized in that:
the relationship between the entities is extracted by adopting a self-adaptive relationship extraction model, and the self-adaptive relationship extraction model is constructed as follows:
given a training set containing $M$ classes with $N$ instances under each class, where each instance comprises a sentence together with the head entity and tail entity of that sentence: randomly extract $M_1$ classes from the training set and randomly extract $K$ instances from each of them to construct a support set $\mathcal{S}$; from the remaining $N-K$ samples of each class, randomly sample $L$ instances to construct a query set $\mathcal{Q}$;
s100: encoding the training set instance by using a text encoder to generate context semantics;
in step S100, the positions of sentences, head entities and tail entities in the example are coded;
the encoding of the positions of the sentences and the head and tail entities in the examples further comprises:
s110: mapping each word in the example sentence into a word vector;
s120: based on the word vectors, coding each word and the relative position of two entities of the sentence where the word is located respectively, and connecting the obtained coding vectors to obtain the position codes of the words;
s130: inputting the examples and the position codes of the words in the examples into a text encoder to generate context semantics of each example;
s200: inputting the support set into a parameter generator to generate an initialization softmax parameter;
s300: inputting the context semantics generated in the step S100 into an adaptive graph neural network, and updating the instance by using the adaptive graph neural network; the adaptive graph neural network is constructed as follows:
s310: constructing a point diagram, wherein nodes represent a feature vector of an example, and edges describe the similarity relation between the examples;
s320: constructing a distribution graph, wherein nodes represent the distribution of an example, and edges describe the similarity relation between the distribution and the distribution; the distribution refers to a vector formed by similarity relation between one example and all other examples;
s330: taking context relation semantics of the support set and the query set as feature vectors, initializing nodes of the point diagram, and initializing corresponding edges of the point diagram by using similarity among the nodes;
s340: initializing nodes of the distribution diagram by using the similar relation vectors of each instance in the support set and the query set, and initializing corresponding edges of the distribution diagram by using the similar relation between the nodes;
the similarity relation vector $v^{d,0}_{i} = \Vert_{j=1}^{T}\,\delta(y_i, y_j)$ is the $i$-th node of the distribution graph, where $\Vert$ denotes the concatenation operation, $y_i$ and $y_j$ denote the relation category labels of instance $i$ and instance $j$ respectively, and $\delta(y_i, y_j) = 1$ if $y_i = y_j$, otherwise $\delta(y_i, y_j) = 0$;
S350: aggregating the similarity relations between the nodes in the point graph together with each node of the previous layer's distribution graph to obtain the updated distribution-graph nodes, and updating the edges of the distribution graph;
S360: aggregating the similarity relations between each node in the updated distribution graph together with the corresponding node of the previous layer's point graph to obtain the updated point-graph nodes, and updating the edges of the point graph;
s400: and carrying out classification prediction on the updated examples by using a softmax classifier, and acquiring the relationship type.
2. The method for constructing a knowledge graph based on adaptive few-sample relationship extraction as claimed in claim 1, wherein:
step S200 further includes:
s210: dividing the support set instances according to the relation category;
s220: generating the weight and the bias corresponding to each relationship category by using the example under each relationship category;
s230: the weights and biases corresponding to all relation categories form a weight vector and a bias vector, i.e., the initialized softmax parameters.
3. The method for constructing a knowledge graph based on adaptive few-sample relationship extraction as claimed in claim 1, wherein:
in substep S330, similarity between nodes of the point map
Figure 12827DEST_PATH_IMAGE014
Wherein, in the step (A),
Figure DEST_PATH_IMAGE015
node representing initialization
Figure 24777DEST_PATH_IMAGE016
And node
Figure DEST_PATH_IMAGE017
The similarity relationship between the two components is similar,
Figure 896918DEST_PATH_IMAGE018
representing a neural network;
in sub-step S340, the similarity relationship between nodes of the distribution diagram is used to describe the edges
Figure DEST_PATH_IMAGE019
Figure 952599DEST_PATH_IMAGE018
Representing a neural network.
4. The method for constructing a knowledge graph based on adaptive few-sample relationship extraction as claimed in claim 1, wherein:
the entity linking based on the extracted entity and relation data specifically comprises:
using the entity representations updated by the adaptive relation extraction model, calculating the similarity between any two entities, and merging two entities whose similarity is greater than a set threshold.
CN202110808184.4A 2021-07-16 2021-07-16 Knowledge graph construction method based on self-adaptive few-sample relation extraction Active CN113254675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110808184.4A CN113254675B (en) 2021-07-16 2021-07-16 Knowledge graph construction method based on self-adaptive few-sample relation extraction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110808184.4A CN113254675B (en) 2021-07-16 2021-07-16 Knowledge graph construction method based on self-adaptive few-sample relation extraction

Publications (2)

Publication Number Publication Date
CN113254675A (en) 2021-08-13
CN113254675B (en) 2021-11-16

Family

ID=77180471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110808184.4A Active CN113254675B (en) 2021-07-16 2021-07-16 Knowledge graph construction method based on self-adaptive few-sample relation extraction

Country Status (1)

Country Link
CN (1) CN113254675B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095529B (en) * 2021-08-30 2022-08-16 云南大学 Knowledge graph-based industrial non-intelligent sensor self-adaptive access middleware and method thereof
CN113783876B (en) * 2021-09-13 2023-10-03 国网数字科技控股有限公司 Network security situation awareness method based on graph neural network and related equipment
CN114722823B (en) * 2022-03-24 2023-04-14 华中科技大学 Method and device for constructing aviation knowledge graph and computer readable medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7480640B1 (en) * 2003-12-16 2009-01-20 Quantum Leap Research, Inc. Automated method and system for generating models from data
CN109508385A (en) * 2018-11-06 2019-03-22 云南大学 A kind of character relation analysis method in web page news data based on Bayesian network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7480640B1 (en) * 2003-12-16 2009-01-20 Quantum Leap Research, Inc. Automated method and system for generating models from data
CN109508385A (en) * 2018-11-06 2019-03-22 云南大学 A kind of character relation analysis method in web page news data based on Bayesian network

Also Published As

Publication number Publication date
CN113254675A (en) 2021-08-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant