CN113987201A - Zero-sample knowledge graph completion method based on ontology adapter

Zero-sample knowledge graph completion method based on ontology adapter

Info

Publication number
CN113987201A
Authority
CN
China
Prior art keywords
adapter
ontology
language model
information
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111222330.1A
Other languages
Chinese (zh)
Inventor
陈华钧
耿玉霞
庄祥
张文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202111222330.1A
Publication of CN113987201A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 - Ontology
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/30 - Semantic analysis
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a zero-sample knowledge graph completion method based on an ontology adapter, which comprises the following steps: acquiring ontology knowledge related to the zero-sample knowledge graph completion task, the ontology knowledge comprising entity type information, concept hierarchy information, relation type constraint information, and combinational logic constraint information of relations; designing, based on a neural network, an ontology adapter for each type of ontology knowledge, adding the ontology adapter to a pre-trained language model, and performing task training of the pre-trained language model with the corresponding ontology adapter on each type of ontology knowledge so as to inject that knowledge, obtaining a pre-trained language model into which ontology knowledge has been introduced; and performing the downstream zero-sample knowledge graph completion task using the pre-trained language model into which the ontology knowledge has been introduced. Built on this ontology-knowledge-enhanced pre-trained language model, the method better solves knowledge graph completion under the zero-sample condition.

Description

Zero-sample knowledge graph completion method based on ontology adapter
Technical Field
The invention relates to the field of zero-sample knowledge graph completion, and in particular to a zero-sample knowledge graph completion method based on an ontology adapter.
Background
In recent years, knowledge graphs, by virtue of their strong knowledge representation and reasoning capabilities, have played an important role in artificial intelligence applications such as search engines, personal assistants, intelligent question answering, recommendation systems, and data fusion. Despite containing large numbers of entities and relational facts, many existing knowledge graphs are incomplete. The task of knowledge graph completion (also known as link prediction) was therefore proposed to complete the missing relational facts (i.e., triples) in a graph. With the development of deep learning, much knowledge graph completion research has centered on knowledge graph representation learning, which represents the entities and relations of a graph in a low-dimensional vector space and performs the completion task effectively and efficiently through vector-based statistical computation over entity and relation embeddings.
However, most existing representation learning methods rest on the closed-world assumption: the numbers of entities and relations in the knowledge graph are fixed, and a model can only learn representations for the known entities and relations and predict the triples missing among them. For newly added entities or relations, these methods must be retrained to obtain representations of the new entities or relations. Given that many knowledge graphs evolve quickly, i.e., new entities and relations emerge continually, it is impractical to keep retraining the model and collecting annotated data (i.e., relevant triples) for them. The Zero-Sample Knowledge Graph Completion (ZSKGC) task was therefore proposed to handle prediction for these new entities or relations without retraining the model.
Such methods typically use external information about the new entities or relations, such as textual descriptions, to obtain their representations, compensating for the lack of structured triple information with information from the text domain. On this basis, existing work introduces pre-trained language models such as ELMo, GPT, and BERT, exploiting their powerful text encoding ability, their capacity to capture the contextual semantics of words, and the rich linguistic background knowledge they contain, to better use the textual descriptions of entities and relations for zero-sample knowledge graph completion. Moreover, this textualization enables pre-trained-language-model-based methods to handle new entities and new relations appearing simultaneously at prediction time, whereas earlier zero-sample methods could only handle new entities (or new relations) and required the relations (or entities) to be known.
However, the entity and relation information contained in textual descriptions is relatively limited and often noisy; it is therefore desirable to introduce external information with richer semantics to improve knowledge graph completion under the zero-sample condition.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a zero-sample knowledge graph completion method based on an ontology adapter, in which the ontology knowledge of a knowledge graph is injected into a pre-trained language model through ontology adapters, and knowledge graph completion under the zero-sample condition is better solved on the basis of this ontology-knowledge-enhanced pre-trained language model.
In order to achieve the purpose, the invention provides the following technical scheme:
a zero sample knowledge graph completion method based on an ontology adapter comprises the following steps:
acquiring ontology knowledge related to a zero sample knowledge graph completion task, wherein the ontology knowledge comprises entity type information, concept hierarchical relationship information, relationship type constraint information and relationship combinational logic constraint information;
designing a body adapter for each type of ontology based on a neural network, adding the body adapter to a pre-training language model, and performing task training on the pre-training language model added with the body adapter corresponding to the type of the ontology by using each type of ontology to inject each type of ontology to obtain a pre-training language model introducing ontology;
and performing a downstream zero-sample knowledge graph completion task by using a pre-training language model introducing ontology knowledge.
In one embodiment, the ontology adapter includes a plurality of adapter layers, each adapter layer including at least 2 self-attention layers and at least 2 mapping layers.
In one embodiment, the pre-trained language model is one composed of self-attention layers; an adapter layer of the ontology adapter is inserted after each self-attention layer of the pre-trained language model, the adapter layers are connected to one another, and the output of the last adapter layer is the output of the ontology adapter.
In one embodiment, when the category of the ontology knowledge is entity type information, an entity type adapter is designed for injecting the entity type information;
during task training, the entity type information is converted into a natural language corpus, the converted corpus is input into the pre-trained language model with the entity type adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the entity type adapter are optimized through a masked language model, so that the entity type information is injected into the entity type adapter, obtaining a pre-trained language model into which entity type information has been introduced.
In one embodiment, when the category of the ontology knowledge is concept hierarchy information, a concept hierarchy adapter is designed for injecting the concept hierarchy information; task training is carried out in either or both of the following two modes:
Mode 1: the concept hierarchy information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through a masked language model, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced; and/or
Mode 2: each concept is expressed as a sentence, concepts having a hierarchical relationship jointly form a document with contextual order, and the document is converted into a corpus; the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through next-sentence prediction, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced.
In one embodiment, when the category of the ontology knowledge is relation type constraint information, a relation type constraint adapter is designed for injecting the relation type constraint information;
during task training, the relation type constraint information, which has graph structure characteristics, is converted into a corpus, the converted corpus is input into the pre-trained language model with the relation type constraint adapter added, and the parameters of the relation type constraint adapter are optimized through a masked language model together with a structure information recovery method, so that the relation type constraint information is injected into the relation type constraint adapter, obtaining a pre-trained language model into which relation type constraint information has been introduced.
In one embodiment, converting the relation type constraint information having graph structure characteristics into a corpus includes:
for the relation type constraint information having graph structure characteristics, extracting the one-hop subgraph around each node in the graph structure, and generating the adjacency matrix and node position matrix of each one-hop subgraph as the corpus.
In one embodiment, when the category of the ontology knowledge is the combinational logic constraint information of relations, a combinational logic adapter is designed for injecting it, and task training is carried out in either or both of the following two modes:
Mode 1: for combinational logic constraint information formed by a plurality of relations, the constraint information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through a masked language model, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced; and/or
Mode 2: for combinational logic constraint information formed by a plurality of relations, the relations participating in the combination are spliced together to express one sentence and the semantic relation resulting from their combination expresses another sentence; the two sentences form a document with contextual order, which is converted into a corpus; the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through next-sentence prediction, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced.
The sentence template splices the relations participating in a combination together with the remaining relation obtained from their combination (e.g., 'r1 and r2 can be combined to obtain r3').
In one embodiment, performing the downstream zero-sample knowledge graph completion task using the pre-trained language model into which ontology knowledge has been introduced comprises:
splicing each test triple into a sentence according to its textual descriptions, inputting the sentence into the pre-trained language model into which ontology knowledge has been introduced, encoding with the original parameters of the pre-trained language model to obtain a first representation containing language knowledge, and encoding with the various ontology adapters to obtain a second representation containing the various kinds of ontology knowledge;
and concatenating the first representation and the second representation as the final representation of the test triple, inputting it into a classifier, and predicting with the classifier whether the test triple holds; the test triples that hold are used to complete the zero-sample knowledge graph.
Compared with the prior art, the beneficial effects of the invention include at least the following:
(1) Unlike the existing application scenarios of knowledge injection into pre-trained language models, the invention injects more and richer types of ontology knowledge into the pre-trained language model through ontology adapters, namely entity type information, concept hierarchy information, relation type constraint information, and the combinational logic constraints of relations, so that the pre-trained language model provides the downstream zero-sample knowledge graph completion task not only with language knowledge but also with the knowledge graph's richer ontology knowledge, thereby improving pre-trained-language-model-based zero-sample completion capability.
(2) Unlike existing zero-sample knowledge graph completion methods, which can only handle new entities or new relations emerging at test time and require the relations or entities to be known, the method of the invention can handle scenarios in which new entities and new relations emerge simultaneously at test time, giving it wider applicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic flow chart of a zero-sample knowledge-graph completion method based on an ontology adapter according to an embodiment;
FIG. 2 is a schematic structural diagram of a pre-trained language model with an ontology adapter added, according to an embodiment;
FIG. 3 is a diagram of concept hierarchy information in the NELL-ZS dataset according to an embodiment;
FIG. 4 is a schematic diagram of relation type constraint information in the NELL-ZS dataset according to an embodiment;
FIG. 5 is a diagram illustrating the procedure for generating a corpus from relation type constraint information in the NELL-ZS dataset according to an exemplary embodiment;
FIG. 6 is a diagram illustrating the use of the ontology-adapter-enhanced pre-trained language model for the downstream knowledge graph completion task, according to an embodiment.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention more apparent, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
The invention is inspired by the abstract ontology layer of knowledge graphs: a knowledge graph usually includes an abstract ontology layer that summarizes terminological axioms and meta-information about the entities and relations in the graph. The terminological axioms define information such as entity type information, the hierarchy of entity types and of relations, the type information of the head and tail entities associated with each relation, and existential quantifications over entity types; the meta-information defines the textual definitions, supplementary information, descriptions, and the like of entity types and relations. Such information can largely summarize the characteristics of the entities and relations in the knowledge graph and the semantic information they carry, can be used to model the semantic relationships between known entities/relations and new entities/relations, and brings richer external information to the entities/relations. Based on this insight, the embodiments provide a zero-sample knowledge graph completion method based on an ontology adapter.
The zero-sample knowledge graph completion method based on an ontology adapter provided by the embodiments can be used with any knowledge graph that has the above ontology knowledge and textual descriptions of entities and relations. FIG. 1 is a flowchart of the method according to an embodiment. As shown in FIG. 1, the method comprises the following steps:
step 1, acquiring ontology knowledge related to a zero sample knowledge graph completion task.
The ontology knowledge acquired in the embodiments is external information, beyond text, that can help the zero-sample knowledge graph completion task; specifically, it comprises entity type information, concept hierarchy information, relation type constraint information, and combinational logic constraint information of relations.
The entity type information is the type information of an entity; the concept hierarchy information includes the hierarchy information of entity types and of relations; the relation type constraint information is the type information of the head and tail entities associated through a relation; and the combinational logic constraint information of relations means that some relations can be obtained by combining other relations; for example, the relation brotherOf and the relation parentOf can be combined to obtain the relation uncleOf. This ontology knowledge can help guide the judgment of whether triples hold, so as to realize zero-sample knowledge graph completion.
Step 2: design an ontology adapter for each type of ontology knowledge based on a neural network, add the ontology adapter to the pre-trained language model, and then train the ontology adapter to inject each type of ontology knowledge, obtaining a pre-trained language model into which ontology knowledge has been introduced.
In the embodiments, a dedicated ontology adapter is designed for each type of ontology knowledge, a pre-training task specific to that knowledge is designed, and the parameters of the ontology adapter are optimized so as to inject the knowledge into the pre-trained language model. The knowledge-specific pre-training task is mainly embodied in the corpus conversion process and realized by controlling the training method, so that the resulting pre-trained language model contains richer ontology knowledge and serves the downstream zero-sample knowledge graph completion task better.
In an embodiment, the pre-trained language model is one composed of self-attention layers (i.e., Transformer layers). The ontology adapter comprises a plurality of adapter layers, each containing at least 2 self-attention layers and at least 2 mapping layers; preferably, each adapter layer contains 2 self-attention layers and 2 mapping layers. As shown in FIG. 2, when the ontology adapter is added to the pre-trained language model, an adapter layer of the ontology adapter is inserted after each self-attention layer of the pre-trained language model, the adapter layers are connected to one another, and the output of the last adapter layer is the output of the ontology adapter. When an ontology-knowledge-specific pre-training task is performed, the output of the input sentence under the original parameters of the pre-trained language model is concatenated with the output of the ontology adapter to perform the target task. In this way, under the guidance of different pre-training tasks, the different types of ontology knowledge are used to pre-train the corresponding adapters so as to inject each type of knowledge.
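By way of a non-limiting illustration, the following is a minimal PyTorch sketch of how one such adapter layer and the chaining of adapter layers alongside a frozen language model might be realized. The class names, hidden sizes, and the down-/up-projection realization of the mapping layers are assumptions made for illustration, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class OntologyAdapterLayer(nn.Module):
    """One adapter layer: two self-attention sublayers plus two mapping
    (projection) layers, as described above. All sizes are illustrative."""
    def __init__(self, plm_dim=768, adapter_dim=128, num_heads=4):
        super().__init__()
        self.down = nn.Linear(plm_dim, adapter_dim)   # mapping layer 1 (down-projection)
        self.attn1 = nn.MultiheadAttention(adapter_dim, num_heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(adapter_dim, num_heads, batch_first=True)
        self.up = nn.Linear(adapter_dim, plm_dim)     # mapping layer 2 (up-projection)

    def forward(self, plm_hidden, prev_adapter_out=None):
        # fuse the current Transformer layer's output with the output passed
        # along from the previous adapter layer (the adapter layers are chained)
        x = self.down(plm_hidden)
        if prev_adapter_out is not None:
            x = x + self.down(prev_adapter_out)
        x, _ = self.attn1(x, x, x)
        x, _ = self.attn2(x, x, x)
        return self.up(x)

def adapter_forward(per_layer_hidden_states, adapter_layers):
    """per_layer_hidden_states: the frozen PLM's output after each Transformer
    layer; the last adapter layer's output is the ontology adapter's output."""
    out = None
    for hidden, layer in zip(per_layer_hidden_states, adapter_layers):
        out = layer(hidden, out)
    return out
```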
In the embodiments, when the category of the ontology knowledge is entity type information, an entity type adapter adopting the above ontology adapter structure is designed for injecting the entity type information.
During task training, the entity type information is converted into a natural language corpus, the converted corpus is input into the pre-trained language model with the entity type adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the entity type adapter are optimized through a Masked Language Model (MLM) objective, so that the entity type information is injected into the entity type adapter, obtaining a pre-trained language model into which entity type information has been introduced.
The entity type information can be converted into natural language corpora using sentence templates; for example, for an entity a of type A, a sentence template yields a text sentence such as 'the type of entity a is A', and a large number of such sentences form a corpus containing entities and their type information.
MLM randomly masks words in a text sentence and encodes sentence features by predicting the masked words from the surrounding context; that is, the parameters of the entity type adapter are optimized on the task of predicting masked words from their context, so as to inject the entity type information into the entity type adapter.
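To make this setup concrete, the following is a minimal sketch assuming BERT via the HuggingFace transformers API. The template wording, the 15% masking rate, and the commented-out adapter hookup are illustrative assumptions; a full implementation would also skip masking special tokens.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

def entity_type_corpus(pairs):
    # verbalize (entity, type) pairs with a sentence template (wording illustrative)
    return [f"the type of entity {e} is {t}" for e, t in pairs]

sentences = entity_type_corpus([("atlanta_hartsfield", "airport")])

for p in model.parameters():
    p.requires_grad = False              # fix the PLM's original parameters
# adapter = ...                          # e.g. the OntologyAdapterLayer stack above
# optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)  # only adapter params train

enc = tokenizer(sentences, return_tensors="pt", padding=True)
labels = enc["input_ids"].clone()
mask = torch.rand(labels.shape) < 0.15               # mask roughly 15% of tokens
enc["input_ids"][mask] = tokenizer.mask_token_id
labels[~mask] = -100                                  # loss only on masked positions
loss = model(**enc, labels=labels).loss               # predict masked words from context
```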
In the embodiments, when the category of the ontology knowledge is concept hierarchy information, a concept hierarchy adapter adopting the above ontology adapter structure is designed for injecting it. Task training is carried out in the following two modes:
Mode 1: the concept hierarchy information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through a masked language model, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced;
Mode 2: each concept is expressed as a sentence, concepts having a hierarchical relationship jointly form a document with contextual order, and the document is converted into a corpus; the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through next-sentence prediction, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced.
In the embodiments, different modes convert the hierarchical relationships between concepts into corpora that reflect the hierarchy constraints, and different conversion modes entail different pre-training tasks and pre-training methods. When a concept (entity type or relation) hierarchy is textualized with sentence templates, e.g., for two concepts A and B in a hierarchical relationship the template yields a text sentence such as 'A is a parent concept of B', a large number of such sentences form a corpus that embodies the hierarchy constraint. For this conversion mode, MLM is used to optimize the parameters of the concept hierarchy adapter, i.e., the parameters are optimized on the task of predicting masked words from their context, so as to inject the concept hierarchy information into the adapter.
When each concept is expressed as a sentence, several concepts in a hierarchical relationship jointly constitute a document with contextual order. For this conversion mode, the corresponding pre-training task is to predict the following sentence from the preceding one, i.e., the parameters of the concept hierarchy adapter are optimized with Next Sentence Prediction, so as to inject the concept hierarchy information into the adapter.
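The two corpus-construction modes can be sketched as follows; the template phrasing and the concept-to-sentence function are illustrative assumptions.

```python
def hierarchy_template_corpus(pairs):
    # Mode 1: verbalize each (parent, child) concept pair for MLM training
    return [f"{parent} is a parent concept of {child}" for parent, child in pairs]

def hierarchy_nsp_corpus(chains, describe):
    # Mode 2: each concept is expressed as one sentence; a hierarchy chain
    # becomes a document of consecutive sentences, from which
    # (sentence, next sentence) pairs are drawn for next-sentence prediction
    pairs = []
    for chain in chains:                      # e.g. ["building", "hotel"]
        sents = [describe(c) for c in chain]
        pairs.extend(zip(sents, sents[1:]))
    return pairs

print(hierarchy_template_corpus([("building", "hotel")]))
print(hierarchy_nsp_corpus([["building", "hotel"]],
                           lambda c: f"the concept {c}."))
```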
In the embodiments, when the category of the ontology knowledge is relation type constraint information, a relation type constraint adapter adopting the above ontology adapter structure is designed for injecting it.
During task training, the relation type constraint information, which has graph structure characteristics, is converted into a corpus, the converted corpus is input into the pre-trained language model with the relation type constraint adapter added, and the parameters of the relation type constraint adapter are optimized through a masked language model together with a structure information recovery (Structure Recovery) method, so that the relation type constraint information is injected into the relation type constraint adapter, obtaining a pre-trained language model into which relation type constraint information has been introduced.
For the relation type constraint information, the training task is to predict the masked words from their surrounding context while simultaneously recovering the structural information; performing this training task optimizes the parameters of the relation type constraint adapter so as to inject the relation type constraint information into it.
In an embodiment, converting the relation type constraint information having graph structure characteristics into a corpus includes:
for the relation type constraint information having graph structure characteristics, extracting the one-hop subgraph around each node in the graph structure, and generating the adjacency matrix and node position matrix of each one-hop subgraph as the corpus.
Because the relation type constraint information has a graph structure whose nodes represent entity types, the corpus conversion proceeds as follows: for a given node in the graph structure, the one-hop subgraph around it is first extracted; the nodes and relation edges in the one-hop subgraph are converted into a node sequence in a fixed order and numbered; then an adjacency matrix and a node position matrix are generated from the node sequence and its numbering, where the adjacency matrix preserves the subgraph structure and the node position matrix preserves the connection information between nodes and relation edges. The converted node sequence is fed as text into the adapter network for encoding; while training through the masked language model, the encoded node representations are also used to recover the connection information between nodes according to the node position matrix, i.e., to predict whether the three nodes in each row of the node position matrix form a triple.
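A minimal sketch of this corpus conversion might look as follows. Treating each relation edge as a node of the sequence follows the description above, while the traversal order and the example relation names are assumptions.

```python
import numpy as np

def one_hop_corpus(center, triples):
    """triples: (head_type, relation, tail_type) edges of the constraint graph.
    A triple (h, r, t) is unrolled into the path h -> r -> t, so relation edges
    are numbered as nodes too. Returns the node sequence, the adjacency matrix
    preserving the subgraph structure, and the node position matrix whose rows
    are (head, relation, tail) index triples."""
    sub = [tr for tr in triples if center in (tr[0], tr[2])]   # one-hop subgraph
    seq = []
    for h, r, t in sub:                       # fixed traversal order (assumption)
        for n in (h, r, t):
            if n not in seq:
                seq.append(n)
    idx = {n: i for i, n in enumerate(seq)}
    adj = np.zeros((len(seq), len(seq)), dtype=int)
    pos = []
    for h, r, t in sub:
        adj[idx[h], idx[r]] = adj[idx[r], idx[t]] = 1
        pos.append((idx[h], idx[r], idx[t]))
    return seq, adj, np.array(pos)

seq, adj, pos = one_hop_corpus(
    "country",
    [("city", "locatedIn", "country"), ("country", "hasCapital", "city")])
```

During training, predicting whether each row of the returned node position matrix forms a valid triple is what the structure recovery objective amounts to.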
In the embodiments, when the category of the ontology knowledge is the combinational logic constraint information of relations, a combinational logic adapter adopting the above ontology adapter structure is designed for injecting it. Task training is carried out in the following two modes:
Mode 1: for combinational logic constraint information formed by a plurality of relations, the constraint information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through a masked language model, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced;
Mode 2: for combinational logic constraint information formed by a plurality of relations, the relations satisfying the logical requirement, i.e., those participating in the combination, are spliced together to express one sentence, while the semantic relation resulting from their combination expresses another sentence. For example, if A and B are a married couple and B is the father of C, it can be inferred that A is the mother of C. The two sentences form a document with contextual order, which is converted into a corpus; the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through next-sentence prediction, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced.
In the embodiments, different modes convert the combinational logic constraints of relations into corpora that reflect those constraints, and different conversion modes entail different pre-training tasks and pre-training methods. For combinational logic constraint information formed by a plurality of relations, sentence templates convert the constraint into a corpus, where the template splices the relations participating in a combination together with the remaining relation obtained from their combination. For example, for three relations r1, r2, and r3 in a combination relationship, a sentence template yields a text sentence such as 'r1 and r2 can be combined to obtain the relation r3', and a large number of such sentences form a corpus that embodies the relation combination constraint knowledge. The pre-training task corresponding to this conversion mode is to predict masked words from their surrounding context, i.e., MLM is used to optimize the parameters of the combinational logic adapter so as to inject the combinational logic constraint information of relations into it.
Alternatively, for combinational logic constraint information formed by a plurality of relations, the relations participating in the combination are spliced to express one sentence and the semantic relation resulting from their combination expresses another: for the three relations r1, r2, and r3, r1 and r2 are spliced into one sentence and r3 is expressed as another, and the combination constraint among r1, r2, and r3 makes the two sentences a document with contextual order, which is converted into a corpus. The pre-training task corresponding to this conversion mode is to predict the following sentence from the preceding one, i.e., the parameters of the combinational logic adapter are optimized with the next-sentence prediction method so as to inject the combinational logic constraint information of relations into it. The next-sentence prediction method is analogous to the one used alongside the masked language model in BERT pre-training: two sentences are input, and the model predicts whether they stand in a sequential logical dependency.
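Both corpus-construction modes for relation composition can be sketched in the same style as the hierarchy corpora above; the sentence wording is again an illustrative assumption.

```python
def composition_template_corpus(rules):
    # Mode 1: verbalize each composition rule (r1, r2 => r3) for MLM training
    return [f"the relation {r1} and the relation {r2} can be combined "
            f"to obtain the relation {r3}" for r1, r2, r3 in rules]

def composition_nsp_corpus(rules):
    # Mode 2: the spliced relations participating in the combination form the
    # first sentence and the resulting relation the second, giving
    # (sentence, next sentence) pairs for next-sentence prediction
    return [(f"the combination of the relation {r1} and the relation {r2}.",
             f"the relation {r3}.") for r1, r2, r3 in rules]

rules = [("brotherOf", "parentOf", "uncleOf")]
print(composition_template_corpus(rules))
print(composition_nsp_corpus(rules))
```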
Step 3: perform the downstream zero-sample knowledge graph completion task using the pre-trained language model into which ontology knowledge has been introduced.
In the embodiments, performing the downstream zero-sample knowledge graph completion task using the pre-trained language model into which ontology knowledge has been introduced comprises the following steps:
splicing each test triple into a sentence according to its textual descriptions, inputting the sentence into the pre-trained language model into which ontology knowledge has been introduced, encoding with the original parameters of the pre-trained language model to obtain a first representation containing language knowledge, and encoding with the various ontology adapters to obtain a second representation containing the various kinds of ontology knowledge;
and concatenating the first representation and the second representation as the final representation of the test triple, inputting it into a classifier, and predicting with the classifier whether the test triple holds; the test triples that hold are used to complete the zero-sample knowledge graph.
As shown in FIG. 6, the input sentence formed by splicing a test triple according to its textual descriptions is fed into the pre-trained language model into which ontology knowledge has been introduced. Encoding with the original parameters of the pre-trained language model yields the first representation s_p, which contains language knowledge; encoding with the entity type adapter, the concept hierarchy adapter, the relation type constraint adapter, and the combinational logic adapter yields the representations s_type, s_hie, s_cons, and s_comp, which form the second representation S = {s_type, s_hie, s_cons, s_comp}. The first representation s_p and the second representation S are concatenated and input into a binary classifier to test whether the triple holds.
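The scoring step can be sketched as a binary classifier over the concatenated representations; the hidden size and the use of a single linear layer are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TripleClassifier(nn.Module):
    """Binary classifier over [s_p; s_type; s_hie; s_cons; s_comp]."""
    def __init__(self, dim=768):
        super().__init__()
        self.fc = nn.Linear(dim * 5, 2)   # s_p plus the four adapter outputs

    def forward(self, s_p, s_type, s_hie, s_cons, s_comp):
        s = torch.cat([s_p, s_type, s_hie, s_cons, s_comp], dim=-1)
        return self.fc(s)                 # logits: triple holds / does not hold
```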
In the zero-sample knowledge graph completion method based on an ontology adapter described above, the adapter approach retains the language background knowledge in the pre-trained language model while making it convenient to introduce richer types of knowledge. On this basis, the designed ontology adapters bring knowledge-graph-specific ontology knowledge to the pre-trained language model, supplying more external information for the pre-trained-language-model-based zero-sample completion task and improving completion capability under the zero-sample condition.
At present, many existing large-scale knowledge graphs suffer from incompleteness, such as Wikidata, DBpedia, and vertical-domain e-commerce knowledge graphs (whose scale can reach billions of triples). To solve such problems, knowledge graph embedding techniques are usually used to perform completion, i.e., completing any missing element of a knowledge graph triple (head entity, relation, tail entity). As a knowledge graph continually evolves and expands, it is difficult, given the scale involved, to redo completion by retraining the embedding model for every newly added entity or relation. Therefore, zero-sample work oriented to knowledge graphs, especially large-scale ones, urgently needs development. The method provided by the embodiments can be well applied to e-commerce knowledge graphs for completion.
To better illustrate the effect of the zero-sample knowledge graph completion method based on an ontology adapter provided in the above embodiments, this embodiment takes the knowledge graph NELL (Never-Ending Language Learning), its zero-sample dataset NELL-ZS, and the pre-trained language model BERT as an example.
In the zero-sample knowledge graph completion method, the relevant ontology knowledge is first extracted from the ontology file of NELL. The openly released NELL ontology is stored in a csv file in the form of RDF triples, whose data consist of three columns corresponding to the head entity, the relation (also called the predicate), and the tail entity of each triple. The type information of entities and the hierarchy information of entity types and of relations can be extracted through the predicate 'generalizations'; the head entity type constraint of a relation can be extracted through the predicate 'domain', and the tail entity type constraint through the predicate 'range'. The extracted hierarchical structure of the entity type information is shown in FIG. 3(a), the hierarchical structure of the concept hierarchy information in FIG. 3(b), and the structure graph formed by the relation type constraint information in FIG. 4. Based on this structure graph, a combined graph pattern of the form

r1(A, B) ∧ r2(B, C) ⇒ r3(A, C)

is matched, i.e., the graph is searched for sets of relations satisfying this constraint: for a relation r2 whose head entity type is B and whose tail entity type is C, there exist relations r1 and r3 such that the head entity type of r1 is A and its tail entity type is B, while the head entity type of r3 is A and its tail entity type is C. The extracted ontology knowledge is subsequently converted into corpora for training the ontology adapters.
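This pattern search can be sketched as a brute-force scan over the (domain, range) pairs of the relations; note that type matching alone over-generates candidates, so the sketch below (with made-up relation names) is illustrative only.

```python
def find_composition_candidates(relations):
    """relations: dict mapping relation -> (head_type, tail_type).
    Returns (r1, r2, r3) candidates matching r1: A->B, r2: B->C => r3: A->C."""
    found = []
    for r1, (a, b) in relations.items():
        for r2, (b2, c) in relations.items():
            if r2 == r1 or b2 != b:
                continue
            for r3, (a3, c3) in relations.items():
                if r3 not in (r1, r2) and (a3, c3) == (a, c):
                    found.append((r1, r2, r3))
    return found

candidates = find_composition_candidates({
    "locatedIn": ("city", "country"),
    "hasAirport": ("city", "airport"),
    "countryHasAirport": ("country", "airport"),
})
# candidates == [("locatedIn", "countryHasAirport", "hasAirport")]
```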
Then, an adapter layer consisting of 2 self-attention layers and 2 mapping layers is inserted after each Transformer layer of BERT, as shown in FIG. 2. Based on this network structure, the different types of ontology knowledge are injected through different pre-training corpora and pre-training tasks.
For the corpus conversion of entity type information, e.g., for the entity atlanta_hartsfield and its type information airport, a sentence template can be used to obtain a sentence such as 'the type of entity atlanta_hartsfield is airport'.
The corpus conversion of the concept hierarchy information can be realized in two ways. For example, for the entity type hierarchy shown in FIG. 3(a), the sentence-template method can express the hierarchy constraint between the entity types building and hotel by a text sentence such as 'building is a parent type of hotel'; in the sentence-and-document method, each entity type is expressed as a sentence.
For the structure graph formed by the relation type constraint information shown in FIG. 4, taking the entity type node country as an example, the one-hop subgraph around it can be converted into the single-relation graph shown in FIG. 5(a) when each connecting relation is regarded as a node; the node sequence and numbering information obtained from this graph are shown in FIG. 5(b), and the adjacency matrix and node position matrix that preserve its structure are generated as shown in FIG. 5(c) and FIG. 5(d), respectively.
The corpus conversion of the combinational logic constraint information of relations can likewise be realized in two ways. For example, the relations brotherOf, parentOf, and uncleOf satisfy the combination constraint

brotherOf(x, y) ∧ parentOf(y, z) ⇒ uncleOf(x, z).

The sentence-template method yields a text sentence such as 'the relation brotherOf and the relation parentOf can be combined to obtain the relation uncleOf'; the sentence-and-document method first splices the relations brotherOf and parentOf into one sentence and expresses the relation uncleOf as a separate sentence, the two sentences being subject to the contextual-order constraint.
Finally, each triple (h, r, t) in the knowledge graph is spliced into a sentence according to its textual descriptions, connected by the token [SEP]; the textual description of a relation can be obtained through the predicate 'description' in the ontology file, and the textual description of an entity is its name information. The resulting triple sentence is then input into the ontology-adapter-enhanced pre-trained language model to obtain the two representations s_p and S = {s_type, s_hie, s_cons, s_comp}; concatenating s_p with S yields the final sentence representation s = [s_p; s_type; s_hie; s_cons; s_comp], which is input into a fully connected layer, and a classifier is trained to judge whether the currently input triple holds. The specific flow is shown in FIG. 6.
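The sentence construction for a triple can be sketched as below; the relation description text shown is an illustrative placeholder, not taken from the NELL ontology.

```python
def triple_to_sentence(head_text, rel_desc, tail_text):
    # splice the textual descriptions of head entity, relation, and tail
    # entity, connected by BERT's [SEP] token as in the flow above
    return f"{head_text} [SEP] {rel_desc} [SEP] {tail_text}"

sent = triple_to_sentence(
    "atlanta_hartsfield",                # entity name information
    "the airport that serves a city",    # from the predicate 'description' (placeholder)
    "atlanta")
```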
During testing, for a test triple containing a new entity and a new relation, the sentence representation is obtained by feeding it into the ontology-adapter-enhanced pre-trained language model in the same way, and the binary classifier judges whether the current triple holds; if it does, the completion of one relational fact of the knowledge graph is accomplished.
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A zero-sample knowledge graph completion method based on an ontology adapter, comprising the following steps:
acquiring ontology knowledge related to the zero-sample knowledge graph completion task, the ontology knowledge comprising entity type information, concept hierarchy information, relation type constraint information, and combinational logic constraint information of relations;
designing, based on a neural network, an ontology adapter for each type of ontology knowledge, adding the ontology adapter to a pre-trained language model, and performing task training of the pre-trained language model with the corresponding ontology adapter on each type of ontology knowledge so as to inject that knowledge, obtaining a pre-trained language model into which ontology knowledge has been introduced;
and performing the downstream zero-sample knowledge graph completion task using the pre-trained language model into which the ontology knowledge has been introduced.
2. The ontology-adapter-based zero-sample knowledge-graph completion method according to claim 1, wherein the ontology adapter comprises a plurality of adapter layers, each adapter layer comprising at least 2 self-attention layers and at least 2 mapping layers.
3. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1 or 2, wherein the pre-trained language model is one composed of self-attention layers; an adapter layer of the ontology adapter is inserted after each self-attention layer of the pre-trained language model, the adapter layers are connected to one another, and the output of the last adapter layer is the output of the ontology adapter.
4. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1 or 2, wherein when the category of the ontology knowledge is entity type information, an entity type adapter is designed for injecting the entity type information;
during task training, the entity type information is converted into a natural language corpus, the converted corpus is input into the pre-trained language model with the entity type adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the entity type adapter are optimized through a masked language model, so that the entity type information is injected into the entity type adapter, obtaining a pre-trained language model into which entity type information has been introduced.
5. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1 or 2, wherein when the category of the ontology knowledge is concept hierarchy information, a concept hierarchy adapter is designed for injecting the concept hierarchy information; task training is carried out in either or both of the following two modes:
Mode 1: the concept hierarchy information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through a masked language model, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced; and/or
Mode 2: each concept is expressed as a sentence, concepts having a hierarchical relationship jointly form a document with contextual order, and the document is converted into a corpus; the converted corpus is input into the pre-trained language model with the concept hierarchy adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the concept hierarchy adapter are optimized through next-sentence prediction, so that the concept hierarchy information is injected into the concept hierarchy adapter, obtaining a pre-trained language model into which concept hierarchy information has been introduced.
6. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1 or 2, wherein when the category of the ontology knowledge is relation type constraint information, a relation type constraint adapter is designed for injecting the relation type constraint information;
during task training, the relation type constraint information having graph structure characteristics is converted into a corpus, the converted corpus is input into the pre-trained language model with the relation type constraint adapter added, and the parameters of the relation type constraint adapter are optimized through a masked language model together with a structure information recovery method, so that the relation type constraint information is injected into the relation type constraint adapter, obtaining a pre-trained language model into which relation type constraint information has been introduced.
7. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 6, wherein converting the relation type constraint information having graph structure characteristics into a corpus comprises:
for the relation type constraint information having graph structure characteristics, extracting the one-hop subgraph around each node in the graph structure, and generating the adjacency matrix and node position matrix of each one-hop subgraph as the corpus.
8. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1 or 2, wherein when the category of the ontology knowledge is the combinational logic constraint information of relations, a combinational logic adapter is designed for injecting it, and task training is carried out in either or both of the following two modes:
Mode 1: for combinational logic constraint information formed by a plurality of relations, the constraint information is converted into a corpus using sentence templates, the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through a masked language model, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced; and/or
Mode 2: for combinational logic constraint information formed by a plurality of relations, the relations participating in the combination are spliced together to express one sentence and the semantic relation resulting from their combination expresses another sentence; the two sentences form a document with contextual order, which is converted into a corpus; the converted corpus is input into the pre-trained language model with the combinational logic adapter added, the original parameters of the pre-trained language model are fixed, and the parameters of the combinational logic adapter are optimized through next-sentence prediction, so that the combinational logic constraint information of relations is injected into the combinational logic adapter, obtaining a pre-trained language model into which the combinational logic constraint information of relations has been introduced.
9. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 8, wherein the sentence template splices the relations participating in a combination together with the remaining relation obtained from their combination.
10. The ontology-adapter-based zero-sample knowledge graph completion method according to claim 1, wherein performing the downstream zero-sample knowledge graph completion task using the pre-trained language model into which ontology knowledge has been introduced comprises:
splicing each test triple into a sentence according to its textual descriptions, inputting the sentence into the pre-trained language model into which ontology knowledge has been introduced, encoding with the original parameters of the pre-trained language model to obtain a first representation containing language knowledge, and encoding with the various ontology adapters to obtain a second representation containing the various kinds of ontology knowledge;
and concatenating the first representation and the second representation as the final representation of the test triple, inputting it into a classifier, and predicting with the classifier whether the test triple holds; the test triples that hold are used to complete the zero-sample knowledge graph.
CN202111222330.1A 2021-10-20 2021-10-20 Zero-sample knowledge graph completion method based on ontology adapter Pending CN113987201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111222330.1A CN113987201A (en) 2021-10-20 2021-10-20 Zero-sample knowledge graph completion method based on ontology adapter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111222330.1A CN113987201A (en) 2021-10-20 2021-10-20 Zero-sample knowledge graph completion method based on ontology adapter

Publications (1)

Publication Number Publication Date
CN113987201A (en) 2022-01-28

Family

ID=79739662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111222330.1A Pending CN113987201A (en) 2021-10-20 2021-10-20 Zero-sample knowledge graph completion method based on ontology adapter

Country Status (1)

Country Link
CN (1) CN113987201A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115525773A (en) * 2022-10-10 2022-12-27 北京智源人工智能研究院 Training method and device of knowledge graph complement model
CN116306917A (en) * 2023-05-17 2023-06-23 卡奥斯工业智能研究院(青岛)有限公司 Task processing method, device, equipment and computer storage medium
CN116306917B (en) * 2023-05-17 2023-09-08 卡奥斯工业智能研究院(青岛)有限公司 Task processing method, device, equipment and computer storage medium
CN116842109A (en) * 2023-06-27 2023-10-03 北京大学 Information retrieval knowledge graph embedding method, device and computer equipment

Similar Documents

Publication Publication Date Title
CN112200317B (en) Multi-mode knowledge graph construction method
CN113987201A (en) Zero-sample knowledge graph completion method based on ontology adapter
CN108829722A (en) A kind of Dual-Attention relationship classification method and system of remote supervisory
CN111538848A (en) Knowledge representation learning method fusing multi-source information
CN108153864A (en) Method based on neural network generation text snippet
CN113761893B (en) Relation extraction method based on mode pre-training
CN113779220A (en) Mongolian multi-hop question-answering method based on three-channel cognitive map and graph attention network
CN112765956A (en) Dependency syntax analysis method based on multi-task learning and application
CN114489669A (en) Python language code fragment generation method based on graph learning
CN111639254A (en) System and method for generating SPARQL query statement in medical field
CN113312912A (en) Machine reading understanding method for traffic infrastructure detection text
CN114897167A (en) Method and device for constructing knowledge graph in biological field
CN114626368B (en) Method and system for acquiring rule common sense knowledge in vertical field
Ding et al. A Knowledge-Enriched and Span-Based Network for Joint Entity and Relation Extraction.
Nair et al. Knowledge graph based question answering system for remote school education
Xu et al. Enabling language representation with knowledge graph and structured semantic information
Jiang et al. A BERT-Bi-LSTM-Based knowledge graph question answering method
CN113010676B (en) Text knowledge extraction method, device and natural language inference system
CN115309886A (en) Artificial intelligent text creation method based on multi-mode information input
CN114385799A (en) Medical automatic question-answering method and system based on common sense fusion
CN113065324A (en) Text generation method and device based on structured triples and anchor templates
CN112464673A (en) Language meaning understanding method fusing semantic information
Yiming et al. Research on the Construction of Maritime Legal Knowledge Graph
Xie et al. Joint model of triple relation extraction with label embeddings
CN112884354B (en) Method for extracting event information in field of cosmetic safety supervision in double dimensions of words

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination