CN115168620A - Self-supervision joint learning method oriented to knowledge graph entity alignment - Google Patents
- Publication number
- CN115168620A CN115168620A CN202211098589.4A CN202211098589A CN115168620A CN 115168620 A CN115168620 A CN 115168620A CN 202211098589 A CN202211098589 A CN 202211098589A CN 115168620 A CN115168620 A CN 115168620A
- Authority
- CN
- China
- Prior art keywords
- entity
- alignment
- similarity
- embedding
- entities
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/367—Ontology
- G06F40/194—Calculation of difference between files
- G06F40/295—Named entity recognition
- G06F40/30—Semantic analysis
- G06N3/08—Learning methods (neural networks)
- G06V10/761—Proximity, similarity or dissimilarity measures
- G06V10/774—Generating sets of training patterns; bootstrap methods
- G06V10/82—Image or video recognition or understanding using neural networks
Abstract
The invention discloses a self-supervised joint learning method for knowledge graph entity alignment, comprising the following steps. Step one: learn the image features of each entity with a pre-trained deep image learning model, measure image similarity by the Euclidean distance between image features, and select the entity pairs with the highest similarity as the seed alignment. Step two: under the supervision of the seed alignment from step one, embed the knowledge graphs into a low-dimensional vector space using their multi-modal information, based on a knowledge embedding model. Step three: based on the embeddings from step two, compute multi-modal interaction similarity vectors for the entities, learn the weight of each modality through modality fusion to generate the final entity embeddings, measure the alignment possibility between entities by the cosine distance of these embeddings, and output the list of aligned entities. The invention performs entity alignment in a self-supervised joint-learning manner, requires no manual intervention throughout, and ensures the scalability of the system.
Description
Technical Field
The invention belongs to the field of artificial intelligence and relates to a self-supervised joint learning method for knowledge graph entity alignment.
Background
Endowing machines with human knowledge is an important research direction in artificial intelligence. Knowledge graphs are a form of expressing structured human knowledge and have attracted widespread attention in academia and industry. With the development of knowledge modeling, extraction, and construction technologies, more and more knowledge graphs are built and published on the Web; typical examples are DBpedia, YAGO, and Wikidata. Knowledge graphs are widely used in artificial-intelligence systems such as intelligent question answering, recommendation systems, and information retrieval.
In knowledge graph-based applications, information interaction between knowledge graph systems is often required to obtain data or implement specific functions. Even within a single knowledge graph system, the information typically comes from different domains. For example, assume that a cargo ship full of GPU chips suddenly disappears from the monitoring system. To assess the potential impact and locate the cargo ship, knowledge graphs from various fields are needed, covering companies, industries, logistics, satellites, drones, and so on. However, cross-domain knowledge graphs are typically heterogeneous, and even within the same domain, knowledge graphs constructed by different organizations are often heterogeneous. In addition, the complexity of human knowledge and the variability of subjective views of the world make it impractical to build a single all-inclusive, unified knowledge graph. Interoperability difficulties between knowledge graphs caused by heterogeneity are therefore ubiquitous.
Knowledge fusion is an effective way to solve the above problems. Knowledge fusion aims at establishing relationships between heterogeneous knowledge graphs so that they can communicate and cooperate with each other. According to the type of heterogeneity, knowledge fusion can be divided into ontology-level fusion and entity-level fusion. In practical applications, since the entities of a knowledge graph are usually large in scale, entity-level fusion has become the major task of knowledge fusion in recent years, especially entity alignment.
The goal of entity alignment is to establish equivalence relationships between entities, typically across different knowledge graphs. Studies of entity alignment can be broadly divided into three types: techniques from the semantic web research community, methods from the database research community, and methods based on knowledge graph embedding. The first two types are limited in that they can only align entities when the knowledge graphs contain a certain amount of attribute information. In contrast, embedding-based methods are applicable to most scenarios and deliver competitive performance.
In academia, embedding-based entity alignment is a popular research topic. SEA, OTEA, and AKE embed knowledge graphs from knowledge triples and align them via a transformation between embedding spaces. IPTransE and RSN4EA employ path-based knowledge graph embedding and perform alignment through parameter sharing. MuGNN and KECG use neighbor-based embedding and perform alignment in a calibrated manner. Still other entity alignment methods exploit auxiliary information such as attributes and text. However, all of these approaches rely on supervised or semi-supervised learning with manually labeled alignments. Manual annotation is time-consuming and costly, which limits the applicability and scalability of (semi-)supervised entity alignment.
More recently, researchers have proposed self-supervised entity alignment methods that generate seed alignments without manual intervention. The self-supervised method EVA uses a pre-trained image learning model to generate the seed alignment and then jointly learns embeddings of structure, image, relation, and attribute information. However, it ignores the description and type information of entities, which limits alignment accuracy.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a self-supervised joint learning method for knowledge graph entity alignment, which effectively utilizes the multi-modal information carried by knowledge graphs and combines it with a knowledge embedding model to fuse knowledge graphs automatically and efficiently. The specific technical scheme is as follows:
A self-supervised joint learning method for knowledge graph entity alignment comprises the following steps:
step one: learn the image features of the entities with a pre-trained deep image learning model, measure image similarity by calculating the Euclidean distance between image features, and select the entity pairs with the highest similarity as the seed alignment according to image similarity;
step two: under the supervision of the seed alignment from step one, embed the knowledge graphs into a low-dimensional vector space suitable for computer processing, using their structural information, entity descriptions, attribute values, and entity type information, based on a graph convolutional network, a language pre-training model, and a hierarchical type embedding model;
step three: based on the knowledge graph embeddings from step two, calculate the similarity vectors of the pairwise description interaction, neighbor description interaction, attribute interaction, and type interaction of the entities; learn the weight of each modality through modality fusion to generate the final entity embedding; finally, calculate the cosine distance based on the entity embeddings to measure the alignment possibility between entities, and output the aligned entity list.
Further, the first step specifically comprises: a ResNet-152 model pre-trained on the recognition task of the ImageNet image database is adopted as the pre-trained deep image learning model; the features of each entity image are extracted from the first layer of the model; the similarity between the image pairs of the entities is measured by the Euclidean distance between the extracted features; finally, the top-k entity pairs with the highest similarity are selected as the seed alignment according to image similarity.
Further, the second step specifically includes the following substeps:
step (2.1): based on the textual descriptions and attribute information of the entities, embed the entities with a language pre-training model so that entities with similar textual descriptions are adjacent in the vector space;
step (2.2): based on the structural information of the knowledge graph, embed the knowledge graph with a graph convolutional network to enhance neighbor information;
step (2.3): based on the type information of the entities, embed them with the hierarchical type embedding model so that entities with similar types are adjacent in the vector space.
Further, the step (2.1) specifically comprises:
first, a training data set D = {(e, e+, e-)} is constructed based on the seed alignment, where e+ is the entity corresponding to (i.e., similar to) entity e, and e- is a randomly selected entity dissimilar to e;
then, the textual descriptions of the entities are imported into the language pre-training model BERT, which is fine-tuned for knowledge graph embedding;
finally, the CLS embedding of BERT is passed through a multi-layer perceptron to obtain the entity embedding, where CLS is the special classification token used by BERT;
the model is trained with a margin loss of the form L = max(0, d(e, e+) - d(e, e-) + m), where d(·, ·) is the distance between description embeddings, m denotes the margin enforced between a similar entity pair and a dissimilar entity pair, and a smaller d means the two entities are more similar in description embedding.
Further, the step (2.2) is specifically:
based on a graph convolutional network (GCN), entity embeddings aggregate neighbor entity information in an information propagation mode with the following rule:
H^(l+1) = σ( D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l) )
where Ã = A + I_N denotes the adjacency matrix with self-connections added, I_N is the identity matrix, D̃ is the diagonal degree matrix of Ã (D̃_ii = Σ_j Ã_ij), W^(l) denotes the layer-specific trainable weight matrix, σ denotes an activation function, and H^(l) ∈ R^(N×D) denotes the activation matrix at the l-th layer, where N denotes the number of entities and D denotes the dimension of an entity vector.
Further, the step (2.3) is specifically:
the distance between two entities in the type embedding space is calculated through a hierarchical-type-aware distance function, and the vector representation of each entity in the type embedding space is learned based on the hierarchical type embedding model HTE;
the hierarchical-type-aware distance function is parameterized by a distance parameter and by the type similarity s(t1, t2) between the types of the two entities; the similarity is the normalized information content of the nearest common parent type:
s(t1, t2) = max{ IC(t) : t ∈ S(t1, t2) }
where S(t1, t2) denotes the set of common parent classes of both t1 and t2, and IC(t) denotes the information-content value of type t; the more specific a type is, the higher its information-content value.
Further, the third step specifically includes the following substeps:
step (3.1): based on the entity text description embedding of step (2.1), calculate the cosine distance between two entity vectors and generate a one-dimensional similarity vector for each entity, i.e., the pairwise description similarity vector;
step (3.2): based on the entity neighbor information embedding of step (2.2), calculate the neighbor description similarity vector;
step (3.3): based on the entity attribute information embedding of step (2.1), calculate the attribute similarity vector;
step (3.4): based on the entity type embedding of step (2.3), calculate the cosine distance between two entity vectors and generate a one-dimensional similarity vector for each entity, i.e., the type similarity vector;
step (3.5): under the supervision of the seed alignment from step one, jointly learn the pairwise description, neighbor description, attribute, and type similarities to generate the final similarity between entity pairs, and find the corresponding aligned entities based on entity similarity;
step (3.6): adopt a greedy alignment strategy to select, for each entity, the candidate entity with the highest alignment probability, generate the aligned entity pairs, delete the aligned entities from the corresponding knowledge graphs, and repeat this step until the entities of one knowledge graph are exhausted.
Further, the step (3.2) is specifically: first, a neighbor interaction matrix is constructed whose (i, j) entry is the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of the candidate entity e'; second, max-pooling is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features of the collected similarities are extracted through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
Further, the step (3.3) is specifically: first, an attribute interaction matrix is constructed whose (i, j) entry is the cosine similarity between the i-th attribute of entity e and the j-th attribute of the candidate entity e'; second, max-pooling and the RBF kernel aggregation function are applied to each row and column of the attribute matrix; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
Further, the specific process of the step (3.5) is as follows: based on the entity similarity vector, the alignment possibility between two entities is learned through a multi-layer perceptron; the greater its value, the greater the probability that the two entities are aligned. The entity similarity vector is generated by fusing the vectors of steps (3.1) to (3.4) in a weighted concatenation mode.
Beneficial effects:
The invention aims to break through the constraint of manual labeling when fusing large-scale knowledge graphs. It provides a self-supervised joint learning method for entity alignment that fully and effectively utilizes the multi-modal information carried by knowledge graphs, combines it with a knowledge embedding model to fuse knowledge graphs automatically and efficiently, and performs entity alignment in a self-supervised joint-learning manner, requiring no manual intervention throughout and ensuring scalability.
Drawings
FIG. 1 is a schematic flow chart of the self-supervised joint learning method for knowledge graph entity alignment according to the present invention;
FIG. 2 is a schematic diagram of the self-supervised joint learning framework for knowledge graph entity alignment according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of part of the type hierarchy of the knowledge graph DBpedia according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a self-supervised joint learning system for knowledge graph entity alignment according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and technical effects of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples.
With the development of pre-trained language models, textual descriptions can play an important role in entity alignment. Furthermore, entities of completely different types are unlikely to be aligned, and the types in a knowledge graph are typically hierarchical, meaning that the similarity between types is measurable.
As shown in fig. 2, the entity alignment framework based on knowledge graph embedding mainly includes a seed alignment generation module, a knowledge graph embedding module, and an alignment interaction module. The seed alignment generation module collects initial alignment labels, i.e., it generates an initial set of entity pairs with the same meaning; the knowledge graph embedding module maps the high-dimensional knowledge graph to a low-dimensional space for computer processing, with the embedding process executed under the supervision of the seed alignment; the alignment interaction module measures the similarity between entity pairs based on the embeddings and generates the final alignment list.
As shown in fig. 1, the invention provides a self-supervised joint learning method for entity alignment. A seed alignment generation module automatically generates highly reliable entity pairs with the same semantics as the seed alignment, according to the entities' images; a knowledge graph embedding module maps high-dimensional graph data to a low-dimensional vector space that is easy for a computer to process, under the supervision of the automatically generated seed alignment; an alignment interaction module judges, according to the embeddings, whether a pair of entities from different knowledge graphs is the same. The alignment interaction module is an interaction model driven by an alignment strategy, which guides the order of the alignment process. The method specifically comprises the following steps:
step one, learning the image characteristics of an entity by using an image pre-training deep learning model, measuring the similarity of images by calculating Euclidean distances between the image characteristics, and selecting the entity pair with the highest similarity as a seed alignment according to the image similarity.
Specifically, in the seed alignment generation module, image features are learned with a pre-trained deep learning model to measure image similarity. The entity alignment self-supervised joint learning method Self-EA adopts a ResNet-152 model pre-trained on the recognition task of the ImageNet image database. ImageNet is one of the best-known image databases; it is organized according to the WordNet hierarchy and contains more than 14 million images. The ResNet-152 model is a deep residual learning network with a depth of 152 layers. A simple feed-forward pass is performed on the ResNet-152 model, and the output of the first layer is extracted as the features of each entity image; then, the similarity between the image pairs of the entities is measured by the Euclidean distance between the extracted features; finally, the top-k entity pairs with the highest similarity are selected as the seed alignment according to image similarity.
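The seed-alignment selection described above reduces to a nearest-pair search over image features. A minimal pure-Python sketch, with toy 3-dimensional features standing in for real ResNet-152 activations (all names and values are illustrative):

```python
import math

def euclidean(u, v):
    """Euclidean distance between two image-feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def seed_alignment(feats1, feats2, k):
    """Pick the top-k most similar cross-graph entity pairs:
    smallest feature distance = highest image similarity."""
    pairs = [(euclidean(f1, f2), n1, n2)
             for n1, f1 in feats1.items()
             for n2, f2 in feats2.items()]
    pairs.sort()
    return [(n1, n2) for _, n1, n2 in pairs[:k]]

# Toy "image features" for two knowledge graphs (hypothetical values).
kg1 = {"e1": [0.1, 0.9, 0.2], "e2": [0.8, 0.1, 0.4]}
kg2 = {"f1": [0.1, 0.8, 0.2], "f2": [0.9, 0.1, 0.5]}
print(seed_alignment(kg1, kg2, 2))  # → [('e1', 'f1'), ('e2', 'f2')]
```

In the actual method the feature vectors are the high-dimensional ResNet-152 outputs, and only the top-k most confident pairs are kept so that noisy image matches do not pollute the seed alignment.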
WordNet is an English lexical database based on cognitive linguistics, designed jointly by psychologists, linguists, and computer engineers at Princeton University. WordNet differs most significantly from other standard dictionaries in that it divides vocabulary into five major classes: nouns, verbs, adjectives, adverbs, and function words. In practice, WordNet contains only nouns, verbs, adjectives, and adverbs; function words are largely part of the syntactic component of the language, and WordNet ignores this smaller set of English function words. WordNet also differs from general dictionaries in its organizational structure: it is organized with the synonym set (synset) as the basic construction unit, and a user can find a suitable word within a synset to express a known concept.
Step two: in the knowledge graph embedding module, based on a graph convolutional network, a language pre-training model, and a hierarchical type embedding model, embed the knowledge graphs into a low-dimensional vector space that is easy for a computer to process, using the structural information, entity descriptions, attribute values, and entity type information of the knowledge graphs, while preserving the original information of the entities as much as possible. This step specifically comprises the following substeps:
step (2.1) based on the text description and attribute of the entity, the entity is embedded by using a language pre-training model, so that entities with similar text description are adjacent in a vector space, and the specific process is as follows:
first, a training data set is constructed based on seed alignment,Is an entityThe corresponding entities, i.e. similar entities,is randomly selected fromThe dissimilar entities of (a);
then, the text description of the entity is led into a language pre-training model BERT for fine adjustment of knowledge graph embedding;
finally, using the multilayer perceptron to filter CLS embedding of the language pre-training model BERT to obtain embedding of the entityCLS is a special class label used in model BERT;
wherein the content of the first and second substances,is obtained byAndin betweenThe distance is initialized and the distance is initialized,representing the margin employed between a similar entity pair and a dissimilar entity pair,smaller means that the two entities are more similar in describing embeddings.
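The margin objective above can be sketched in a few lines. Here d is a plain Euclidean distance over toy vectors; the actual method computes it over the MLP-filtered BERT CLS embeddings, so this is a sketch under that assumption, not the patent's exact loss:

```python
import math

def dist(u, v):
    """Distance between two description embeddings (Euclidean here)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def margin_loss(e, e_pos, e_neg, margin=1.0):
    """Hinge loss: pull entity e toward its aligned entity e_pos and push it
    at least `margin` farther from the random negative e_neg."""
    return max(0.0, dist(e, e_pos) - dist(e, e_neg) + margin)

# Loss vanishes once the positive is closer than the negative by the margin.
print(margin_loss([0.0, 0.0], [0.0, 0.1], [3.0, 4.0]))  # → 0.0
```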
Step (2.2): based on the structural information of the knowledge graph, embed the knowledge graph with a graph convolutional network to enhance neighbor information. The specific process is as follows:
based on a graph convolutional network (GCN), entity embeddings aggregate neighbor entity information in an information propagation mode with the following rule:
H^(l+1) = σ( D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l) )
where Ã = A + I_N denotes the adjacency matrix with self-connections added, I_N is the identity matrix, D̃ is the diagonal degree matrix of Ã (D̃_ii = Σ_j Ã_ij), W^(l) denotes the layer-specific trainable weight matrix, and σ denotes an activation function. In addition, H^(l) ∈ R^(N×D) denotes the activation matrix at the l-th layer, where N denotes the number of entities and D denotes the dimension of an entity vector.
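This propagation rule is the standard GCN layer; a minimal NumPy sketch with dense matrices and tanh as the activation σ (an illustration of the rule, not the patent's trained network):

```python
import numpy as np

def gcn_layer(A, H, W, act=np.tanh):
    """One GCN propagation step: H' = act(D^{-1/2} (A+I) D^{-1/2} H W),
    where A+I adds self-connections and D is its diagonal degree matrix."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return act(D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)

# Two connected entities with one-hot features: each entity's new embedding
# averages itself with its neighbor before the nonlinearity.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
out = gcn_layer(A, np.eye(2), np.eye(2))
```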
Step (2.3): based on the type information of the entities, embed them with the hierarchical type embedding model HTE so that entities with similar types are adjacent in the vector space. The specific process is as follows:
the distance between two entities in the type embedding space is calculated through a hierarchical-type-aware distance function, and the vector representation of each entity in the type embedding space is learned based on the hierarchical type embedding model HTE. The hierarchical-type-aware distance function is parameterized by a distance parameter and by the type similarity s(t1, t2) between the types of the two entities; this similarity is the normalized information content of the nearest common parent type:
s(t1, t2) = max{ IC(t) : t ∈ S(t1, t2) }
where S(t1, t2) denotes the set of common parent classes of both t1 and t2, and IC(t) denotes the information-content value of type t; the more specific a type is, the higher its information-content value.
As shown in fig. 3, which depicts part of the type hierarchy of the knowledge graph DBpedia, the relationship between types in the figure is subclass-of, and the numerical values represent information-content values: the more abstract a type, the lower its information-content value; conversely, the more specific a type, the higher its information-content value. For example, the nearest common parent of the sports league and the samba school is the organization, so their type similarity is the information content of the organization, 0.531; for a pair of types whose nearest common parent is only the broadest type owl:Thing (a generic "something"), the similarity is its information content, 0.
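A minimal sketch of this nearest-common-parent similarity, on a toy slice of the Fig. 3 hierarchy. Only Organisation's 0.531 is taken from the example above; the leaf IC values and type names are illustrative:

```python
def type_similarity(t1, t2, parents, ic):
    """Similarity = normalized information content of the nearest common
    parent type; abstract ancestors (owl:Thing, IC = 0) give low similarity."""
    def ancestors(t):
        seen = {t}
        while t in parents:       # walk up the subclass-of chain
            t = parents[t]
            seen.add(t)
        return seen
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common) if common else 0.0

# Toy slice of the DBpedia hierarchy (IC values illustrative except 0.531).
parents = {"SportsLeague": "Organisation", "SambaSchool": "Organisation",
           "Organisation": "owl:Thing"}
ic = {"owl:Thing": 0.0, "Organisation": 0.531,
      "SportsLeague": 0.8, "SambaSchool": 0.8}
print(type_similarity("SportsLeague", "SambaSchool", parents, ic))  # → 0.531
```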
Step three: in the alignment interaction module, based on the knowledge graph embeddings from step two, calculate the similarity vectors of the pairwise description interaction, neighbor description interaction, attribute interaction, and type interaction of the entities; learn the weight of each modality through modality fusion to generate the final entity embedding; then calculate the cosine distance based on the entity embeddings to measure the alignment possibility between entities, and output the aligned entity list. This specifically comprises the following substeps:
Step (3.1): based on the entity text description embedding of step (2.1), calculate the cosine distance between two entity vectors and generate a one-dimensional similarity vector for each entity, i.e., the pairwise description similarity vector.
Step (3.2): based on the entity neighbor information embedding of step (2.2), calculate the neighbor description similarity vector. The specific process is: first, a neighbor interaction matrix is constructed whose (i, j) entry is the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of the candidate entity e'; second, max-pooling is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features of the collected similarities are extracted through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
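A NumPy sketch of this neighbor interaction step; the RBF kernel centres and width below are illustrative placeholders, not the patent's settings:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def neighbor_similarity_vector(neigh1, neigh2, mus=(0.5, 1.0), sigma=0.1):
    """Build the cross-neighbor cosine matrix, max-pool rows and columns to
    keep each neighbor's best match, then aggregate the pooled scores with
    RBF kernels centred at `mus`; row and column parts are concatenated."""
    M = np.array([[cosine(a, b) for b in neigh2] for a in neigh1])
    row_max, col_max = M.max(axis=1), M.max(axis=0)
    def rbf(scores):
        return [float(np.exp(-(scores - mu) ** 2 / (2 * sigma ** 2)).sum())
                for mu in mus]
    return rbf(row_max) + rbf(col_max)
```

With orthogonal toy neighbors on both sides, every neighbor finds a perfect match, so the kernel centred at 1.0 collects all the pooled mass.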
Step (3.3): based on the entity attribute embedding from step (2.1), calculating the attribute similarity vector. The specific process is: first, an attribute similarity matrix is constructed whose entry (i, j) is the cosine similarity between the i-th attribute of entity e and the j-th attribute of entity e'; secondly, max-pooling and the RBF kernel aggregation function are applied to each row and column of the matrix; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
Step (3.4): based on the entity type embedding from step (2.3), calculating the cosine distance between the two entity vectors and generating a one-dimensional similarity vector for each entity pair, i.e., the type similarity vector.
Step (3.5): under the supervision of the seed alignment, jointly learning the pairwise description, neighbor description, attribute, and type similarities to generate the final similarity between entity pairs, and finding the corresponding aligned entities based on this similarity. The specific process is: based on the entity similarity vector s(e, e'), the alignment probability p(e, e') between the two entities is learned through a multi-layer perceptron; the larger the value of p(e, e'), the greater the probability that the two entities are aligned, where s(e, e') is generated by fusing the vectors of steps (3.1) to (3.4) through weighted concatenation.
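Step (3.5) can be sketched as below: the per-modality similarity vectors are fused by weighted concatenation and scored by a small multi-layer perceptron. The dimensions, fusion weights, and randomly initialized MLP parameters are placeholders; in the actual method they are learned under seed-alignment supervision:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def alignment_probability(sim_vectors, modality_weights, W1, b1, w2, b2):
    """Fuse per-modality similarity vectors by weighted concatenation and
    score the entity pair with a small multi-layer perceptron."""
    fused = np.concatenate([w * v for w, v in zip(modality_weights, sim_vectors)])
    h = np.tanh(fused @ W1 + b1)        # hidden layer
    return float(sigmoid(h @ w2 + b2))  # alignment probability in (0, 1)

# Hypothetical dimensions: description (1), neighbor (10), attribute (10), type (1).
sims = [rng.random(1), rng.random(10), rng.random(10), rng.random(1)]
weights = [0.25, 0.25, 0.25, 0.25]      # learned modality weights in the real model
d = sum(v.size for v in sims)           # 22
W1, b1 = rng.normal(size=(d, 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
p = alignment_probability(sims, weights, W1, b1, w2, b2)
```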
Step (3.6): adopting a greedy alignment strategy, selecting for each entity the candidate entity with the highest alignment probability to generate an aligned entity pair, deleting the aligned entities from their respective knowledge graphs, and repeating step (3.6) until one of the graphs has no entities left.
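The greedy alignment strategy of step (3.6), operating on a matrix of alignment probabilities, can be sketched as:

```python
import numpy as np

def greedy_alignment(sim):
    """Greedy one-to-one alignment: repeatedly pick the most probable
    remaining pair and remove both entities from their graphs."""
    sim = sim.astype(float).copy()
    pairs = []
    for _ in range(min(sim.shape)):
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        pairs.append((int(i), int(j)))
        sim[i, :] = -np.inf   # delete aligned entity from graph 1
        sim[:, j] = -np.inf   # delete aligned entity from graph 2
    return pairs

# Toy 3x3 alignment-probability matrix.
sim = np.array([[0.9, 0.2, 0.1],
                [0.3, 0.8, 0.4],
                [0.2, 0.5, 0.7]])
print(greedy_alignment(sim))  # [(0, 0), (1, 1), (2, 2)]
```

Because each selection removes a row and a column, the loop terminates after min(n, m) iterations, i.e., exactly when the smaller graph is exhausted.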
The invention makes full use of the information the knowledge graph already carries and performs entity alignment through self-supervised joint learning; the whole process requires no manual intervention, which ensures scalability. Experimental results on the cross-lingual entity-alignment benchmark DBP15K show that, in the Chinese-English, Japanese-English, and French-English knowledge-graph fusion tasks, the alignment accuracy (Hits@1) of the invention reaches 96.2%, 97.0%, and 99.4%, respectively. Compared with EVA, the latest self-supervised entity-alignment method, the alignment accuracy of the invention is improved by 23.5% on average, and it is even slightly higher (by 0.07%) than the latest supervised learning methods based on manual labeling. These results demonstrate the high accuracy and scalability of the invention.
The average image coverage of entities in the benchmark data set DBP15K is 71.3%, which provides good conditions for the invention's automatic, image-based collection of seed alignments. The invention also performs well when a knowledge graph contains only a few images: experiments show that with only 800 images in the Japanese-English setting the alignment accuracy reaches 93.4%, and with only 100 images in the French-English setting it reaches 97.8%. When the knowledge graph contains no images at all, the invention can still perform entity alignment by taking manually labeled seed alignments as input, again with excellent accuracy: with 800 manual seed alignments the accuracy is 96% in the Japanese-English setting, and with 100 manual seed alignments it is 97.8% in the French-English setting, which demonstrates the versatility of the invention.
Corresponding to the foregoing embodiments of the self-supervised joint learning method for knowledge-graph entity alignment, the invention also provides embodiments of a self-supervised joint learning apparatus for knowledge-graph entity alignment.
Referring to Fig. 4, an embodiment of the invention provides a self-supervised joint learning apparatus for knowledge-graph entity alignment, which includes one or more processors configured to implement the self-supervised joint learning method for knowledge-graph entity alignment of the foregoing embodiments.
The embodiment of the self-supervised joint learning apparatus for knowledge-graph entity alignment can be applied to any device with data-processing capability, such as a computer. The apparatus embodiments may be implemented by software, by hardware, or by a combination of the two. Taking a software implementation as an example, the apparatus, as a logical device, is formed by the processor of the device reading the corresponding computer program instructions from non-volatile memory into memory and running them. In terms of hardware, Fig. 4 shows a hardware structure diagram of a device with data-processing capability in which the apparatus of the invention is located; in addition to the processor, memory, network interface, and non-volatile memory shown in Fig. 4, the device may also include other hardware according to its actual functions, which is not described again here.
The specific details of the implementation process of the functions and actions of each unit in the above device are the implementation processes of the corresponding steps in the above method, and are not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
An embodiment of the present invention further provides a computer-readable storage medium, on which a program is stored, where the program, when executed by a processor, implements a self-supervised joint learning method oriented to knowledge-graph entity alignment in the foregoing embodiments.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or memory, of any device with data-processing capability described in the foregoing embodiments. It may also be an external storage device of that device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card, and it may include both the internal storage unit and an external storage device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way. Although the foregoing has described the practice of the present invention in detail, it will be apparent to those skilled in the art that modifications may be made to the practice of the invention as described in the foregoing examples, or that certain features may be substituted in the practice of the invention. All changes, equivalents and modifications which come within the spirit and scope of the invention are desired to be protected.
Claims (10)
1. A self-supervised joint learning method oriented to knowledge-graph entity alignment, characterized by comprising the following steps:
firstly, learning the image features of entities by using a pre-trained deep-learning image model, measuring image similarity by calculating the Euclidean distance between image features, and selecting the entity pairs with the highest similarity as the seed alignment according to the image similarity;
secondly, under the supervision of the seed alignment from the first step, embedding the knowledge graph into a low-dimensional vector space by using the structural information, entity descriptions, attribute values, and entity type information of the knowledge graph, based on a graph convolutional network, a pre-trained language model, and a hierarchical type embedding model;
thirdly, based on the knowledge-graph embedding of the second step, calculating similarity vectors for the pairwise description interaction, neighbor description interaction, attribute interaction, and type interaction of entities; learning the weight of each modality through modality fusion to generate the final entity embedding; finally, calculating the cosine distance based on the entity embedding to measure the alignment likelihood between entities, and outputting a list of aligned entities.
2. The method of claim 1, wherein the first step is specifically: adopting a ResNet-152 model pre-trained on the classification task of the ImageNet image database as the pre-trained deep-learning image model; extracting the features of each entity image from the last hidden layer of the model; measuring the similarity between the image pairs of entities by the Euclidean distance between the extracted features; and finally selecting the top-k entity pairs with the highest similarity as the seed alignment according to the image similarity.
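A sketch of this top-k seed selection, assuming the ResNet-152 image features have already been extracted (random vectors stand in for them here, with three rows of the second feature matrix deliberately made near-copies of rows 2, 0, and 4 of the first):

```python
import numpy as np

def select_seed_alignments(feats1, feats2, k):
    """Pick the top-k most similar cross-graph entity pairs by Euclidean
    distance between their (precomputed) image features."""
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    d2 = (np.sum(feats1 ** 2, axis=1)[:, None]
          + np.sum(feats2 ** 2, axis=1)[None, :]
          - 2.0 * feats1 @ feats2.T)
    flat = np.argsort(d2, axis=None)[:k]   # smallest distance = most similar
    return [tuple(map(int, np.unravel_index(f, d2.shape))) for f in flat]

rng = np.random.default_rng(1)
f1 = rng.normal(size=(5, 16))
# Entities 0, 1, 2 of graph 2 are near-copies of entities 2, 0, 4 of graph 1.
f2 = f1[[2, 0, 4]] + 0.01 * rng.normal(size=(3, 16))
seeds = select_seed_alignments(f1, f2, 3)
```

With small noise, the three planted pairs dominate the top-k, which is exactly the behavior the seed-alignment step relies on.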
3. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 2, wherein the second step specifically comprises the following sub-steps:
step (2.1): embedding the entities with a pre-trained language model based on their textual descriptions and attribute information, so that entities with similar textual descriptions are adjacent in the vector space;
step (2.2): based on the structural information of the knowledge graph, embedding the knowledge graph with a graph convolutional network to enhance neighbor information;
step (2.3): based on the type information of the entities, using a hierarchical type embedding model so that entities with similar types are adjacent in the vector space.
4. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 3, wherein the step (2.1) is specifically as follows:
first, a training data set of triples (e, e+, e-) is constructed based on the seed alignment, where e+ is the entity aligned with e, i.e., a similar entity, and e- is a randomly selected dissimilar entity;
then, the textual descriptions of the entities are fed into the pre-trained language model BERT, which is fine-tuned for knowledge-graph embedding;
finally, the CLS embedding produced by BERT is filtered through a multi-layer perceptron to obtain the embedding of the entity, CLS being the special classification token used in BERT;
wherein the training objective is computed from the distance between the embeddings of e and e+ and the distance between the embeddings of e and e-, a margin being imposed between similar and dissimilar entity pairs; the smaller the distance, the more similar the description embeddings of the two entities.
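The margin-based objective described in this claim resembles a standard triplet loss; a minimal sketch under that assumption (the 2-dimensional embeddings and margin value are illustrative placeholders):

```python
import numpy as np

def triplet_margin_loss(e, e_pos, e_neg, margin=1.0):
    """Hinge loss pushing the anchor closer to the aligned (positive) entity
    than to the random negative by at least `margin`."""
    d_pos = np.linalg.norm(e - e_pos)
    d_neg = np.linalg.norm(e - e_neg)
    return float(max(0.0, d_pos - d_neg + margin))

anchor = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1])    # aligned entity: small distance to anchor
neg = np.array([-1.0, 2.0])   # random dissimilar entity: large distance
loss = triplet_margin_loss(anchor, pos, neg)  # margin already satisfied -> 0.0
```

When the positive is already closer than the negative by more than the margin, the loss is zero; swapping the roles of the two produces a positive loss that fine-tuning would reduce.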
5. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 4, wherein the step (2.2) is specifically as follows:
based on a graph convolutional network (GCN), entity embeddings aggregate neighbor entity information through information propagation, which follows the rule:
H^(l+1) = sigma( D~^(-1/2) A~ D~^(-1/2) H^(l) W^(l) )
wherein A~ = A + I_N denotes the adjacency matrix with self-connections added, I_N is the identity matrix, D~ is the degree matrix of A~, W^(l) denotes the layer-specific trainable weight matrix, sigma denotes the activation function, and H^(l) in R^(N x D) denotes the activation matrix at the l-th layer, where N is the number of entities and D is the dimension of the entity vector.
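The propagation rule as reconstructed above is the standard GCN layer; a minimal NumPy sketch on a toy 3-entity chain graph (the random features and weights are placeholders):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step:
    H^(l+1) = relu(D~^(-1/2) (A + I) D~^(-1/2) H^(l) W^(l))."""
    A_tilde = A + np.eye(A.shape[0])      # add self-connections
    d = A_tilde.sum(axis=1)               # degrees of A~
    D_inv_sqrt = np.diag(d ** -0.5)       # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)    # chain: entity 0 - 1 - 2
rng = np.random.default_rng(7)
H0 = rng.normal(size=(3, 4))              # N = 3 entities, D = 4 dimensions
W0 = rng.normal(size=(4, 4))              # layer-specific trainable weights
H1 = gcn_layer(A, H0, W0)
```

Each entity's new embedding mixes its own features with its neighbors', degree-normalized so that high-degree entities do not dominate.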
6. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 5, wherein the step (2.3) is specifically as follows:
calculating the distance between two entities in the type embedding space through a hierarchical type-aware distance function, and learning the vector representations of entities in the type embedding space based on the hierarchical type embedding model (HTE);
the hierarchical type-aware distance function decreases as the type similarity between the two entities increases, controlled by a distance parameter; the type similarity is the normalized value of sim(t1, t2), which is computed through the information content of the nearest common parent type of the two entities' types:
sim(t1, t2) = IC( NCP(t1, t2) )
where IC denotes information content and NCP denotes the nearest common parent type.
7. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 6, wherein the third step specifically comprises the following sub-steps:
step (3.1) based on the entity text description embedding of step (2.1), calculating the cosine distance between two entity vectors, and generating a one-dimensional similarity vector for each entity, namely calculating to obtain a pair description similarity vector;
step (3.2) based on the entity neighbor information embedding of step (2.2), calculating a neighbor description similarity vector;
step (3.3) calculating an attribute similarity vector based on the entity attribute information embedding in step (2.1);
step (3.4) calculating the cosine distance between two entity vectors based on the entity type embedding in step (2.3), and generating a one-dimensional similarity vector for each entity, namely calculating to obtain a type similarity vector;
step (3.5): under the supervision of the seed alignment obtained in the first step, jointly learning the pairwise description, neighbor description, attribute, and type similarities to generate the final similarity between entity pairs, and finding the corresponding aligned entities based on the entity similarity;
step (3.6): adopting a greedy alignment strategy, selecting for each entity the candidate entity with the highest alignment probability to generate an aligned entity pair, deleting the aligned entities from their respective knowledge graphs, and repeating step (3.6) until one of the knowledge graphs has no entities left.
8. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 7, wherein the step (3.2) is specifically: first, a neighbor similarity matrix is constructed whose entry (i, j) is the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of entity e'; secondly, a max-pooling technique is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features of the collected similarities are extracted through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
9. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 7, wherein the step (3.3) is specifically: first, an attribute similarity matrix is constructed whose entry (i, j) is the cosine similarity between the i-th attribute of entity e and the j-th attribute of entity e'; secondly, max-pooling and the RBF kernel aggregation function are applied to each row and column of the attribute matrix; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
10. The method for self-supervised joint learning oriented to knowledge-graph entity alignment as claimed in claim 7, wherein the specific process of step (3.5) is: based on the entity similarity vector s(e, e'), the alignment probability p(e, e') between the two entities is learned through a multi-layer perceptron; the larger the value of p(e, e'), the greater the probability that the two entities are aligned, where s(e, e') is generated by fusing the vectors of steps (3.1) to (3.4) through weighted concatenation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211098589.4A CN115168620A (en) | 2022-09-09 | 2022-09-09 | Self-supervision joint learning method oriented to knowledge graph entity alignment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115168620A true CN115168620A (en) | 2022-10-11 |
Family
ID=83482423
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211098589.4A Pending CN115168620A (en) | 2022-09-09 | 2022-09-09 | Self-supervision joint learning method oriented to knowledge graph entity alignment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115168620A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116069956A (en) * | 2023-03-29 | 2023-05-05 | 之江实验室 | Drug knowledge graph entity alignment method and device based on mixed attention mechanism |
CN116431835A (en) * | 2023-06-06 | 2023-07-14 | 中汽数据(天津)有限公司 | Automatic knowledge graph construction method, equipment and medium in automobile authentication field |
CN116561346A (en) * | 2023-07-06 | 2023-08-08 | 北京邮电大学 | Entity alignment method and device based on graph convolution network and information fusion |
CN117407689A (en) * | 2023-12-14 | 2024-01-16 | 之江实验室 | Entity alignment-oriented active learning method and device and electronic device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111753101A (en) * | 2020-06-30 | 2020-10-09 | 华侨大学 | Knowledge graph representation learning method integrating entity description and type |
CN112200317A (en) * | 2020-09-28 | 2021-01-08 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Multi-modal knowledge graph construction method |
CN112800770A (en) * | 2021-04-15 | 2021-05-14 | 南京樯图数据研究院有限公司 | Entity alignment method based on heteromorphic graph attention network |
CN112948597A (en) * | 2021-01-06 | 2021-06-11 | 中国人民解放军国防科技大学 | Unsupervised knowledge graph entity alignment method and unsupervised knowledge graph entity alignment equipment |
CN113656596A (en) * | 2021-08-18 | 2021-11-16 | 中国人民解放军国防科技大学 | Multi-modal entity alignment method based on triple screening fusion |
CN114647715A (en) * | 2022-04-07 | 2022-06-21 | 杭州电子科技大学 | Entity recognition method based on pre-training language model |
CN114969367A (en) * | 2022-05-30 | 2022-08-30 | 大连民族大学 | Cross-language entity alignment method based on multi-aspect subtask interaction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11182562B2 (en) | Deep embedding for natural language content based on semantic dependencies | |
CN115168620A (en) | Self-supervision joint learning method oriented to knowledge graph entity alignment | |
US10394956B2 (en) | Methods, devices, and systems for constructing intelligent knowledge base | |
US8793208B2 (en) | Identifying common data objects representing solutions to a problem in different disciplines | |
Galitsky | Machine learning of syntactic parse trees for search and classification of text | |
Abebe et al. | Generic metadata representation framework for social-based event detection, description, and linkage | |
CN111967242A (en) | Text information extraction method, device and equipment | |
CN111737560B (en) | Content search method, field prediction model training method, device and storage medium | |
Overell | Geographic information retrieval: Classification, disambiguation and modelling | |
Zhang et al. | Annotating needles in the haystack without looking: Product information extraction from emails | |
WO2024015323A1 (en) | Methods and systems for improved document processing and information retrieval | |
US20210374488A1 (en) | Systems and methods for a k-nearest neighbor based mechanism of natural language processing models | |
CN117332852A (en) | Knowledge graph-based large model training deployment method and system | |
Anuyah et al. | Using structured knowledge and traditional word embeddings to generate concept representations in the educational domain | |
Fauzi et al. | Image understanding and the web: a state-of-the-art review | |
Wang et al. | Inductive zero-shot image annotation via embedding graph | |
Zhang et al. | Multilabel Image Annotation Based on Double‐Layer PLSA Model | |
Galitsky et al. | Assuring Chatbot Relevance at Syntactic Level | |
CN113392312A (en) | Information processing method and system and electronic equipment | |
Dasgupta et al. | A Survey of Numerous Text Similarity Approach | |
CN116662579B (en) | Data processing method, device, computer and storage medium | |
US20220350814A1 (en) | Intelligent data extraction | |
Miller et al. | Digging into Human Rights Violations: phrase mining and trigram visualization | |
Ghaemmaghami et al. | Integrated-Block: A New Combination Model to Improve Web Page Segmentation | |
Cheng et al. | Retrieving Articles and Image Labeling Based on Relevance of Keywords |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20221011 |