CN113111657B

CN113111657B - Cross-language knowledge graph alignment and fusion method, device and storage medium

Info

Publication number: CN113111657B
Application number: CN202110241500.4A
Authority: CN
Inventors: 俞山青; 张建林; 甘燃; 童天航; 傅晨波; 宣琦
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2021-03-04
Filing date: 2021-03-04
Publication date: 2024-05-03
Anticipated expiration: 2041-03-04
Also published as: CN113111657A

Abstract

The invention discloses a cross-language knowledge graph alignment and fusion method, a device and a storage medium. The method comprises the following steps: S1, constructing a first-order sub-graph feature matrix of the entities in the knowledge graph networks of two different languages; S2, inputting the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities into an alignment model, obtaining three embedded vector matrices for the structural features, the attribute features and the sub-graph features of all entities in the two knowledge graph networks, and concatenating the three embedded vector matrices; S3, calculating the similarity of the embedded vectors between the entity to be aligned and each entity in the target knowledge graph network; S4, sorting the entities in the target knowledge graph network by similarity to obtain candidate equivalent entities of the entity to be aligned; S5, fusing the knowledge graph networks of the two different languages according to the candidate equivalent entities. The method can effectively realize the alignment and fusion of cross-language knowledge graphs.

Description

Cross-language knowledge graph alignment and fusion method, device and storage medium

Technical Field

The present invention relates to the field of alignment and fusion technologies of cross-language knowledge graphs, and in particular, to a method, an apparatus, and a storage medium for alignment and fusion of cross-language knowledge graphs.

Background

Knowledge graphs (KGs), which aim to organize human knowledge in a structured form, are playing an increasingly important role as infrastructure in the fields of artificial intelligence and natural language processing. A knowledge graph is a collection of knowledge facts, typically represented as triples (head entity, relationship, tail entity).

Currently, most knowledge graphs are built from data sources in a single language, with very few globally built exceptions such as the Google Knowledge Graph. A knowledge graph described in a single language often serves only the users of that language, which leaves a large gap in the fusion of global knowledge. In the era of big data, how to align and fuse cross-language knowledge graphs so as to share knowledge globally, that is, multi-language knowledge graph fusion, is a problem that must be solved for knowledge graphs to develop further.

Research on cross-language knowledge graphs is still at an early stage, and the task of aligning and fusing knowledge graphs already constructed in different languages remains to be completed. The knowledge bases used to construct knowledge graphs in different languages are not fixed, and because users of different languages have different language environments and background knowledge, these knowledge bases may differ considerably in coverage, granularity and the way the same knowledge is described.

Alignment and fusion of cross-language knowledge graphs helps to link and fuse knowledge graphs that carry the particular knowledge of different countries and peoples worldwide, and enables barrier-free cross-language information retrieval, natural language processing and the like. For example, in medicine, aligning and fusing a domestic knowledge graph of traditional Chinese medicine with a foreign knowledge graph of Western medicine makes it possible to construct a knowledge graph that combines traditional Chinese and Western medicine, providing more comprehensive and effective medical knowledge for doctors and patients. In search engines, by using aligned and fused multilingual knowledge graphs, a user can switch freely between languages to obtain knowledge that previously could only be obtained through a single language, and can obtain knowledge that is broader in scope and available in more language versions than before.

The knowledge graph also contains rich network structures, and entities in the knowledge graph can be regarded as nodes in the network, and the relationships can be regarded as edges in the network. Subgraphs are the fundamental constituent elements in a network, so studying the substructure of a network is an effective way to analyze a network. Recently, graph embedding algorithms such as Word2vec and deep are widely applied to tasks such as node classification. But the embedded vectors obtained by such models only contain local structure information around the nodes, and global structure information of the whole network is ignored. The construction of the sub-graph network can supplement the structural characteristics of the original network so as to better perform downstream tasks such as node classification, network classification and the like.

Disclosure of Invention

The invention aims to provide a cross-language knowledge graph alignment and fusion method, device and storage medium, which are used for solving the technical problem of insufficient embedded vector expression capability caused by using only structural information in a cross-language knowledge graph entity alignment method in the prior art and can effectively realize the alignment and fusion of the cross-language knowledge graph.

In order to achieve the above object, the present invention provides the following solutions: the invention discloses a cross-language knowledge graph alignment and fusion method, which comprises the following steps:

Step S1, acquiring two knowledge graph networks with different languages, respectively establishing a first-order sub-graph network according to the two knowledge graph networks with different languages, extracting first-order sub-graph features of the entity based on the first-order sub-graph network, and constructing a first-order sub-graph feature matrix of the entity in the knowledge graph networks with different languages;

Step S2, constructing an alignment model based on a graph convolutional network (GCN), respectively acquiring the structural feature matrices and attribute feature matrices of the entities in the two knowledge graph networks of different languages, inputting the structural feature matrices, the attribute feature matrices and the first-order sub-graph feature matrices of the entities into the trained alignment model, obtaining the structural feature embedded vector matrices, attribute feature embedded vector matrices and sub-graph feature embedded vector matrices of all entities in the two knowledge graph networks, and concatenating the three embedded vector matrices;

Step S3, respectively taking the two knowledge graph networks of different languages as a knowledge graph network to be aligned and a target knowledge graph network, selecting an entity to be aligned from the knowledge graph network to be aligned, and calculating the similarity of the embedded vectors between the entity to be aligned and each entity in the target knowledge graph network according to a scoring function;

S4, sorting the entities in the target knowledge-graph network according to the similarity of the embedded vectors of each entity and the entity to be aligned in the target knowledge-graph network, and obtaining candidate equivalent entities of the entity to be aligned;

Step S5, fusing the knowledge graph networks of the two different languages according to the candidate equivalent entities.

Preferably, in the step S1, the extracting of the first-order sub-graph feature specifically includes:

Step S1.1, detecting sub-graphs from the network structure of the knowledge graph;

Step S1.2, constructing a sub-graph network based on the detected sub-graphs;

Step S1.3, encoding the nodes established in the sub-graph network based on a one-hot encoding method to obtain the first-order sub-graph features of the entity.

Preferably, in the step S1.1, the subgraph detected from the network structure of the original knowledge graph is a line.

Preferably, in the step S1.2, the method for constructing the sub-graph network based on the detected sub-graph includes: traversing all sub-graphs, judging whether the two sub-graphs share the same node or link in the knowledge graph network structure, if so, creating a connecting edge between the two sub-graphs to obtain a sub-graph network.

Preferably, in the step S1.3, the method for encoding the node established in the sub-graph network is as follows: judging whether the entity in the knowledge graph network belongs to a certain sub-graph in the node set of the sub-graph network, if yes, marking the coding position of the corresponding sub-graph as 1, and if not, marking the coding position of the corresponding sub-graph as 0.

Preferably, in the step S2, the method for obtaining the splicing result of the three embedded vector matrices of the structural feature, the attribute feature, and the sub-graph feature of the entity is shown in formula 1:

$$\left[H_s^{(l+1)};\, H_a^{(l+1)};\, H_{sgn}^{(l+1)}\right] = \sigma\left(\hat{D}^{-\frac{1}{2}} \hat{P} \hat{D}^{-\frac{1}{2}} \left[H_s^{(l)} W_s^{(l)};\, H_a^{(l)} W_a^{(l)};\, H_{sgn}^{(l)} W_{sgn}^{(l)}\right]\right) \quad (1)$$

where $H_s^{(l+1)}$, $H_a^{(l+1)}$ and $H_{sgn}^{(l+1)}$ are respectively the structural feature embedded vector matrix, the attribute feature embedded vector matrix and the sub-graph feature embedded vector matrix of the knowledge graph network; $P$ is an adjacency matrix of size $n \times n$, and $n$ is the number of entities in the knowledge graph network; $\hat{P} = P + I$, where $I$ is an identity matrix; $\hat{D}$ is the diagonal node degree matrix of $\hat{P}$; $H_s^{(l)}$, $H_a^{(l)}$ and $H_{sgn}^{(l)}$ are respectively the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities input to the $l$-th layer of the alignment model; $W_s^{(l)}$, $W_a^{(l)}$ and $W_{sgn}^{(l)}$ are respectively the weight matrices of the structural features, the attribute features and the sub-graph features in the $l$-th layer of the alignment model; $[\,\cdot\,;\,\cdot\,]$ denotes the concatenation of matrices; $\sigma$ is a nonlinear activation function.

Preferably, in the step S3, the similarity of the embedded vectors between different entities is calculated according to a scoring function, as shown in equation 5:

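$$D(e_i, e_j) = \alpha \frac{f\left(h_s(e_i), h_s(e_j)\right)}{d_s} + \beta \frac{f\left(h_a(e_i), h_a(e_j)\right)}{d_a} + \gamma \frac{f\left(h_{sgn}(e_i), h_{sgn}(e_j)\right)}{d_{sgn}} \quad (5)$$

where $D(e_i, e_j)$ represents the similarity between entities $e_i$ and $e_j$; $f(x, y) = \|x - y\|_1$; $h_s(\cdot)$, $h_a(\cdot)$ and $h_{sgn}(\cdot)$ are the embedded vectors of the structural feature, the attribute feature and the sub-graph feature of an entity; $d_s$, $d_a$ and $d_{sgn}$ are the dimensions of these three embedded vectors respectively; $\alpha$, $\beta$ and $\gamma$ are hyperparameters with $\alpha + \beta + \gamma = 1$.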

Preferably, in the step S5, the method for fusing the knowledge graph networks of two different languages according to the candidate equivalent entity includes:

And combining the entities and the relations in the knowledge-graph of two different languages according to the equivalent entities of the unaligned entities in the target knowledge-graph network to realize the fusion of the knowledge-graph networks of the two different languages.

The invention also provides a cross-language knowledge graph alignment and fusion device, which comprises a sub-graph feature extraction module, an entity alignment module and a knowledge graph fusion module which are connected in sequence;

The sub-graph feature extraction module is used for respectively converting two knowledge graph networks with different languages into a first-order sub-graph network, extracting first-order sub-graph features of the entity based on the first-order sub-graph network, and constructing a first-order sub-graph feature matrix of the entity in the knowledge graph networks with different languages;

The entity alignment module is used for constructing an alignment model based on GCN, respectively inputting the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities in the two knowledge graph networks of different languages into the trained alignment model to obtain embedded vector representations of the entities in the two knowledge graph networks, calculating the similarity of the embedded vectors between the entity to be aligned and all entities in the target knowledge graph network, and obtaining candidate equivalent entities of the entity to be aligned;

The knowledge graph fusion module is used for fusing the knowledge graph networks of the two different languages according to the candidate equivalent entities.

The invention also provides a storage medium for storing a program, wherein the program is used for realizing the cross-language knowledge graph alignment and fusion method.

The invention discloses the following technical effects:

The invention discloses a cross-language knowledge graph alignment and fusion method, a device and a storage medium, wherein the method comprises the following steps: converting the original knowledge graph into a first-order sub-graph network and extracting the first-order sub-graph features of the entities; combining these with the structural features and attribute features of the original knowledge graph as the input of the graph convolutional network; selecting suitable weights to concatenate the three embedded vectors; calculating the similarity between entity embedded vectors according to the scoring function; sorting the entities in the graph by similarity to obtain candidate equivalent entities; and fusing the knowledge graphs according to the newly discovered equivalent entities. Because the structures of cross-language knowledge graphs are very similar, the sub-graph features of the knowledge graph entities can be used to effectively enhance the structural features of the original graph and improve the representation capability of the entity vectors. A new cross-language knowledge graph entity alignment model is thereby constructed, through which knowledge graph fusion can be carried out effectively.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a cross-language knowledge graph alignment and fusion method of the present invention;

FIG. 2 is a flowchart of a method for extracting first-order sub-graph features according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the cross-language knowledge graph alignment and fusion device of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

Referring to fig. 1-2, the present embodiment provides a cross-language knowledge graph alignment and fusion method, which includes the following steps:

The knowledge graph network is a complex network in which each entity corresponds to a node and each relationship corresponds to an edge. The extraction of the first-order sub-graph features specifically comprises the following steps:

Step S1.1, detecting sub-graphs from the knowledge graph network, with the most basic sub-graphs (namely lines, i.e., single edges) selected as the sub-graphs;

S1.2, constructing a sub-graph network based on the detected sub-graph;

Specifically: after enough sub-graphs have been extracted from the knowledge graph network, a sub-graph network is constructed among the selected sub-graphs according to the following rule: traverse all sub-graphs and judge whether any two sub-graphs share the same node or link in the original knowledge graph network structure; if so, create a connecting edge between the two sub-graphs. This yields the sub-graph network, whose node set is G = {g_1, g_2, ..., g_m}.
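The construction rule in step S1.2 can be illustrated with a minimal Python sketch. It assumes that the sub-graphs are single undirected edges ("lines"), as selected in step S1.1, and that two sub-graphs are connected when they share an endpoint. The function name `build_first_order_sgn` and the toy edge list are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def build_first_order_sgn(edges):
    """Build a first-order sub-graph network from an undirected edge list.

    Each distinct edge (a "line") becomes one sub-graph g_k; two sub-graphs
    are connected when their underlying edges share a node.
    """
    # Deduplicate edges, ignoring direction and dropping self-loops.
    subgraphs = sorted({tuple(sorted(e)) for e in edges if e[0] != e[1]})
    sgn_edges = set()
    # Traverse all pairs of sub-graphs (quadratic in the number of edges).
    for (i, g1), (j, g2) in combinations(enumerate(subgraphs), 2):
        if set(g1) & set(g2):  # the two lines share a node
            sgn_edges.add((i, j))
    return subgraphs, sorted(sgn_edges)

# Toy graph: the path a-b-c-d.
subgraphs, sgn_edges = build_first_order_sgn([("a", "b"), ("b", "c"), ("c", "d")])
print(subgraphs)   # node set G of the sub-graph network
print(sgn_edges)   # connecting edges, as index pairs into G
```

The pairwise traversal mirrors the wording of the step; for a large knowledge graph an index from node to incident edges would avoid comparing every pair.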

Step S1.3, encoding the nodes established in the sub-graph network based on a one-hot encoding method to obtain the first-order sub-graph features of each entity. Specifically: the newly established nodes in the sub-graph network are regarded as the encoding objects of the one-hot encoding; for each entity in the original knowledge graph network, judge whether it belongs to a given sub-graph in the node set G = {g_1, g_2, ..., g_m} of the sub-graph network; if yes, the encoding position of the corresponding sub-graph is marked as 1, otherwise it is marked as 0. This completes the extraction of the first-order sub-graph features.
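The membership encoding of step S1.3 can be sketched as follows. The sketch assumes each sub-graph is stored as a tuple of the entities it contains. The helper name `subgraph_feature_matrix` and the sample data are hypothetical.

```python
def subgraph_feature_matrix(entities, subgraphs):
    """Return one row per entity and one column per sub-graph g_k.

    Entry (i, k) is 1 if entities[i] belongs to sub-graph g_k, else 0.
    """
    return [[1 if ent in g else 0 for g in subgraphs] for ent in entities]

entities = ["a", "b", "c", "d"]
subgraphs = [("a", "b"), ("b", "c"), ("c", "d")]  # lines detected in step S1.1
H_sgn = subgraph_feature_matrix(entities, subgraphs)
for ent, row in zip(entities, H_sgn):
    print(ent, row)
```

An entity that lies on several lines gets several 1 entries, so each row marks every sub-graph the entity belongs to.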

Step S2, constructing an alignment model based on GCN (Graph Convolutional Network), training the alignment model, respectively acquiring the structural feature matrices and attribute feature matrices of the entities in the knowledge graph networks of the two different languages, inputting the structural feature matrices and attribute feature matrices of the entities together with the first-order sub-graph feature matrices extracted in step S1 into the trained alignment model, obtaining the structural feature embedded vector matrices, attribute feature embedded vector matrices and sub-graph feature embedded vector matrices of all entities in the two knowledge graph networks, and selecting suitable weights to concatenate the three embedded vector matrices;

The training method of the alignment model is as follows:

Pre-aligned entity pairs are obtained from the knowledge graph networks of the two different languages to form a sample set, and the alignment model is trained on this sample set. Specifically, objective functions L_s, L_a and L_sgn are constructed for the structural feature embedded vectors, the attribute feature embedded vectors and the sub-graph feature embedded vectors of the entities respectively. L_s, L_a and L_sgn are mutually independent, and each is optimized by stochastic gradient descent, so that after training the embedded vectors of equivalent entities are as close as possible in the vector space and the embedded vectors of non-equivalent entities are as far apart as possible.

The objective functions L_s, L_a and L_sgn of the structural feature embedded vectors, the attribute feature embedded vectors and the sub-graph feature embedded vectors of the entities are shown in equations (2) to (4):

$$L_s = \sum_{(e,v) \in S} \sum_{(e',v') \in S'} \left[ f\left(h_s(e), h_s(v)\right) + \gamma_s - f\left(h_s(e'), h_s(v')\right) \right]_+ \quad (2)$$

$$L_a = \sum_{(e,v) \in S} \sum_{(e',v') \in S'} \left[ f\left(h_a(e), h_a(v)\right) + \gamma_a - f\left(h_a(e'), h_a(v')\right) \right]_+ \quad (3)$$

$$L_{sgn} = \sum_{(e,v) \in S} \sum_{(e',v') \in S'} \left[ f\left(h_{sgn}(e), h_{sgn}(v)\right) + \gamma_{sgn} - f\left(h_{sgn}(e'), h_{sgn}(v')\right) \right]_+ \quad (4)$$

where $[\,\cdot\,]_+$ is the positive-part function, $[x]_+ = \max\{0, x\}$; $f(x, y) = \|x - y\|_1$; $S$ is the sample set formed by the aligned entity pairs; $S'$ is the negative sample set, obtained by randomly replacing one entity in a pre-aligned entity pair $(e, v)$, where $e$ and $v$ are entities in the two knowledge graph networks of different languages respectively and the replacement entity is randomly selected from the two knowledge graph networks; $h_s(e)$, $h_a(e)$ and $h_{sgn}(e)$ are the embedded vectors of the structural feature, the attribute feature and the sub-graph feature of entity $e$ respectively; $h_s(v)$, $h_a(v)$ and $h_{sgn}(v)$ are the corresponding embedded vectors of entity $v$; $\gamma_s, \gamma_a, \gamma_{sgn} > 0$ are margin hyperparameters that control the separation between positive and negative aligned entity pairs.
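A margin-based objective of the kind defined by these symbols ([x]₊ = max{0, x}, f as the L1 distance, and a positive margin) can be sketched in plain Python. The exact summation over negative samples is an assumption here: each positive pair is compared with its own list of corrupted pairs. The names `margin_loss` and `l1`, and the toy embeddings, are hypothetical.

```python
def l1(x, y):
    """f(x, y) = ||x - y||_1."""
    return sum(abs(a - b) for a, b in zip(x, y))

def margin_loss(emb, positives, negatives, margin):
    """Sum of [f(pos) + margin - f(neg)]_+ over positive/negative pairs.

    emb:       dict mapping entity name -> embedded vector
    positives: list of pre-aligned pairs (e, v)
    negatives: dict mapping each positive pair -> list of corrupted pairs
    """
    total = 0.0
    for e, v in positives:
        pos_dist = l1(emb[e], emb[v])
        for e_neg, v_neg in negatives[(e, v)]:
            neg_dist = l1(emb[e_neg], emb[v_neg])
            total += max(0.0, pos_dist + margin - neg_dist)
    return total

emb = {"e1": [0.0, 0.0], "v1": [1.0, 0.0], "v2": [5.0, 0.0], "e2": [0.0, 1.0]}
positives = [("e1", "v1")]
# Corrupted pairs: one side of the positive pair replaced by another entity.
negatives = {("e1", "v1"): [("e1", "v2"), ("e2", "v1")]}
loss = margin_loss(emb, positives, negatives, margin=3.0)
print(loss)
```

The same function would be evaluated once per feature type (structure, attribute, sub-graph), each with its own embeddings and margin, since the three objectives are independent. The loss is zero for a negative pair that is already farther apart than the positive pair by at least the margin.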

The attribute feature matrix is obtained as follows: the 2000 most frequently occurring attributes across all attribute triples are selected for encoding.
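One way to realize this selection is sketched below. The encoding is an assumption: a binary indicator marks whether an entity has a given attribute, whereas the text only states that the most frequent attributes are chosen. The function name `attribute_feature_matrix`, the `top_k` parameter and the sample triples are hypothetical.

```python
from collections import Counter

def attribute_feature_matrix(attr_triples, entities, top_k=2000):
    """Encode entities over the top_k most frequent attributes.

    attr_triples: iterable of (entity, attribute, attribute value)
    Returns (vocab, H_a) where H_a[i][k] is 1 if entities[i] has vocab[k].
    """
    counts = Counter(attr for _, attr, _ in attr_triples)
    vocab = [attr for attr, _ in counts.most_common(top_k)]
    col = {attr: k for k, attr in enumerate(vocab)}
    row = {ent: i for i, ent in enumerate(entities)}
    H_a = [[0] * len(vocab) for _ in entities]
    for ent, attr, _ in attr_triples:
        if ent in row and attr in col:  # skip rare attributes
            H_a[row[ent]][col[attr]] = 1
    return vocab, H_a

triples = [
    ("diabetes", "department", "internal medicine"),
    ("hypoglycemia", "department", "internal medicine"),
    ("gastritis", "department", "gastroenterology"),
    ("diabetes", "symptom", "thirst"),
    ("gastritis", "symptom", "nausea"),
    ("hypoglycemia", "onset", "acute"),
]
vocab, H_a = attribute_feature_matrix(
    triples, ["diabetes", "hypoglycemia", "gastritis"], top_k=2
)
print(vocab)
print(H_a)
```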

The method for obtaining the splicing results of the three embedded vector matrixes of the structural features, the attribute features and the sub-image features of the entity is shown in the formula (1):

$$\left[H_s^{(l+1)};\, H_a^{(l+1)};\, H_{sgn}^{(l+1)}\right] = \sigma\left(\hat{D}^{-\frac{1}{2}} \hat{P} \hat{D}^{-\frac{1}{2}} \left[H_s^{(l)} W_s^{(l)};\, H_a^{(l)} W_a^{(l)};\, H_{sgn}^{(l)} W_{sgn}^{(l)}\right]\right) \quad (1)$$

where $H_s^{(l+1)}$, $H_a^{(l+1)}$ and $H_{sgn}^{(l+1)}$ are respectively the structural feature embedded vector matrix, the attribute feature embedded vector matrix and the sub-graph feature embedded vector matrix of the knowledge graph network; $P$ is an adjacency matrix of size $n \times n$, and $n$ is the number of entities in the knowledge graph network; $\hat{P} = P + I$, where $I$ is an identity matrix; $\hat{D}$ is the diagonal node degree matrix of $\hat{P}$; $H_s^{(l)}$, $H_a^{(l)}$ and $H_{sgn}^{(l)}$ are respectively the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities input to the $l$-th layer of the alignment model; $W_s^{(l)}$, $W_a^{(l)}$ and $W_{sgn}^{(l)}$ are respectively the weight matrices of the structural features, the attribute features and the sub-graph features in the $l$-th layer of the alignment model; $[\,\cdot\,;\,\cdot\,]$ denotes the concatenation of matrices; $\sigma$ is a nonlinear activation function, such as ReLU.
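A single layer of such a model can be sketched in plain Python. The sketch assumes the standard GCN propagation rule with self-loops and symmetric degree normalization, which is consistent with the symbols defined above (adjacency matrix, identity matrix, diagonal degree matrix, per-feature weight matrices, concatenation and a ReLU-like activation). The weight values are supplied by the caller; training is not shown. The helper names `matmul`, `normalized_adjacency` and `gcn_layer` are hypothetical.

```python
import math

def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalized_adjacency(P):
    """Return D^-1/2 (P + I) D^-1/2, with D the degree matrix of P + I."""
    n = len(P)
    P_hat = [[P[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in P_hat]
    return [[P_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(P, features, weights):
    """Apply one layer to [H_s, H_a, H_sgn] with [W_s, W_a, W_sgn].

    Each feature block is propagated separately, passed through ReLU,
    and the results are concatenated column-wise (one row per entity).
    """
    N = normalized_adjacency(P)
    blocks = []
    for H, W in zip(features, weights):
        Z = matmul(N, matmul(H, W))
        blocks.append([[max(0.0, z) for z in row] for row in Z])  # ReLU
    return [sum((block[i] for block in blocks), []) for i in range(len(P))]

P = [[0.0, 1.0], [1.0, 0.0]]             # two entities joined by one edge
H_s = [[1.0, 0.0], [0.0, 1.0]]           # structural features
H_a = [[2.0], [4.0]]                     # attribute features
H_sgn = [[1.0], [1.0]]                   # first-order sub-graph features
W_s, W_a, W_sgn = [[1.0], [-1.0]], [[1.0]], [[-2.0]]
out = gcn_layer(P, [H_s, H_a, H_sgn], [W_s, W_a, W_sgn])
print(out)
```

Because the normalized adjacency multiplies every block from the left, propagating the blocks separately and then concatenating gives the same result as propagating the concatenated matrix.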

Step S3, respectively taking two knowledge graph networks with different languages as a knowledge graph network to be aligned and a target knowledge graph network, selecting an entity to be aligned from the knowledge graph network to be aligned, traversing embedded vectors of all the entities in the target knowledge graph network, and calculating the similarity of the embedded vectors between the entity to be aligned and each entity in the target knowledge graph network according to a scoring function;

The method specifically comprises the following steps:

S3.1, respectively taking two knowledge graph networks with different languages as a knowledge graph network to be aligned and a target knowledge graph network, selecting entities to be aligned from the knowledge graph network to be aligned, and traversing embedded vectors of all entities in the target knowledge graph network;

Step S3.2, calculating the similarity of the embedded vectors between the entity to be aligned and all entities in the target knowledge graph network according to the scoring function, as shown in formula (5):

$$D(e_i, e_j) = \alpha \frac{f\left(h_s(e_i), h_s(e_j)\right)}{d_s} + \beta \frac{f\left(h_a(e_i), h_a(e_j)\right)}{d_a} + \gamma \frac{f\left(h_{sgn}(e_i), h_{sgn}(e_j)\right)}{d_{sgn}} \quad (5)$$

where $D(e_i, e_j)$ represents the similarity between entities $e_i$ and $e_j$, with $i \in [1, n_1]$ and $j \in [1, n_2]$, where $n_1$ and $n_2$ are the numbers of entities in the knowledge graph network to be aligned and in the target knowledge graph network respectively; $f(x, y) = \|x - y\|_1$; $d_s$, $d_a$ and $d_{sgn}$ are the dimensions of the three embedded vectors of the structural feature, the attribute feature and the sub-graph feature respectively; $\alpha$, $\beta$ and $\gamma$ are hyperparameters that balance the three embedded vectors, with $\alpha + \beta + \gamma = 1$. Since $f$ is an L1 distance, a smaller value of $D$ indicates a closer pair of entities.
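A weighted, dimension-normalized L1 score built from the quantities defined here can be sketched as follows. The exact form, a weighted sum of per-feature L1 distances each divided by its embedding dimension, is an assumption based on those definitions. The function name `score` and the dictionary keys `"s"`, `"a"`, `"sgn"` are hypothetical.

```python
def l1(x, y):
    """f(x, y) = ||x - y||_1."""
    return sum(abs(a - b) for a, b in zip(x, y))

def score(h_i, h_j, alpha, beta, gamma):
    """Combine the three per-feature distances between two entities.

    h_i, h_j: dicts with keys "s", "a", "sgn" mapping to embedded vectors.
    Each distance is divided by its dimension so that the three feature
    types contribute on a comparable scale.
    """
    assert abs(alpha + beta + gamma - 1.0) < 1e-9, "weights must sum to 1"
    weights = {"s": alpha, "a": beta, "sgn": gamma}
    return sum(w * l1(h_i[k], h_j[k]) / len(h_i[k]) for k, w in weights.items())

h_i = {"s": [0.0, 0.0], "a": [1.0], "sgn": [0.0, 0.0, 0.0, 0.0]}
h_j = {"s": [2.0, 2.0], "a": [3.0], "sgn": [4.0, 0.0, 0.0, 0.0]}
d = score(h_i, h_j, alpha=0.5, beta=0.25, gamma=0.25)
print(d)
```

With these toy vectors the structural, attribute and sub-graph terms contribute 1.0, 0.5 and 0.25 respectively.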

Step S4, sorting the entities in the target knowledge graph network according to the similarity between the embedded vector of each entity in the target knowledge graph network and that of the entity e_i to be aligned, to obtain candidate equivalent entities of the entity to be aligned;
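The ranking in step S4 can be sketched as a sort over precomputed scores. The sketch assumes that the score is a distance, so the smallest value ranks first, and that a fixed number of top candidates is kept; the cutoff `top_k`, the function name `rank_candidates` and the sample scores are hypothetical.

```python
def rank_candidates(distances, top_k=3):
    """Return the top_k target entities, closest first.

    distances: dict mapping each target entity -> D(e_i, e_j) for one
    fixed entity e_i to be aligned. Ties are broken by entity name.
    """
    ranked = sorted(distances.items(), key=lambda kv: (kv[1], kv[0]))
    return [name for name, _ in ranked[:top_k]]

distances = {"aspirin": 0.9, "hydrocortisone": 0.1, "insulin": 0.4}
candidates = rank_candidates(distances, top_k=2)
print(candidates)
```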

Step S5, fusing the knowledge graph networks of the two different languages according to the candidate equivalent entities;

Specifically: for each unaligned entity in the knowledge graph network to be aligned (i.e., an entity that is not in a pre-aligned entity pair), the entities and relationships in the knowledge graphs of the two different languages are merged according to the equivalent entity found for it in the target knowledge graph network, thereby fusing the knowledge graph networks of the two different languages.
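The merging in step S5 can be sketched as rewriting one graph's entities to their equivalents and taking the union of the triples. The sketch assumes relationship names are already shared between the two graphs, since relationship alignment is not described here. The function name `fuse`, the `zh:` prefixes and the sample triples are hypothetical.

```python
def fuse(triples_a, triples_b, equivalents):
    """Merge two triple sets using discovered equivalent entities.

    equivalents: dict mapping an entity of graph A to its equivalent
    entity in graph B. Entities without an equivalent are kept as-is.
    """
    def canon(x):
        return equivalents.get(x, x)

    fused = {(canon(h), r, canon(t)) for h, r, t in triples_a}
    fused |= set(triples_b)  # duplicate triples collapse in the set
    return sorted(fused)

triples_a = [
    ("zh:hypoglycemia", "medication", "zh:hydrocortisone"),
    ("zh:hypoglycemia", "department", "zh:internal_medicine"),
]
triples_b = [("hypoglycemia", "symptom", "sweating")]
equivalents = {
    "zh:hypoglycemia": "hypoglycemia",
    "zh:hydrocortisone": "hydrocortisone",
}
fused = fuse(triples_a, triples_b, equivalents)
for triple in fused:
    print(triple)
```

After fusion the entity `hypoglycemia` carries triples from both source graphs, which is the intended effect of merging around equivalent entities.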

Referring to fig. 3, the present embodiment further provides a cross-language knowledge graph alignment and fusion device, which specifically includes: the system comprises a sub-graph feature extraction module, an entity alignment module and a knowledge graph fusion module which are connected in sequence;

The sub-graph feature extraction module is used for respectively converting the two knowledge graph networks of different languages into first-order sub-graph networks, extracting the first-order sub-graph features of the entities based on the first-order sub-graph networks, and constructing the first-order sub-graph feature matrix of the entities in the two knowledge graph networks; the entity alignment module builds an alignment model based on GCN, trains the alignment model with pre-aligned entity pairs, respectively inputs the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities in the two knowledge graph networks into the trained alignment model to obtain embedded vector representations of the entities in the two knowledge graph networks, calculates the similarity of the embedded vectors between the entity to be aligned and all entities in the target knowledge graph network, and obtains candidate equivalent entities of the entity to be aligned;

The knowledge graph fusion module is used for fusing the knowledge graph networks of the two different languages according to the candidate equivalent entities. Specifically: the relationship triples and attribute triples of the entity to be aligned and of the candidate equivalent entity are merged to complete the fusion of the cross-language knowledge graph networks.

The present embodiment also provides a storage medium for storing a program, which when executed by the cross-language knowledge graph alignment and fusion device, implements the steps of the cross-language knowledge graph alignment and fusion method.

In this embodiment, the structural features of the original knowledge graph are expanded by inputting the sub-graph features of the entities into the cross-language knowledge graph alignment model, and the representation capability of the entity vectors is improved, so that the alignment and fusion of knowledge graph entities can be completed more effectively.

The technical concept of the invention is as follows: given knowledge graphs KG_1 and KG_2 in two different languages and a set of known aligned entity pairs, a GCN model is used to encode the features of the entities in the graphs, and the entities from the different languages are embedded into a unified vector space. The sub-graph feature extraction part extracts a sub-graph network from the original graph by the first-order sub-graph network method, and then performs sub-graph feature encoding on each entity. After training, the distances between equivalent entities are as small as possible, and the candidate entities are then ranked by a predefined distance function to find the corresponding equivalent entity of each entity. Finally, the knowledge graphs of the two different languages are fused according to the newly discovered equivalent entity pairs.

In this embodiment, taking a medical knowledge graph as an example, the cross-language knowledge graph alignment and fusion method of the present invention is explained:

Knowledge graphs store real-world knowledge in the form of triples, which in this embodiment are divided into two categories: relationship triples and attribute triples. For example, in the Chinese medical knowledge graph CMeKG, (hypoglycemia, medication, hydrocortisone) is a relationship triple in the format (entity 1, relationship, entity 2), and (diabetes, department, internal medicine) is an attribute triple in the format (entity, attribute, attribute value). Formally, a knowledge graph is represented as KG = {E, R, A, T_R, T_A}, where E, R and A denote the sets of entities, relationships and attributes respectively, T_R ⊆ E × R × E denotes the set of relationship triples, T_A ⊆ E × A × V denotes the set of attribute triples, and V denotes the set of attribute values.
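The formal definition can be mirrored by a small container type, shown here as a sketch. It assumes that the sets E, R and A are derived from the stored triples; the class name `KnowledgeGraph` and its property names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    """KG = {E, R, A, T_R, T_A}, with E, R and A derived from the triples."""
    relation_triples: set = field(default_factory=set)   # T_R: (entity, relation, entity)
    attribute_triples: set = field(default_factory=set)  # T_A: (entity, attribute, value)

    @property
    def entities(self):
        """E: heads and tails of relation triples, heads of attribute triples."""
        ents = {h for h, _, _ in self.relation_triples}
        ents |= {t for _, _, t in self.relation_triples}
        ents |= {e for e, _, _ in self.attribute_triples}
        return ents

    @property
    def relations(self):
        """R: relation names used in T_R."""
        return {r for _, r, _ in self.relation_triples}

    @property
    def attributes(self):
        """A: attribute names used in T_A."""
        return {a for _, a, _ in self.attribute_triples}

kg = KnowledgeGraph(
    relation_triples={("hypoglycemia", "medication", "hydrocortisone")},
    attribute_triples={("diabetes", "department", "internal medicine")},
)
print(sorted(kg.entities), kg.relations, kg.attributes)
```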

The entity alignment task is described as follows: for knowledge graphs KG_1 and KG_2 in two different languages, this embodiment defines the task of cross-language knowledge graph alignment as using the set of existing known aligned entity pairs to find new aligned entity pairs between the two knowledge graphs according to a distance function.

Alignment and fusion of cross-language knowledge graphs helps to link and fuse knowledge graphs that carry the particular knowledge of different countries and peoples worldwide, and enables barrier-free cross-language information retrieval, natural language processing and the like. Taking medical knowledge graphs as an example, the Chinese medical science case knowledge graph provided by the Chinese medical information research institute of the national academy of Chinese medical science and the English antibiotic medicine knowledge graph IASO are aligned and fused, so that a knowledge graph combining traditional Chinese and Western medicine can be constructed, providing more comprehensive and effective medical knowledge for doctors and patients.

The traditional Chinese medicine case knowledge graph is constructed by extracting clinical knowledge from medical cases, so that a user can learn about the characteristic therapies of traditional Chinese medicine as well as the clinical manifestations, related therapies and related health care methods of diseases (such as chronic gastritis). The English antibiotic medicine knowledge graph IASO is an English medical knowledge graph developed through a combination of human and machine effort from large-scale medical text data, using natural language processing and text mining technology. It covers 507 infectious diseases and their treatment methods, 332 different infection sites, 936 systematic related symptoms, 371 complications and other knowledge. In a medical knowledge graph, disease names, therapeutic drugs and symptom names are the most basic entities, and most of the knowledge can be fused simply by aligning these three types of entities.

The above embodiments only illustrate the preferred embodiments of the present invention and are not intended to limit its scope. Various modifications and improvements made by those skilled in the art to the technical solutions of the present invention, without departing from its design spirit, shall fall within the protection scope defined by the claims of the present invention.

Claims

1. The cross-language knowledge graph alignment and fusion method is characterized by comprising the following steps of:

the method for acquiring the splicing results of the three embedded vector matrixes of the structural feature, the attribute feature and the sub-image feature of the entity is shown in the formula 1:

$$\left[H_s^{(l+1)};\,H_a^{(l+1)};\,H_{sgn}^{(l+1)}\right]=\sigma\left(\hat{D}^{-\frac{1}{2}}\hat{P}\hat{D}^{-\frac{1}{2}}\left[H_s^{(l)}W_s^{(l)};\,H_a^{(l)}W_a^{(l)};\,H_{sgn}^{(l)}W_{sgn}^{(l)}\right]\right) \quad (1)$$

where $H_s^{(l+1)}$, $H_a^{(l+1)}$ and $H_{sgn}^{(l+1)}$ are respectively the structural feature embedded vector matrix, the attribute feature embedded vector matrix and the sub-graph feature embedded vector matrix of the knowledge graph network; $P$ is an adjacency matrix of size $n \times n$, and $n$ is the number of entities in the knowledge graph network; $\hat{P} = P + I$, where $I$ is an identity matrix; $\hat{D}$ is the diagonal node degree matrix of $\hat{P}$; $H_s^{(l)}$, $H_a^{(l)}$ and $H_{sgn}^{(l)}$ are respectively the structural feature matrix, the attribute feature matrix and the first-order sub-graph feature matrix of the entities input to the $l$-th layer of the alignment model; $W_s^{(l)}$, $W_a^{(l)}$ and $W_{sgn}^{(l)}$ are respectively the weight matrices of the structural features, the attribute features and the sub-graph features of the entities in the $l$-th layer of the alignment model; $[\,;\,]$ denotes the concatenation of matrices; and $\sigma$ is a nonlinear activation function;

Step S3, using two knowledge graph networks with different languages as a knowledge graph network to be aligned and a target knowledge graph network respectively, selecting an entity to be aligned from the knowledge graph networks to be aligned, and calculating the similarity of the embedded vectors between the entity to be aligned and each entity in the target knowledge graph network according to a scoring function;
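As a rough illustration of how the layer-wise embedding of formula 1 and the similarity ranking of step S3 could fit together, the following Python sketch may help. It is not the patented implementation. It assumes the usual graph convolutional network propagation rule (self-loops added to the adjacency matrix P, then symmetric degree normalization), a ReLU activation, and an L1 distance as the scoring function, none of which is confirmed by the text of this claim. All function names are hypothetical, and plain lists are used so that the sketch has no external dependencies.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalized_adjacency(P):
    """Return D^(-1/2) (P + I) D^(-1/2), where D is the degree matrix of P + I."""
    n = len(P)
    P_hat = [[P[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [1.0 / math.sqrt(sum(row)) for row in P_hat]
    return [[d[i] * P_hat[i][j] * d[j] for j in range(n)] for i in range(n)]

def relu(M):
    """Element-wise ReLU (assumed choice for the nonlinear activation)."""
    return [[max(0.0, x) for x in row] for row in M]

def gcn_layer(A_norm, H_list, W_list):
    """Propagate each channel (structure, attribute, sub-graph) through one layer."""
    return [relu(matmul(A_norm, matmul(H, W))) for H, W in zip(H_list, W_list)]

def embed(P, features, weights):
    """Apply the stacked layers, then concatenate the three channels per entity."""
    A_norm = normalized_adjacency(P)
    H_list = list(features)
    for W_list in weights:  # one list of three weight matrices per layer
        H_list = gcn_layer(A_norm, H_list, W_list)
    return [[x for H in H_list for x in H[i]] for i in range(len(P))]

def rank_candidates(source_vec, target_emb):
    """Order target entities by ascending L1 distance (assumed scoring function)."""
    dist = [sum(abs(a - b) for a, b in zip(source_vec, row)) for row in target_emb]
    order = sorted(range(len(dist)), key=lambda j: dist[j])
    return order, dist

# Toy example: a 3-entity path graph with made-up features and weights.
P = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H_s = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_a = [[1.0], [0.0], [1.0]]
H_sgn = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
weights = [[[[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]], [[0.7]], [[0.2], [0.9]]]]
Z = embed(P, [H_s, H_a, H_sgn], weights)

# In practice the entity to be aligned comes from the other graph; here the
# graph is ranked against itself only to show the call.
order, dist = rank_candidates(Z[1], Z)
```

In this sketch a smaller distance means a higher similarity, so the first entries of `order` play the role of the candidate equivalent entities.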

2. The method for aligning and fusing cross-language knowledge graphs according to claim 1, wherein in the step S1, the extracting of the first-order sub-graph features specifically includes:

S1.1, detecting a subgraph from a network structure of a knowledge graph;

S1.2, constructing a sub-graph network based on the detected sub-graph;

3. The method for aligning and fusing cross-language knowledge graph according to claim 2, wherein in step S1.1, the sub-graph detected from the network structure of the original knowledge graph is a line.

4. The method for aligning and fusing cross-language knowledge graph according to claim 2, wherein in the step S1.2, the method for constructing the sub-graph network based on the detected sub-graph comprises: traversing all sub-graphs and judging whether any two sub-graphs share the same node or link in the knowledge graph network structure; if so, creating a connecting edge between the two sub-graphs, thereby obtaining the sub-graph network.
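The construction described in steps S1.1 and S1.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it reads the "line" sub-graph of claim 3 as a single undirected link between two entities, so two sub-graphs are connected when they share an endpoint. The function names and the toy edge list are hypothetical.

```python
from itertools import combinations

def detect_edge_subgraphs(edges):
    """Treat every undirected link as one sub-graph (deduplicated and sorted)."""
    return sorted({tuple(sorted(edge)) for edge in edges})

def build_subgraph_network(subgraphs):
    """Create a connecting edge between every pair of sub-graphs that share a node."""
    links = []
    for (i, a), (j, b) in combinations(enumerate(subgraphs), 2):
        if set(a) & set(b):
            links.append((i, j))
    return links

# Toy knowledge graph structure: A - B - C - D (the duplicate link is ignored).
edges = [("B", "A"), ("B", "C"), ("C", "D"), ("A", "B")]
subgraphs = detect_edge_subgraphs(edges)
subgraph_links = build_subgraph_network(subgraphs)
```

Each index in `subgraph_links` refers to a position in `subgraphs`, so the nodes of the sub-graph network are the detected sub-graphs themselves. Comparing all pairs is quadratic in the number of sub-graphs; a large graph would more likely index sub-graphs by entity first.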

5. The method for aligning and fusing cross-language knowledge graph according to claim 2, wherein in the step S1.3, the method for encoding the nodes established in the sub-graph network is as follows: judging whether the entity in the knowledge graph network belongs to a certain sub-graph in the node set of the sub-graph network, if yes, marking the coding position of the corresponding sub-graph as 1, and if not, marking the coding position of the corresponding sub-graph as 0.
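The membership encoding described in this claim can be sketched as below. The sketch assumes that each sub-graph is represented by the tuple of entities it contains and that the code of an entity has one position per sub-graph. The function name and the sample data are hypothetical.

```python
def encode_entities(entities, subgraphs):
    """Mark position k with 1 if the entity belongs to sub-graph k, else with 0."""
    return {
        entity: [1 if entity in subgraph else 0 for subgraph in subgraphs]
        for entity in entities
    }

# Toy data: three link sub-graphs over the entities A, B, C and D.
subgraphs = [("A", "B"), ("B", "C"), ("C", "D")]
codes = encode_entities(["A", "B", "C", "D"], subgraphs)
feature_matrix = [codes[e] for e in ["A", "B", "C", "D"]]
```

Stacking the codes row by row, as in `feature_matrix`, gives one plausible form of the first-order sub-graph feature matrix that is fed to the alignment model.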

6. The method for aligning and fusing cross-language knowledge graphs according to claim 1, wherein in the step S3, the similarity of the embedded vectors between different entities is calculated according to a scoring function, as shown in equation 5:

7. The method for aligning and fusing knowledge-graph networks of different languages according to claim 1, wherein in step S5, the method for fusing knowledge-graph networks of two different languages according to candidate equivalent entities comprises:

8. The device for aligning and fusing the cross-language knowledge graph according to any one of claims 1 to 7, comprising a sub-graph feature extraction module, an entity alignment module and a knowledge graph fusion module which are connected in sequence;

the entity alignment module is used for constructing an alignment model based on GCN, respectively inputting a structural feature matrix, an attribute feature matrix and a first-order sub-image feature matrix of the entities in the knowledge graph networks of two different languages into the trained alignment model to obtain embedded vector representations of the entities in the knowledge graph networks of the two different languages, calculating the similarity of the embedded vectors between the entity to be aligned and all the entities in the target knowledge graph network, and obtaining candidate equivalent entities of the entity to be aligned; the method for acquiring the splicing results of the three embedded vector matrixes of the structural feature, the attribute feature and the sub-image feature of the entity is shown in the formula 1:

9. A storage medium storing a program for implementing the cross-language knowledge graph alignment and fusion method of any one of claims 1 to 7.