CN117408336A - Entity alignment method for structure and attribute attention mechanism - Google Patents
Entity alignment method for structure and attribute attention mechanism
- Publication number
- CN117408336A (application CN202311483359.4A)
- Authority
- CN
- China
- Prior art keywords
- entity
- attribute
- representing
- information
- representation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The application discloses an entity alignment method based on a structure and attribute attention mechanism. Specifically, the graph information is divided according to attribute triples and relationship triples: the relationship triples form the initial data of the structure channel module, while the relationship triples and attribute triples together form the initial data of the attribute channel module, and a network model is built for each channel to obtain enhanced entity representations. A graph pre-alignment module is constructed that maps the attribute channel and the structure channel into a unified vector space and realizes graph pre-alignment based on entity alignment within each channel. A dual-channel feature fusion module is then constructed to determine the influence of the weight ratio of the two channels on the final entity alignment effect and realize entity alignment. The method addresses two shortcomings of most graph-embedding-based entity alignment methods, which align by considering structure and attribute information: the interaction between the two is not well handled, and the heterogeneity of knowledge graphs leads to poor alignment results.
Description
Technical Field
The present application relates to entity alignment methods, and in particular to an entity alignment method based on a structure and attribute attention mechanism.
Background
Because knowledge graphs support semantic representation, reasoning, and related functions, knowledge graph technology has developed rapidly in recent years and a large number of knowledge graphs have appeared. Different graphs typically contain much complementary information, and fusing them helps improve the utilization of knowledge. However, knowledge graphs from different sources suffer from heterogeneity, incompleteness, and data redundancy. Knowledge fusion technology is therefore needed to align and merge the redundant information in the graphs into a globally unified knowledge representation and association. Entity Alignment (EA) is a key technology in the knowledge fusion process, aiming to associate equivalent/matching entities in knowledge graphs from different sources that refer to the same real-world object.
Early researchers used various string features for entity alignment. With the rapid development of knowledge representation learning in recent years, researchers have proposed many entity alignment methods based on representation learning, using machine learning to represent graph data as low-dimensional dense vectors so that the distance between vectors reflects the semantic relationship between objects. For each triple (h, r, t), the relation is regarded as a directed connection from head entity to tail entity, from which the vector representation of the entity is learned. With the rise of deep learning, researchers began to learn entity vector representations with Graph Neural Networks (GNN), performing entity alignment by learning the structural information in the graph and the attribute information of the nodes themselves. However, models that realize entity alignment based on graph neural networks still have a problem: because different knowledge graphs are structurally heterogeneous, equivalent entities often have different neighborhood information, which leads to poor entity alignment results.
Disclosure of Invention
Aiming at the problem in the prior art that structural heterogeneity between knowledge graphs, and the different neighborhood information of equivalent entities across graphs, lead to poor entity alignment, the invention provides an entity alignment method of a structure and attribute attention mechanism, with the following specific technical scheme:
the application provides a method for aligning a structure with an entity of an attribute attention mechanism, which comprises the following steps:
step S1: dividing the map data to form initial data of an attribute channel module and a structure channel module;
step S2: the attribute channel module and the structure channel module respectively build a network model to obtain enhanced entity characteristics;
step S3: constructing a graph pre-alignment module, mapping the attribute channel and the structure channel into the same vector space, realizing entity alignment within each channel, and thereby realizing graph pre-alignment;
step S4: and constructing a dual-channel feature fusion module, and determining the influence of the weight ratio of the two channels on the final entity alignment effect by using the dual-channel feature fusion module to realize final entity alignment.
Further, in the step S1, given a source knowledge graph Gs, a target knowledge graph Gt, and a set of entity alignment seed pairs, the attribute channel module and the structure channel module divide the information in the graphs according to attribute triples and relationship triples, and different network models are built for the two channels to learn information of different dimensions in the graphs.
Further, the step S2 specifically includes:
s21: processing the structural channel module;
s22: processing the attribute channel module;
the step S21 specifically includes:
step S211: entity names carry important information for entity alignment, so the semantic information of each entity is encoded with a pre-trained word embedding model; model parameters are then no longer randomly initialized, which reduces the cost of model training. The BERT model is trained on large-scale multilingual datasets and captures the language features of large-scale text data, so it can provide a good initial entity feature representation. Entity name information is therefore encoded with the BERT model to obtain the initial entity feature representation h ∈ R^(n×m), where n is the number of entities and m is the entity embedding dimension.
step S212: because the first-order neighborhood is less heterogeneous than more distant neighbors, no attention mechanism is needed to aggregate first-order neighborhood information; it is instead aggregated with a mean aggregator:

h_e^(l) = W^(l) · MEAN({h_e^(l-1)} ∪ {h_j^(l-1), ∀j ∈ N(e)})

where l is the number of network layers (also the number of hops of adjacent points each vertex can aggregate), h_e^(l) is the layer-l embedded representation of vertex e, h_e^(l-1) its layer-(l-1) embedded representation, h_j^(l-1) the layer-(l-1) embedding of a neighbor vertex j of e, N(e) is the set of all neighbor nodes of vertex e, MEAN(·) stitches the layer-(l-1) vectors of the target vertex and its neighbors and averages them over each dimension, and W^(l) is a learnable layer-l weight;
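As an illustration only, one layer of this mean aggregation can be sketched in numpy; the toy graph, identity weight, and absence of a nonlinearity are assumptions, not the patented implementation:

```python
import numpy as np

def mean_aggregate(h, neighbors, W):
    """One layer of mean aggregation: for each vertex e, stack its own
    embedding with those of its neighbors N(e), average over each
    dimension, then apply the learnable layer weight W."""
    out = np.zeros((h.shape[0], W.shape[1]))
    for e, nbrs in neighbors.items():
        stacked = np.vstack([h[e]] + [h[j] for j in nbrs])  # target + neighbors
        out[e] = stacked.mean(axis=0) @ W                   # MEAN, then W^(l)
    return out

# toy graph: vertex 0 has neighbors {1, 2}; vertices 1 and 2 each have {0}
h0 = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
nbrs = {0: [1, 2], 1: [0], 2: [0]}
W = np.eye(2)  # identity weight for readability
h1 = mean_aggregate(h0, nbrs, W)  # h1[0] averages vertices 0, 1 and 2
```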
step S213: since the feature representation of an entity includes not only its own attribute information but also the relationships between entities, which reflect interactions between them, the importance of the relations must be considered and weighted to obtain a richer and more accurate node feature representation.

First, the entity features are mapped into the relation feature space, the representation of each relation under the action of its head and tail entities is computed with an attention mechanism, and the two are added:

α_ij^(r_l) = exp(β_1[W_i·h_i ; W_j·h_j]) / Σ_{i'∈H, j'∈T} exp(β_1[W_i·h_{i'} ; W_j·h_{j'}])
r_l^h = Σ_{i∈H} α_ij^(r_l) · W_i·h_i,  r_l^t = Σ_{j∈T} α_ij^(r_l) · W_j·h_j,  r_l = r_l^h + r_l^t

where α_ij^(r_l) is the attention coefficient of relation r_l under the action of head entity e_i and tail entity e_j, exp is the exponential function with natural base e, β_1 is a vector used for dimension reduction, h_i and h_j are the initial features of the head and tail entities, H is the set of head entities connected by the relation, T the set of tail entities connected by the relation, i' any head entity and j' any tail entity connected by the relation, W_i and W_j are learnable parameters, r_l^h is the relation representation under the action of the head entity, r_l^t the relation representation under the action of the tail entity, and r_l the final relation representation obtained by adding the two.
Having obtained the relation feature representation r_l through the above operations, the acquired relation features must next be aggregated onto the node features. First, the head node feature x_h and the tail node feature x_t are taken from the initial entity features h according to the head and tail node indices, and the entity feature x is obtained as: x = [x_h ; x_t ; r_l].
Next, x is mapped to an attention weight tensor through a linear transformation, and a softmax activation function over the attention weights yields the attention coefficients α:

α_i = exp(W·x_i + b) / Σ_{j∈n} exp(W·x_j + b)

where α_i is the attention coefficient of vertex i, exp is the exponential function with natural base e, W is a learnable weight, x_i is the feature of vertex i, b is the bias of the neuron, and n is the set of all vertices in the graph;
The entity features are then aggregated by weighting with the obtained attention coefficients:

aggre = Σ_i α_i · h_i

where aggre is the aggregated feature vector, h_i is the initial entity feature of the input, and α_i is the obtained attention coefficient; this feature vector is added to the original entity feature vector h to obtain the entity feature vector h_r with aggregated relation information: h_r = aggre + h;
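A minimal numpy sketch of this weighted aggregation step; the toy features and the scalar scoring weight are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def relation_weighted_aggregate(h, x, W, b):
    """Score each spliced feature x_i = [x_h; x_t; r_l] with W·x_i + b,
    normalize the scores with softmax to get coefficients alpha_i, take
    the attention-weighted sum of the initial entity features h, and add
    it back as a residual: h_r = aggre + h."""
    scores = x @ W + b                        # one scalar score per vertex
    alpha = softmax(scores)                   # attention coefficients
    aggre = (alpha[:, None] * h).sum(axis=0)  # weighted aggregation
    return aggre + h                          # h_r = aggre + h

h = np.array([[1.0, 0.0], [0.0, 1.0]])            # initial entity features
x = np.array([[0.5, 0.5, 0.2], [0.1, 0.9, 0.3]])  # toy spliced features
W = np.zeros(3)                                    # zero weight -> equal attention
h_r = relation_weighted_aggregate(h, x, W, b=0.0)
```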
The first-order neighbor information processed by the mean aggregator and the vector aggregated from the relation information are then spliced to obtain the entity feature representation h_(e_r) = [h_e ; h_r], where h_e is the first-order neighborhood information aggregated by the mean aggregator and h_r the information aggregated from the relations;
step S214: to avoid error accumulation, a Highway Network is adopted to balance the node's own features against the first-order neighborhood and relation information: h_highway = gate · h_(e_r) + (1 − gate) · h, where h_(e_r) is the spliced first-order neighborhood and relation information and h is the input initial entity feature; the gate is computed as gate = σ(h·W + b), where σ is the Sigmoid activation function, W is a learnable weight, and b is a bias;
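The highway gating can be sketched in numpy as follows; for simplicity this sketch assumes h_(e_r) has already been projected to the same dimension as h (the concatenation in the text would otherwise double it):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway(h, h_er, W, b):
    """Highway gate: gate = sigmoid(h·W + b); output is
    gate * h_er + (1 - gate) * h, mixing aggregated neighborhood
    information with the original entity features."""
    gate = sigmoid(h @ W + b)
    return gate * h_er + (1.0 - gate) * h

h = np.array([[0.0, 0.0]])     # zero input -> gate = sigmoid(0) = 0.5
h_er = np.array([[2.0, 4.0]])  # aggregated neighborhood/relation features
out = highway(h, h_er, W=np.eye(2), b=np.zeros(2))  # 0.5*h_er + 0.5*h
```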
step S215: to take more heterogeneous information into account, the entity features should fuse more neighbor information and thus enlarge the information-sensing range, so second-order neighborhood information must be learned. Directly aggregating with a mean aggregator or a GCN layer would introduce more noise; therefore, to find the second-order neighborhood that contributes positively to the alignment of entities, the node features h_highway are fed into a single-layer graph attention network to obtain the entity representation h_(2_ner) with aggregated second-order neighborhood information:

h_(2_ner) = ((α̂ ⊙ index) / d) · h_highway

where index_ij is the edge index indicating whether an edge exists between nodes i and j, d_j is the in-degree (number of incoming edges) of node j used to normalize the attention coefficients, ⊙ denotes element-wise multiplication, · denotes matrix multiplication, and α̂ denotes the attention coefficients corresponding to the different entities;
step S216: to address the defects that the graph attention mechanism focuses excessively on local information and cannot fully exploit the overall topology and global features of the graph, after the second-order neighborhood aggregation is completed a Transformer encoder is introduced to re-encode the information in the graph, so that the input at each position can interact and associate with every other position in the sequence, making full use of global information and sequence context:

h_te = LayerNorm(h_(2_ner) + A + Y)

where LayerNorm (Layer Normalization) normalizes the input and A is the self-attention output:

A = softmax(Q·K^T / √d_k)·V
K = h_(2_ner)·W_k
V = h_(2_ner)·W_v
Q = h_(2_ner)·W_q

where W_k, W_v, W_q are learnable weights and d_k is the dimension of the entity features of h_(2_ner) divided by the number of heads of the multi-head attention. Y is the output of the feed-forward network:

Y = ReLU(h_(2_ner)·W_1 + b_1)·W_2 + b_2

where W_1, W_2 are the weights of the feed-forward neural network and b_1, b_2 the corresponding biases. After processing by the Transformer encoder, a Highway Network again selectively transfers or filters the information between nodes, helping the model better capture the correlations between nodes and balance the information between an entity and its adjacent entities, finally yielding the enhanced entity feature representation of the structure channel.
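A single-head numpy sketch of this encoder computation; the identity projection weights and the single head are simplifying assumptions for illustration:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def transformer_encode(h, Wk, Wv, Wq, W1, b1, W2, b2):
    """h_te = LayerNorm(h + A + Y) with
    A = softmax(Q K^T / sqrt(d_k)) V and Y = ReLU(h W1 + b1) W2 + b2."""
    K, V, Q = h @ Wk, h @ Wv, h @ Wq
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V  # self-attention
    Y = np.maximum(h @ W1 + b1, 0.0) @ W2 + b2       # feed-forward
    return layer_norm(h + A + Y)

n, d = 4, 8
h = np.random.default_rng(0).normal(size=(n, d))
I = np.eye(d)
h_te = transformer_encode(h, I, I, I, I, np.zeros(d), I, np.zeros(d))
```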
Further, in the step S22, different attribute information should receive different weights for the same node feature. To find the attribute types and attribute values that are more meaningful for the node's embedded representation, after the embedded representations of the attribute types and attribute values are completed, attribute attention is used to capture the weight coefficients of the entity's different attributes, and the entity embedded representation is learned as:

â_k = exp(q^T·tanh(W_1·h_k)) / Σ_{k'} exp(q^T·tanh(W_1·h_{k'})),  h_e^att = Σ_k â_k · h_k

where h_k is the entity attribute representation of the k-th attribute processed by the entity attribute encoder from the attribute feature sequence α_k and the attribute-value feature sequence v_k, h_e^att is the entity feature representation output by the attribute attention layer, â_k is the normalized entity attribute attention coefficient, and W_1 and q^T are learnable parameter matrices. After the aggregation of the attribute information is completed, the graph structure information is aggregated with a mean aggregator to obtain the final enhanced entity feature representation of the attribute channel.
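The attribute attention step can be sketched as follows; the tanh scoring form and the toy embeddings are assumptions, since the text only names W_1 and q^T as learnable parameters:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attribute_attention(attr_types, attr_vals, W1, q):
    """Score each (attribute-type, attribute-value) pair with
    q^T tanh(W1·[a_k ; v_k]), normalize with softmax, and return the
    attention-weighted sum as the entity's attribute representation."""
    feats = np.concatenate([attr_types, attr_vals], axis=1)  # [a_k ; v_k]
    scores = np.tanh(feats @ W1) @ q
    alpha = softmax(scores)   # normalized attention coefficients
    return alpha @ feats      # weighted sum over the attributes

types = np.array([[1.0, 0.0], [0.0, 1.0]])  # two attribute-type embeddings
vals = np.array([[0.2, 0.1], [0.3, 0.4]])   # matching value embeddings
h_attr = attribute_attention(types, vals, W1=np.eye(4), q=np.zeros(4))
```

With q set to zero the attention is uniform, so the result is simply the mean of the two spliced attribute features.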
Further, in the step S3, graph pre-alignment reduces the distance between seed equivalent entity pairs, so that the two subgraphs of each channel are represented in a unified vector space. For the pre-alignment of each channel, the alignment loss function is:

L = Σ_{(e_i,e_j)∈T} Σ_{(e_i',e_j')∈T'} [dis(e_i, e_j) + λ − dis(e_i', e_j')]_+

where dis(e_i, e_j) is the Manhattan distance (the L1 norm), i.e. the sum of the absolute values of the elements of the difference between the vectors of the two entities e_i and e_j, computed on the enhanced feature representations acquired by the structure channel and the attribute channel respectively; (e_i, e_j) is an aligned seed entity pair and T is the set of aligned seed entity pairs; (e_i', e_j') is a negative-sampled seed pair and T' is the set of entity pairs obtained through negative sampling; λ is a hyperparameter. The entity similarity matrix is obtained and stored.
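A sketch of a margin-based pre-alignment loss consistent with the Manhattan distance and the hyperparameter λ; the exact hinge form is an assumption, since the original formula image is not reproduced here:

```python
import numpy as np

def manhattan(u, v):
    """L1 norm: sum of absolute element-wise differences."""
    return np.abs(u - v).sum()

def alignment_loss(emb_s, emb_t, pos_pairs, neg_pairs, lam):
    """Pull seed pairs together and push negative-sampled pairs at least
    `lam` further apart: sum of [dis(pos) + lam - dis(neg)]_+ hinges."""
    loss = 0.0
    for (i, j) in pos_pairs:
        for (i2, j2) in neg_pairs:
            pos = manhattan(emb_s[i], emb_t[j])
            neg = manhattan(emb_s[i2], emb_t[j2])
            loss += max(0.0, pos + lam - neg)
    return loss

emb_s = np.array([[0.0, 0.0], [5.0, 5.0]])  # source-graph entity embeddings
emb_t = np.array([[0.0, 0.0], [0.0, 0.0]])  # target-graph entity embeddings
# seed pair (0,0) already coincides; negative pair (1,1) is 10 apart > lam
loss = alignment_loss(emb_s, emb_t, [(0, 0)], [(1, 1)], lam=1.0)
```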
Further, in the step S4, since the attribute channel and the structure channel learn different information in the graph, the channels must be aggregated and the influence of the weight ratio of the different channels on the final entity alignment determined. Learning the weight information is treated as a binary classification problem: label 0 indicates misalignment and label 1 indicates alignment, with BCEWithLogitsLoss adopted as the loss function:

L = −(1/N) Σ_{i=1}^{N} [ y_i·log σ(x_i) + (1 − y_i)·log(1 − σ(x_i)) ]

where N is the number of samples, y_i is the true label of the i-th sample with value 0 or 1, x_i is the predicted value of the i-th sample, and σ is the Sigmoid function.
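BCEWithLogitsLoss can be computed directly from raw logits in a numerically stable way; a numpy sketch:

```python
import numpy as np

def bce_with_logits(logits, labels):
    """Mean of -[y·log(sigmoid(x)) + (1-y)·log(1-sigmoid(x))], computed
    stably as max(x, 0) - x·y + log(1 + exp(-|x|))."""
    x = np.asarray(logits, dtype=float)
    y = np.asarray(labels, dtype=float)
    return float(np.mean(np.maximum(x, 0.0) - x * y
                         + np.log1p(np.exp(-np.abs(x)))))

# zero logits give sigmoid(x) = 0.5, so each sample contributes -log(0.5)
loss = bce_with_logits([0.0, 0.0], [1.0, 0.0])
```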
setting up a four-layer full-connection layer to realize learning of different channel weights, inputting similarity matrix information learned for two channels, taking aligned pairs of seed entities as positive samples and expanding negative sample data to divide and train, testing a set and a verification set, searching and learning loss in different weights by utilizing grids and calculating a hit@1 value in the corresponding weights, selecting output corresponding to the maximum hit@1 as a weight ratio, multiplying the similarity matrix obtained by the original channel by the weight ratio, and directly adding the same positions to obtain an alignment matrix so as to output an alignment map.
The application has the following beneficial effects:
1. The attribute triples and relationship triples are divided to form an attribute channel and a structure channel; the different channels learn information of different dimensions in the graph, alleviating the influence of structural heterogeneity, and a fully connected network is built for dual-channel information fusion to achieve the final entity alignment;
2. The structure channel considers both first-order and second-order neighborhood information and adds consideration of the relations; a Transformer encoder is introduced to capture dependencies across different distances, which helps capture the association information between entities better, and a Highway Network is added to balance the node's own features against the learned relation-neighborhood features, yielding a better entity representation result.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart of overall structure of a model network according to an embodiment of the present application.
Fig. 2 is a fusion diagram of attributes and structural channels provided by an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the invention.
Thus, the following detailed description of the embodiments of the invention is not intended to limit the scope of the invention, as claimed, but is merely representative of some embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, under the condition of no conflict, the embodiments of the present invention and the features and technical solutions in the embodiments may be combined with each other.
The invention discloses an entity alignment method of a structure and attribute attention mechanism. Most existing methods are realized based on graph embedding and perform alignment by considering structure and attribute information, but they do not handle the interaction between the two well, and the heterogeneity of knowledge graphs leads to poor entity alignment. To solve these problems, an entity alignment method with a Fused Structural and Attribute Attention Mechanism model (FSAAM) is provided. Referring to fig. 1, the graph information from different sources is first divided into an attribute channel and a structure channel according to attribute triples and relationship triples. The attribute channel learns the weight information of different attributes through an attribute attention mechanism and acquires entity feature representations. The structure channel acquires entity feature representations by learning relation information and first-order neighborhood information, then learns second-order neighborhood features through a graph attention network and introduces a Transformer encoder to capture dependencies across different distances, which helps capture the association information between entities better; a Highway Network balances the learned relation-neighborhood features against the entity's own features to obtain better entity representations. After the two channels have learned the information of different dimensions in the graph, pre-alignment of the two channel graphs is realized by reducing the distance between equivalent seed entity pairs. Referring to fig. 2, to use the entity alignment results of the attribute and structure channels effectively, a fully connected network is built to obtain the weight information of the different channels and achieve the final entity alignment. The proposed model is verified on the three sub-datasets of the public dataset DBP15K. Compared with the baseline models, the experimental results show that Hits@1 is improved by 2.7%, 4.3%, and 1.7% respectively, indicating that the accuracy of entity pairs can be effectively improved.
The invention is illustrated by the following examples, which specifically include the following steps:
s1: as experimental dataset a cross-language dataset DBP15K was used, which is a subset of the large-scale knowledge graph dbpetia. Wherein ZH-EN (medium-english), JA-EN (daily-english), FR-EN (method-english) are included, each dataset contains 15000 alignment entity pairs in knowledge maps of two different languages as training, validation and test sets for the experiments herein. Dividing information in the source knowledge graph and the target knowledge graph according to the attribute triples and the relation triples, forming initial data of the structure channel module by using the relation triples, and forming initial data of the attribute channel module by using the relation triples and the attribute triples.
Step S2 includes S21 and S22, where S21 implements processing of the structural channel module and S22 implements processing of the attribute channel module.
Step S21 may be divided into steps S211 to S216:
s211: and constructing a BERT model to process the entity names, so as to obtain better initial entity characteristic representation.
S212: and constructing a mean value aggregator to aggregate the first-order neighborhood information of the entity.
S213: and a relation information aggregation module is constructed to realize the processing of relation information between the entities.
S214: the Highway Networks are constructed to realize the balance between the entity information and the aggregation first-order neighborhood and relation information, so that error accumulation is avoided.
S215: and a second-order graph attention mechanism is constructed to realize the consideration of more heterogeneous information, so that the entity features fuse more neighbor information, and the information perception range is enlarged.
S216: a Transformer encoder is constructed to avoid the defect that the graph attention mechanism focuses excessively on local information and cannot fully utilize the overall topological structure and the global features of the graph.
S22: and constructing an attribute attention mechanism, and finding attribute types and attribute values which are more meaningful for node embedding representation.
S3: and the constructed map prealignment module respectively aligns the attribute channels and the structure channel entities by expressing the two subgraphs of each channel to a unified vector space.
S4: and constructing a dual-channel feature fusion module, so as to realize aggregation of different channels and determine the influence of the weight ratio of the different channels on the final entity alignment effect, thereby learning the influence of different dimensional information in the map on the final entity alignment effect.
Evaluation indices commonly used for entity alignment models, Hits@N and mean reciprocal rank (MRR), are adopted. Hits@N is the proportion of correctly aligned entities ranked in the top N, and MRR is the average reciprocal rank of all correct entities. The higher these two indices, the better the entity alignment model. They are calculated as follows:

Hits@N = (1 / |Seed|) Σ_{e ∈ Seed} I(rank_e ≤ N)

where Seed denotes the set of test entity pairs, rank_e denotes the predicted rank of the entity aligned with entity e, and I(·) is the indicator function;

MRR = (1 / |Seed|) Σ_{e ∈ Seed} 1 / rank_e

where |Seed| is the number of aligned entity pairs and rank_e is, as above, the predicted rank of the entity aligned with entity e.
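The two metrics above can be computed directly from the per-entity ranks. The sketch below is illustrative only; the function names are not from the patent.

```python
import numpy as np

def hits_at_n(ranks, n):
    """Fraction of test pairs whose correct entity ranks in the top n."""
    ranks = np.asarray(ranks)
    return float(np.mean(ranks <= n))

def mrr(ranks):
    """Mean reciprocal rank over all test pairs."""
    ranks = np.asarray(ranks, dtype=float)
    return float(np.mean(1.0 / ranks))

# ranks[i] = predicted rank of the entity truly aligned with test entity i
ranks = [1, 2, 1, 5, 10]
print(hits_at_n(ranks, 1))   # 0.4
print(mrr(ranks))            # 0.56
```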
To test the effectiveness of the method proposed in the present invention, several competitive entity alignment models are selected for comparison, mainly the following:
BootEA: published at the IJCAI international conference in 2018; to address inaccurate embedding representations caused by limited training sets, the model labels training data with an iterative strategy, improving the accuracy of entity embeddings;
HGCN: published at the EMNLP international conference in 2019; the model derives relation representations from entity representations and integrates the relation information into the entity representations, using relation information to promote entity alignment;
RDGCN: published at the IJCAI international conference in 2019; it models interactions between the original graph and its dual graph, and captures neighbor structures with Highway-gated GCN layers to learn better entity representations;
AttrGNN: published at the EMNLP international conference in 2020; it realizes entity alignment through joint learning of entity structure and attribute embeddings based on a graph neural network method;
RNM: published at the AAAI international conference in 2021; it uses neighborhood matching and explores useful information from connection relations to achieve entity alignment;
DATTI: published at the ACL conference in 2022; it proposes an efficient entity alignment decoding algorithm based on third-order tensor isomorphism (DATTI), effectively exploiting adjacency and isomorphism information in the knowledge graph to enhance the decoding process of entity alignment;
NAMN: published in Computer Engineering in 2022; based on the observation that neighbors at each hop differ in importance to the central entity, it processes the neighborhood information of each hop separately with a hierarchical approach and aggregates it through a gating mechanism to learn the graph structure representation;
STEA: published in ACM in 2023; it proposes using the dependency between entities for self-training to achieve entity alignment.
In this experiment, 30% of the seed entity pairs were used as the training set and the remaining 70% as the test set. An Adagrad optimizer was adopted, and the optimal parameter configuration for entity alignment of the structure and attribute channels was found by grid search, with learning rate search values {0.001, 0.004, 0.007} and L2 regularization search values {0, 10^-3, 10^-4}. Each channel was trained for 200 epochs, with the negative sampling period set to 5 epochs. Meanwhile, to utilize entity name information, Google Translate was first used to translate the Chinese, French, and Japanese entity information in the datasets into English, after which the node feature pre-embedding was realized. Adam was adopted as the optimizer when fusing the different channels, with the learning rate and L2 regularization search values kept the same as the previous settings, and 100 epochs of training were used to learn the weight information.
The comparison of the entity alignment results of different models is shown in the following table:
Compared with the bolded baseline results in the table, the model proposed herein improves Hits@1 by 2.7%, 4.3%, and 1.7% on the three datasets respectively, and Hits@10 and MRR are also improved to a certain extent. The proposed model realizes comprehensive learning of multiple kinds of information in the graph, such as attributes, neighborhoods, and relations, and learns different information in the graph with two channels, so that correct entities can be found in the set of candidate entities, improving the hit rate of entity alignment. This verifies that the method presented herein is effective for knowledge graph entity alignment.
The above embodiments are only for illustrating the present invention and not for limiting the technical solutions described herein. Although the present invention has been described in detail with reference to the above embodiments, it is not limited to those specific embodiments; any modifications or equivalent substitutions that do not depart from the spirit and scope of the invention are intended to be included in the scope of the appended claims.
Claims (6)
1. A method of entity alignment of a structure with an attribute attention mechanism, comprising:
step S1: dividing the graph data to form the initial data of the attribute channel module and the structure channel module;
step S2: building network models for the attribute channel module and the structure channel module respectively to obtain enhanced entity features;
step S3: constructing a graph pre-alignment module that represents the attribute channel and the structure channel in the same vector space, realizes entity alignment for each channel respectively, and realizes graph pre-alignment;
step S4: constructing a dual-channel feature fusion module that determines the influence of the two channels' weight ratio on the final entity alignment effect and realizes the final entity alignment.
2. The method for entity alignment of a structure and an attribute attention mechanism according to claim 1, wherein in step S1, given a source knowledge graph, a target knowledge graph, and a set of entity alignment seed pairs, the attribute channel module and the structure channel module divide the information in the graph according to attribute triples and relation triples, and different network models are built for the two channels so as to learn information of different dimensions in the graph.
3. The method for entity alignment of a structure and an attribute attention mechanism according to claim 2, wherein step S2 specifically includes:
s21: processing the structural channel module;
s22: processing the attribute channel module;
the step S21 specifically includes:
step S211: encoding the entity name information with a BERT model to obtain the initial entity feature representation h ∈ R^(n×m), where n denotes the number of entities and m denotes the entity embedding dimension;
step S212: realizing first-order neighborhood information aggregation by means of a mean aggregator, with the formula:

h_e^l = MEAN({h_e^(l−1)} ∪ {h_j^(l−1), ∀ j ∈ N(e)}) · W^l

where l denotes the layer number of the network and also the number of hops of adjacent points each vertex can aggregate, h_e^l denotes the embedded representation of vertex e at layer l, h_e^(l−1) denotes the embedded representation of vertex e at layer l−1, h_j^(l−1) denotes the layer l−1 embedding of neighbor vertex j of vertex e, N(e) denotes all neighbor nodes of vertex e, MEAN(·) denotes the mean vector, realizing the splicing of the layer l−1 vectors of the target vertex and its neighbor vertices followed by taking the mean over each dimension of the vectors, and W^l denotes a learnable weight of layer l;
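The mean aggregation step can be sketched as follows. This is a minimal numpy illustration, not the patent's implementation; the nonlinearity (tanh) is an assumption, since the claim leaves the activation unspecified.

```python
import numpy as np

def mean_aggregate(h, neighbors, W):
    """One layer of mean aggregation: each vertex e averages its own
    layer-(l-1) embedding with those of its neighbors N(e), then
    projects by the learnable weight W."""
    out = np.zeros((h.shape[0], W.shape[1]))
    for e, nbrs in neighbors.items():
        stacked = np.vstack([h[e]] + [h[j] for j in nbrs])
        out[e] = stacked.mean(axis=0) @ W
    return np.tanh(out)  # activation assumed; the claim does not specify one
```

For example, with identity features and weights, vertex 1 (neighbor of vertex 0 only) ends up with tanh(0.5) in its first two dimensions, the mean of its own and its neighbor's one-hot rows.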
step S213: since the feature representation of an entity comprises not only its attribute information but also the relations between entities, weight learning is performed on the inter-entity relations to further obtain the node feature representation;
first, entity features are mapped to the relation feature space; the representations of a relation under the action of the head entity and of the tail entity are calculated separately with an attention mechanism, and the relation feature representation is obtained by adding them, with the corresponding mathematical formulas:
where α_ij^(r_l) denotes the attention coefficient of relation r_l under the action of head entity e_i and tail entity e_j, exp denotes the exponential function with natural base e, β_1 is a vector used for dimension reduction, h_i and h_j denote the initial features of the head and tail entities, H denotes the set of head entities connected by the relation, T denotes the set of tail entities connected by the relation, i' denotes any head entity connected by the relation, j' denotes any tail entity connected by the relation, W_i and W_j denote learnable parameters, h_i denotes the embedded representation of vertex i, r_l^h denotes the relation representation under the action of the head entity, r_l^t denotes the relation representation under the action of the tail entity, and r_l denotes the final relation representation, obtained by adding the head-entity and tail-entity relation representations position-wise;
having obtained the relation feature representation r_l, the obtained relation feature representation is aggregated onto the node features: first, the head node feature x_h and the tail node feature x_t are obtained from the initial entity feature h according to the head and tail node indices, yielding the entity feature representation x = [x_h; x_t; r_l];
x is mapped to an attention weight tensor through a linear transformation, and the attention coefficient α is obtained by applying a softmax activation function to the attention weights, with the corresponding mathematical formula:

α_i = exp(W · x_i + b) / Σ_{j ∈ n} exp(W · x_j + b)

where α_i denotes the attention coefficient of vertex i, exp denotes the exponential function with natural base e, W denotes a learnable weight, x_i is the feature representation of vertex i, b denotes the bias of the neuron, and n denotes the set of all vertices in the graph;
a weighted aggregation operation is performed on the entity features according to the obtained attention coefficients, with the corresponding formula aggre_i = α_i · h_i, where aggre denotes the aggregated feature vector, h_i denotes the initial entity feature of the input, and α_i is the obtained attention coefficient; this feature vector is added to the original entity feature vector h to obtain the entity feature vector h_r of the aggregated relation information, with the corresponding formula h_r = aggre + h;
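The scoring, softmax, and residual addition above can be sketched as below. This is an illustrative numpy sketch under the assumption that each vertex gets one scalar score; the function and parameter names are not from the patent.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_aggregate(h, x, W, b):
    """Score each vertex's feature x_i (e.g. the concatenated
    [x_h; x_t; r_l]) with a linear layer, softmax over all vertices to
    obtain alpha_i, weight the entity features, and add the result back
    onto the original features: h_r = aggre + h."""
    scores = x @ W + b            # one scalar score per vertex
    alpha = softmax(scores)       # attention coefficients alpha_i
    aggre = alpha[:, None] * h    # weighted entity features
    return aggre + h
```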
the first-order neighbor information processed by the mean aggregator and the vector aggregated from the relation information are then spliced to obtain the entity feature representation given by the formula h_{e_r} = [h_e; h_r], where h_e denotes the first-order neighborhood information aggregated by the mean aggregator and h_r denotes the information aggregated from relations;
step S214: to avoid error accumulation, a Highway Networks layer is adopted to balance the node features against the first-order neighborhood and relation information, with the corresponding formula h_highway = gate · h_{e_r} + (1 − gate) · h, where h_{e_r} is the spliced first-order neighborhood and relation information, h denotes the input initial entity feature, and gate is given by gate = σ(h · W + b), where σ denotes the Sigmoid activation function, W is a learnable weight, and b denotes the bias;
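The highway gating above amounts to a learned convex mix of the transformed and original features. A minimal sketch, assuming h_er has already been projected to the same dimension as h (the claim splices and then balances, but leaves the projection implicit):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway(h, h_er, W, b):
    """Highway layer: the gate, computed from the original input h, mixes
    the aggregated neighborhood/relation features h_er with h itself, so
    errors in the aggregated features cannot fully overwrite the input."""
    gate = sigmoid(h @ W + b)
    return gate * h_er + (1.0 - gate) * h
```

With zero weights the gate is 0.5 everywhere, giving an even blend of the two inputs.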
step S215: to take more heterogeneous information into account, so that entity features fuse more neighbor information and the information perception range is enlarged, second-order neighborhood information must be learned; directly aggregating information with a mean aggregator or a GCN layer would introduce more noise, so, to find the second-order neighborhoods that have a positive effect on entity alignment, the node features h_highway are input into a single-layer graph attention network to obtain the entity representation h_{2_ner} of aggregated second-order neighborhood information, with the corresponding expression

h_{2_ner} = (index_{ij} ⊙ α̂_{ij} / d_j) · h_highway

where index_{ij} denotes the edge index, indicating whether an edge exists between nodes, d_j denotes the in-degree of node j, used to normalize the attention coefficients, ⊙ denotes element-wise multiplication, · denotes matrix multiplication, and α̂_{ij} denotes the attention coefficients corresponding to different entities;
step S216: to address the shortcomings that the graph attention mechanism focuses excessively on local information and cannot fully exploit the overall topological structure and global features of the graph, after second-order neighborhood information aggregation is completed a Transformer encoder is introduced to encode the information in the graph again, so that the input at each position can interact and associate with all other positions in the sequence, fully exploiting the global information and context features of the sequence; the corresponding formula is h_te = LayerNorm(h_{2_ner} + A + Y), where LayerNorm denotes Layer Normalization, used to normalize the input, and A corresponds to the formula A = softmax(Q · K^T / √d_k) · V, with

K = h_{2_ner} · W_k

V = h_{2_ner} · W_v

Q = h_{2_ner} · W_q

where W_k, W_v, W_q denote learnable weights and d_k denotes the dimension of the entity features of h_{2_ner} divided by the number of heads of the multi-head attention;

the formula corresponding to Y is Y = ReLU(h_{2_ner} · W_1 + b_1) · W_2 + b_2, where W_1, W_2 denote the weights of the feed-forward neural network and b_1, b_2 the corresponding biases. After processing by the Transformer encoder, Highway Networks are again used to selectively transfer or filter the information between nodes, helping the model better capture correlations between nodes and balance the information between an entity and its adjacent entities, finally yielding the structure-channel enhanced entity feature representation.
4. The method for entity alignment of a structure and an attribute attention mechanism according to claim 3, wherein in step S22, since different attribute information carries different weights for the same node feature, after the embedding of attribute types and attribute values is completed, attribute attention is used to capture the weight coefficients of the entity's different attributes and learn the entity embedding representation, with the calculation formula:
where h_a^e denotes the entity attribute representation processed by the entity attribute encoder, a_k denotes the attribute type feature sequence, v_k denotes the attribute value feature sequence, ĥ_e denotes the entity feature representation output by the attribute attention layer, and α̂_k denotes the normalized entity attribute attention coefficient; W_1 and q^T are learnable parameter matrices. After the aggregation of attribute information is completed, the graph structure information is aggregated with a mean aggregator to obtain the final enhanced entity feature representation of the attribute channel.
5. The method of claim 4, wherein in step S3, the graph pre-alignment reduces the distance between pairs of seed equivalent entities, thereby representing the two subgraphs of each channel in a unified vector space; the alignment loss function is:
where dis(e_i, e_j) denotes the Manhattan distance, which sums the absolute values of the elements of a vector, e_i and e_j denote two different entities, the enhanced feature representations obtained by the structure channel and the attribute channel are used as the entity features, (e_i, e_j) denotes an aligned seed entity pair, T denotes the set of entity-aligned seed pairs, (e'_i, e'_j) denotes a negative-sampled seed pair, T' is the set of entity pairs obtained by negative sampling, and λ is a hyperparameter; the entity similarity matrix is obtained and stored.
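Since the loss formula itself did not survive extraction, the sketch below shows one common margin form consistent with the symbols in the claim (positive pairs T pulled together, negative-sampled pairs T' pushed at least λ apart); it is an assumption, not the patent's exact loss.

```python
import numpy as np

def manhattan(a, b):
    """dis(e_i, e_j): sum of absolute element-wise differences."""
    return np.abs(a - b).sum(axis=-1)

def alignment_loss(pos_s, pos_t, neg_s, neg_t, lam):
    """Margin-style alignment loss: minimize the Manhattan distance of
    aligned seed pairs and penalize negative pairs closer than lam."""
    pos = manhattan(pos_s, pos_t).sum()
    neg = np.maximum(0.0, lam - manhattan(neg_s, neg_t)).sum()
    return float(pos + neg)
```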
6. The method for entity alignment of a structure and an attribute attention mechanism according to claim 5, wherein in step S4, since the attribute and structure channels learn different information in the graph, the different channels need to be aggregated and the influence of each channel's weight ratio on the final entity alignment effect determined; the learning of the weight information is treated as a classification problem, in which label 0 indicates misalignment and label 1 indicates alignment, and BCEWithLogitsLoss is adopted as the loss function, with the mathematical expression:
L = −(1/N) Σ_{i=1}^{N} [y_i log σ(x_i) + (1 − y_i) log(1 − σ(x_i))]

where N is the number of samples, y_i is the true label of the i-th sample, taking the value 0 or 1, x_i is the predicted value (logit) of the i-th sample, and σ denotes the Sigmoid function;
a four-layer fully connected network is set up to learn the weights of the different channels: the similarity matrix information learned by the two channels is input; the aligned seed entity pairs serve as positive samples and negative sample data are expanded, and these are divided into training, test, and validation sets; grid search is used to learn the loss under different weights and compute the Hits@1 value at the corresponding weights; the output corresponding to the maximum Hits@1 is selected as the weight ratio; the similarity matrices obtained by the original channels are multiplied by their weight ratios and added position-wise to obtain the alignment matrix, from which the aligned graph is output.
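The fusion step above reduces to a weighted position-wise sum of the two channels' similarity matrices scored with a sigmoid cross-entropy loss. A minimal numpy sketch (function names are illustrative; `bce_with_logits` mirrors the standard numerically stable formulation of BCEWithLogitsLoss):

```python
import numpy as np

def bce_with_logits(logits, labels):
    """Numerically stable BCEWithLogitsLoss averaged over N samples:
    max(z, 0) - z*y + log(1 + exp(-|z|))."""
    z = np.asarray(logits, dtype=float)
    y = np.asarray(labels, dtype=float)
    return float(np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))))

def fuse_channels(sim_struct, sim_attr, w_struct, w_attr):
    """Weighted position-wise sum of the two channels' similarity
    matrices; the row-wise argmax of the fused matrix gives the
    predicted alignment."""
    return w_struct * sim_struct + w_attr * sim_attr
```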
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311483359.4A CN117408336A (en) | 2023-11-07 | 2023-11-07 | Entity alignment method for structure and attribute attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117408336A true CN117408336A (en) | 2024-01-16 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117743602A (en) * | 2024-02-06 | 2024-03-22 | 中国科学院国家空间科学中心 | Entity alignment system and method supporting double-side suspended entity detection |
CN117743602B (en) * | 2024-02-06 | 2024-06-04 | 中国科学院国家空间科学中心 | Entity alignment system and method supporting double-side suspended entity detection of knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||