CN114372505A - Unsupervised network alignment method and system - Google Patents

Unsupervised network alignment method and system

Info

Publication number
CN114372505A
CN114372505A
Authority
CN
China
Prior art keywords
network
matrix
unsupervised
node
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111500307.4A
Other languages
Chinese (zh)
Inventor
侯家琛
战德成
杨林瑶
徐延才
李小双
王晓
王飞跃
张俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Academy Of Intelligent Industries
State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Qingdao Academy Of Intelligent Industries
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Academy Of Intelligent Industries
Priority to CN202111500307.4A
Publication of CN114372505A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention provides an unsupervised network alignment method and system. The method comprises an embedded structure calculation step, in which the adjacency matrices of a source network and a target network are obtained and the embedded structures of the source network and the target network are learned through two graph neural networks respectively; a projection matrix calculation step, in which the projection matrices of the source network and the target network are calculated from their adjacency matrices, embedded structures and hyper-parameters; and a node alignment calculation step, in which the similarity of the projection matrices is calculated to obtain a similarity matrix, and nodes of the source network and the target network are matched according to the similarity matrix. The invention solves the problems of high cost, long time consumption and poor effect of existing network alignment methods.

Description

Unsupervised network alignment method and system
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to an unsupervised network alignment method and an unsupervised network alignment system.
Background
Network alignment (NA), which identifies node correspondences across networks, is critical to integrating knowledge from multiple networks and provides a comprehensive view for network analysis. The most advanced NA approaches rely on labeled cross-network anchor links to model node similarity in low-dimensional embeddings. However, labeling anchor links is labor-intensive, costly, time-consuming, and often infeasible due to privacy and security concerns. While some unsupervised network alignment methods attempt to align the latent spaces of different networks by learning a mapping function with an adversarial model, they rely on highly discriminative attributes to learn the embeddings and have difficulty converging to an optimal solution.
Disclosure of Invention
The embodiment of the application provides an unsupervised network alignment method and system, and aims to at least solve the problems of high cost, long time consumption and poor effect of the conventional network alignment method.
In a first aspect, an embodiment of the present application provides an unsupervised network alignment method, including: an embedded structure calculation step of acquiring the adjacency matrices of a source network and a target network and learning the embedded structures of the source network and the target network respectively through two graph neural networks; a projection matrix calculation step of calculating the projection matrices of the source network and the target network from their adjacency matrices, embedded structures and hyper-parameters; and a node alignment calculation step of calculating the similarity of the projection matrices to obtain a similarity matrix and matching nodes of the source network and the target network according to the similarity matrix.
In some of these embodiments, the projection matrix calculating step further comprises: performing forward projection and then backward projection restoration in sequence when calculating the projection matrix.
In some of these embodiments, the projection matrix calculating step further comprises: calculating the loss function from the Sinkhorn distance.
In some of these embodiments, the projection matrix calculating step further comprises: optimizing the projection matrix with a cycle-consistent adversarial model based on the CycleGAN framework.
In some of these embodiments, the embedded structure calculating step further comprises: and calculating the embedded structure by adopting a DGI machine learning model.
In some of these embodiments, the embedded structure calculating step further comprises: averaging the embedded structure values of all nodes in the source network and the target network respectively to generate the whole-graph embedded structures of the source network and the target network.
In some of these embodiments, the embedded structure calculating step further comprises: and training the DGI machine learning model by using binary cross entropy loss.
In some of these embodiments, the node alignment calculating step further comprises: performing the node matching by collective correspondence assignment based on a greedy strategy.
In some of these embodiments, the node alignment calculating step further comprises: calculating the Euclidean distances between the projection matrices of the source network and the target network to obtain a distance matrix, and computing the node assignment dictionary from the distance matrix.
In a second aspect, an embodiment of the present application provides an unsupervised network alignment system, applicable to the above unsupervised network alignment method, including: an embedded structure calculation module for acquiring the adjacency matrices of the source network and the target network and learning the embedded structures of the source network and the target network respectively through two graph neural networks; a projection matrix calculation module for calculating the projection matrices of the source network and the target network from their adjacency matrices, embedded structures and hyper-parameters; and a node alignment calculation module for calculating the similarity of the projection matrices to obtain a similarity matrix and matching nodes of the source network and the target network according to the similarity matrix.
Compared with the related art, the unsupervised network alignment method provided by the embodiment of the application has the following advantages:
1. The invention provides an unsupervised method that effectively utilizes structural information and a cycle-consistent adversarial model to learn node representations and optimize the mapping function for the network alignment problem.
2. The invention initializes the mapping function of the adversarial model by iteratively solving the Wasserstein-Procrustes problem, facilitating the learning of the adversarial model.
3. The invention optimizes the adversarial mapping model by minimizing the Sinkhorn distance between the translated source and target embeddings, which helps to alleviate the mode collapse problem.
4. The invention, based on a greedy strategy, collectively and efficiently assigns node correspondences under a one-to-one constraint.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of an unsupervised network alignment method of the present invention;
FIG. 2 is a block diagram of an unsupervised network alignment system of the present invention;
FIG. 3 is a block diagram of an electronic device according to an embodiment of the present invention;
FIG. 4 is a general diagram of the unsupervised network alignment method of the present invention;
FIG. 5 is a diagram illustrating the experimental effect of the unsupervised network alignment method of the present invention;
in the above figures:
1. embedded structure calculation module; 2. projection matrix calculation module; 3. node alignment calculation module; 60. bus; 61. processor; 62. memory; 63. communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application do not limit the number and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof in this application are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method first embeds the networks based on an unsupervised Deep Graph Infomax (DGI) model, capturing the global and local structural features of the nodes; it then iteratively solves the Wasserstein-Procrustes problem based on the embedding matrices and adjacency matrices to initialize a linear mapping function that converts the embeddings into a common vector space. A pair of adversarial networks is then applied to refine the linear mapping function under a cycle-consistency constraint. Finally, the distance between the converted source and target embeddings is calculated, and one-to-one alignment is performed by a collective correspondence assignment algorithm.
Nodes are represented in a common vector space by mapping across embedding spaces, with the mapping function learned by a well-initialized cycle-consistent adversarial model. The networks are embedded according to their structural information, and a mapping function between the different embedding spaces is learned by iteratively solving the equivalent Wasserstein-Procrustes problem, so that adversarial training is better initialized. The mapping function is then refined by a cycle-consistent adversarial model, further reducing the difference between embedding distributions through cyclic adversarial learning, and the Sinkhorn distance is added to the loss function to alleviate the mode collapse problem. Finally, accurate and robust corresponding nodes are assigned efficiently under a one-to-one constraint, using the distances between the mapped embeddings in a collective correspondence assignment algorithm.
Embodiments of the invention are described in detail below with reference to the accompanying drawings:
referring to fig. 1 and 4, the unsupervised network alignment method of the present invention includes the following steps:
s1: and acquiring an adjacency matrix of the source network and the target network, and learning the embedded structures of the source network and the target network respectively through two graph neural networks.
Optionally, the embedded structure is calculated by using a DGI machine learning model.
Optionally, the embedded structure values of all nodes in the source network and the target network are respectively added and averaged to generate the embedded structure of the whole graph of the source network and the target network.
Optionally, the DGI machine learning model is trained using binary cross entropy loss.
In a specific implementation, this step adopts the DGI machine learning model, an unsupervised learning model that can learn the embedded structure without labels or attributes.
In a multi-network system, networks are paired one-to-one and denoted as G1 = (V1, E1) and G2 = (V2, E2), where V1, V2 are the node sets and E1, E2 are the corresponding links between nodes. Given a node in G1, the task of unsupervised network alignment (UNA) is to find its corresponding node in G2 without observing any node correspondences.
In general, if both networks have n nodes, each network Gi learns a d-dimensional embedding matrix Zi ∈ R^(n×d).
Because different networks have different embedding spaces with heterogeneous characteristics, a linear transformation is learned to project one embedding space onto another. When the node correspondence is known, the orthogonal Procrustes method applies; when the transformation matrix is known, the mapping matrix is found by minimizing the squared Wasserstein distance. When neither the node correspondence nor the linear transformation matrix is known in advance, the Wasserstein-Procrustes analysis applies, i.e., the node correspondence and the corresponding linear transformation are learned and optimized jointly.
The goal of the DGI model is to maximize the local mutual information, i.e., to maximize the joint probability of patch-summary pairs. The model is therefore trained with a binary cross entropy loss:

L = (1/(n+m)) · ( Σ_{i=1..n} E[log D(h_i, s)] + Σ_{j=1..m} E[log(1 - D(h̃_j, s))] )

where D is the discriminator, D(h_i, s) is the probability score of a positive patch-summary pair, s is the whole-graph summary embedding, and h̃_j is a negative sample obtained by preserving the original adjacency matrix while generating corrupted features through a row-wise shuffling of X; m is the number of negative samples. The model maximizes this objective by updating the parameters of the patch encoder and the discriminator through gradient-descent training. A linear transformation (LA) of the adjacency matrix is employed as the input features based on security and privacy concerns. With the DGI model, both the global and the local structural features of the nodes are retained in their embeddings, which helps to effectively capture the structural similarity of nodes.
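As an illustrative sketch of the DGI-style objective above, the following PyTorch code implements a one-layer GCN patch encoder, a mean-readout graph summary s, a bilinear discriminator D, and the binary cross entropy loss; these concrete architectural choices are assumptions of the sketch, not requirements fixed by the method.

```python
import torch
import torch.nn as nn

class DGI(nn.Module):
    """Minimal DGI: one-layer GCN encoder plus a bilinear discriminator."""
    def __init__(self, in_dim, emb_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, emb_dim, bias=False)
        self.act = nn.PReLU()
        self.W = nn.Parameter(torch.empty(emb_dim, emb_dim))
        nn.init.xavier_uniform_(self.W)

    def encode(self, a_hat, x):
        # one-layer GCN patch encoder: H = PReLU(A_hat X W)
        return self.act(a_hat @ self.lin(x))

    def forward(self, a_hat, x):
        h_pos = self.encode(a_hat, x)              # positive patch embeddings
        x_neg = x[torch.randperm(x.size(0))]       # corruption: row-wise shuffle of X,
        h_neg = self.encode(a_hat, x_neg)          # original adjacency preserved
        s = torch.sigmoid(h_pos.mean(dim=0))       # whole-graph summary (mean readout)
        pos = torch.sigmoid(h_pos @ self.W @ s)    # D(h_i, s)
        neg = torch.sigmoid(h_neg @ self.W @ s)    # D(h~_j, s)
        return h_pos, pos, neg

def dgi_loss(pos, neg, eps=1e-8):
    """Binary cross entropy: positive pairs pushed to 1, negatives to 0."""
    return -(torch.log(pos + eps).mean() + torch.log(1.0 - neg + eps).mean())
```

Here a_hat is the (normalized) adjacency matrix as a dense tensor and x the node input features; per the text above, a linear transformation of the adjacency matrix can serve as x when raw attributes are unavailable.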
S2: calculating the projection matrices of the source network and the target network from their adjacency matrices, embedded structures, and hyper-parameters.
Optionally, when the projection matrix is calculated, forward projection and then backward projection restoration are performed in sequence.
Optionally, the loss function is calculated from the Sinkhorn distance.
Optionally, a cycle-consistent adversarial model based on the CycleGAN framework is used to optimize the projection matrix.
In a specific implementation, because the source network G1 and the target network G2 may represent their nodes in different embedding spaces, the application aligns the embedding spaces in order to compute the similarity between nodes of different networks, i.e., it learns a linear mapping function that projects one embedding space onto another.
The initial linear transformation is learned by iterative calculation on the Wasserstein-Procrustes problem. First, the permutation matrix of the Wasserstein-Procrustes problem is initialized by solving a classical quadratic graph matching problem with orthogonal regularization. The first component of the objective learns a permutation matrix that maximizes the structural consistency between G1 and G2, and the second component ensures that each node in G1 is assigned to only one node in G2.
In a specific implementation, the initial projection matrix is calculated from an orthogonal Procruster (Procrustes) analysis, using the following algorithm:
inputting: node embedding Z1、Z2Adjacent matrix A1、A2Of a hyperparameter λ0、μ、η
And (3) outputting: q
Initializing a permutation matrix:
Figure RE-GDA0003519421730000061
initialization of a transformation matrix: q ═ UVT
Figure RE-GDA0003519421730000062
When T is 1 → T, run
Figure RE-GDA0003519421730000071
Figure RE-GDA0003519421730000072
U∑VT=SVD(Q-ηGt);Q=UVT
Thereafter, in each iteration t, P is first calculated by solving the walerstein (Wasserstein) problem by using the simhonen (Sinkhorn) algorithm and regularization based on the current transformation matrix Qt. Then, from Wasserstein-Proklusters (Wasserstein products)
Figure RE-GDA0003519421730000073
Calculating a gradient G for QtAnd uses it to update the transformation matrix, thereby solving the probukast (Procustes) problem. The mapping initialization procedure based on the Wasserstein-Procrustes problem solution is shown in the above algorithm. Based on the algorithm, Z is1Is transformed into Z2The mapping function of the embedding space of (2) is initialized to the transformation matrix. Similarly, slave Z may also be initialized2To Z1The mapping function of (2).
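A numpy sketch of this initialization is given below. It is a sketch under stated assumptions: the convex initialization of the permutation matrix is replaced by a uniform coupling, and the gradient Gt = -2 Z1^T Pt Z2 follows the standard Wasserstein-Procrustes formulation of the objective ‖Z1 Q - Pt Z2‖_F² over orthogonal Q.

```python
import numpy as np

def sinkhorn_plan(M, lam=20.0, iters=50):
    """Entropy-regularized transport plan for cost matrix M, uniform marginals."""
    n, m = M.shape
    r, c = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-lam * (M / M.max()))   # rescale cost so the kernel stays well-conditioned
    v = np.ones(m)
    for _ in range(iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

def wp_initialize(Z1, Z2, eta=0.05, T=50):
    """Iterative Wasserstein-Procrustes solve for the orthogonal mapping Q."""
    n = Z1.shape[0]
    P = np.full((n, n), 1.0 / n)                 # stand-in for the convex P0 init
    U, _, Vt = np.linalg.svd(Z1.T @ P @ Z2)      # Procrustes init: Q = U V^T
    Q = U @ Vt
    for _ in range(T):
        # Wasserstein step: correspondence P_t for the current Q via Sinkhorn
        A = Z1 @ Q
        M = (A**2).sum(1)[:, None] + (Z2**2).sum(1)[None, :] - 2.0 * A @ Z2.T
        P = sinkhorn_plan(M)
        # Procrustes step: gradient on the orthogonal group, then SVD projection
        G = -2.0 * Z1.T @ P @ Z2
        U, _, Vt = np.linalg.svd(Q - eta * G)
        Q = U @ Vt
    return Q
```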
The method adopts a cycle-consistent adversarial model based on the CycleGAN framework to optimize the projection matrix, so that the mapping function is encouraged to produce distribution-invariant network embeddings. The Sinkhorn distance between the mapped structural embeddings is used to guide the training of the cycle-consistent adversarial model. Given G1 and G2, the model learns two generators, F12: Z1 → Z2 and F21: Z2 → Z1, whose aim is to project one vector space into the other at the distribution level. For example, to convert G1 to G2, the generator F12 is responsible for converting the space Z1 into Z1′, with the aim of minimizing the divergence between the distributions of Z1′ and Z2.
In a specific implementation, the projection matrices and their results are adjusted by the cycle-consistent adversarial network: two mapping matrices are learned, and one forward projection followed by one backward projection restores the embedding, so that the discrepancy loss of the two projections is reduced as much as possible and the points of the two networks are aligned one to one. The loss function is calculated from the Sinkhorn distance.
The Sinkhorn distance is a useful distance measure between probability distributions. It is used here to measure the divergence between Z1′ and Z2 and between Z2′ and Z1.
Formally, the Sinkhorn distance between matrices X and Y is defined as:

d_sh(X, Y) = min_{P ∈ U_α(r,c)} ⟨P, M⟩

where ⟨·,·⟩ denotes the Frobenius dot product, M is the matrix of distances between X and Y, and U_α(r, c) is the transport polytope with an entropy constraint, defined as:

U_α(r, c) = { P ∈ R_+^(n×n) | P·1 = r, P^T·1 = c, h(P) ≥ h(r) + h(c) - α }

where h(·) is the entropy function, r and c are the sample weights in the source and target domains, and α is a hyper-parameter. Various distance measures can be used to build the distance matrix M; here the cosine distance is used:

M_ij = 1 - (x_i · y_j) / (‖x_i‖ ‖y_j‖)

where x_i and y_j are the ith and jth row vectors of X and Y. The entries of this distance matrix are evidently limited to the range [0, 2]. The Sinkhorn distance is then calculated according to the following algorithm, where λ1 is the Lagrange multiplier of the entropy constraint; ⟨·,·⟩ again denotes the inner product calculation, and r and c are set proportional to the node degrees.
Calculation of the Sinkhorn distance:

Input: M, r, c, λ1, T
Output: d_sh(X, Y)
K = exp(-λ1·M) (elementwise);
initialize v = 1;
For t = 1 → T, run:
  μ = r ./ (K·v);
  v = c ./ (K^T·μ);
d_sh(X, Y) = ⟨diag(μ)·K·diag(v), M⟩.
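A direct numpy transcription of this routine might read as follows; lam stands for λ1, and r and c are assumed to be degree-proportional weights normalized to sum to one.

```python
import numpy as np

def sinkhorn_distance(M, r, c, lam=20.0, T=100):
    """Sinkhorn distance <P, M> for cost matrix M and marginals r, c."""
    K = np.exp(-lam * M)                 # elementwise Gibbs kernel
    v = np.ones_like(c)
    mu = np.ones_like(r)
    for _ in range(T):                   # alternating marginal scalings
        mu = r / (K @ v)
        v = c / (K.T @ mu)
    P = mu[:, None] * K * v[None, :]     # transport plan diag(mu) K diag(v)
    return float((P * M).sum())          # Frobenius dot product <P, M>
```

For example, with degree vectors deg1 and deg2, one would call sinkhorn_distance(M, deg1 / deg1.sum(), deg2 / deg2.sum()).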
After the Sinkhorn distance is defined, the adversarial loss function of the generator F12 is defined as:

L_adv(F12) = d_sh(F12(Z1), Z2)

The generator F21 is used to convert Z2 to Z1, and its adversarial loss function is calculated analogously:

L_adv(F21) = d_sh(F21(Z2), Z1)

With these adversarial losses, each generator attempts to convert the embedding of one network to be similar to the embedding of the other network. To further reduce the possible mapping space and guide the model to learn a one-to-one mapping, the method jointly trains the two generators by adding a cycle-consistency constraint with a reconstruction loss. The embedding of G1 is transformed by F21 to generate the reconstructed node embedding Z1″ = F21(F12(Z1)); similarly, the reconstructed node embedding Z2″ of G2 is obtained from F12. The reconstruction loss function is then calculated as:

L_rec = ‖Z1″ - Z1‖_1 + ‖Z2″ - Z2‖_1

where ‖·‖_1 denotes the L1 distance. Finally, the complete loss function of the generators is:

L = L_adv(F12) + L_adv(F21) + λ2·L_rec

where λ2 is the relative weight of the reconstruction loss.
The linear generators are jointly optimized with the Adam optimizer by minimizing the above loss function. With the initialized mapping function, training of the cycle-consistent adversarial model is faster and does not require attribute embeddings. Under the continuous supervision of the Sinkhorn distribution distance and the cycle-consistency constraint, the model converges to an optimal solution and learns a one-to-one mapping function.
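A compact sketch of this generator-side training loop is shown below. The differentiable Sinkhorn loss, the cosine cost, the embedding dimension d, and the weight lam2 are assumptions of the sketch; in practice F12 and F21 would be initialized from the transformation matrices Q obtained by the algorithm above.

```python
import torch
import torch.nn as nn

def sinkhorn_dist(X, Y, lam=20.0, T=50):
    """Differentiable Sinkhorn distance between the row sets of X and Y
    (uniform marginals, cosine cost bounded in [0, 2])."""
    Xn = X / X.norm(dim=1, keepdim=True)
    Yn = Y / Y.norm(dim=1, keepdim=True)
    M = 1.0 - Xn @ Yn.T
    r = torch.full((X.size(0),), 1.0 / X.size(0))
    c = torch.full((Y.size(0),), 1.0 / Y.size(0))
    K = torch.exp(-lam * M)
    u, v = torch.ones_like(r), torch.ones_like(c)
    for _ in range(T):
        u = r / (K @ v)
        v = c / (K.T @ u)
    P = u[:, None] * K * v[None, :]
    return (P * M).sum()

d = 128                                   # embedding dimension (assumption)
F12 = nn.Linear(d, d, bias=False)         # generator F12: Z1 -> Z2
F21 = nn.Linear(d, d, bias=False)         # generator F21: Z2 -> Z1
opt = torch.optim.Adam(list(F12.parameters()) + list(F21.parameters()), lr=1e-3)
lam2 = 1.0                                # relative weight of the reconstruction loss

def train_step(Z1, Z2):
    opt.zero_grad()
    Z1p, Z2p = F12(Z1), F21(Z2)           # forward projections Z1', Z2'
    adv = sinkhorn_dist(Z1p, Z2) + sinkhorn_dist(Z2p, Z1)            # adversarial losses
    rec = (F21(Z1p) - Z1).abs().sum() + (F12(Z2p) - Z2).abs().sum()  # L1 cycle loss
    loss = adv + lam2 * rec
    loss.backward()
    opt.step()
    return loss.item()
```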
S3: calculating the similarity of the projection matrices to obtain a similarity matrix, and matching nodes of the source network and the target network according to the similarity matrix.
Optionally, the node matching adopts collective correspondence assignment based on a greedy strategy.
Optionally, the Euclidean distances between the projection matrices of the source network and the target network are calculated to obtain a distance matrix, and the node assignment dictionary is computed from the distance matrix.
In a specific implementation, similarity is calculated over the projected embedded structures to obtain a similarity matrix, and node matching is then performed on the similarity matrix; node alignment is thus calculated from the similarity between the two networks.
In a specific implementation, the following collective correspondence assignment algorithm based on the greedy strategy is adopted:

Input: network distance matrix S
Output: node assignment dictionary A
For t = 1 → n, run:
  find the minimum entry of S and record its row and column indices i and j;
  assign the ith node of G1 to the jth node of G2, adding the pair to A;
  set the elements of the ith row and jth column of S to infinity.

In this implementation, the Euclidean distance between Z1 and Z2 is calculated to obtain the distance matrix S.
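A numpy sketch of this greedy collective assignment, including the construction of the Euclidean distance matrix S, might look like this:

```python
import numpy as np

def greedy_collective_assignment(Z1, Z2):
    """One-to-one node assignment per the greedy algorithm above."""
    # Euclidean distance matrix S between the projected embeddings
    S = np.linalg.norm(Z1[:, None, :] - Z2[None, :, :], axis=-1)
    A = {}
    for _ in range(min(S.shape)):
        i, j = np.unravel_index(np.argmin(S), S.shape)  # globally smallest entry
        A[int(i)] = int(j)             # pair node i of G1 with node j of G2
        S[i, :] = np.inf               # block row i and column j to enforce
        S[:, j] = np.inf               # the one-to-one constraint
    return A
```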
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The embodiment of the application provides an unsupervised network alignment system, which is suitable for the unsupervised network alignment method. As used below, the terms "unit," "module," and the like may implement a combination of software and/or hardware of predetermined functions. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware or a combination of software and hardware is also possible and contemplated.
Fig. 2 is a block diagram of an unsupervised network alignment system according to the present invention, please refer to fig. 2, which includes:
Embedded structure calculation module 1: acquires the adjacency matrices of the source network and the target network, and learns the embedded structures of the source network and the target network respectively through two graph neural networks.
Optionally, the embedded structure is calculated by using a DGI machine learning model.
Optionally, the embedded structure values of all nodes in the source network and the target network are respectively added and averaged to generate the embedded structure of the whole graph of the source network and the target network.
Optionally, the DGI machine learning model is trained using binary cross entropy loss.
In a specific implementation, this module adopts the DGI machine learning model, an unsupervised learning model that can learn the embedded structure without labels or attributes.
In a multi-network system, networks are paired one-to-one and denoted as G1 = (V1, E1) and G2 = (V2, E2), where V1, V2 are the node sets and E1, E2 are the corresponding links between nodes. Given a node in G1, the task of unsupervised network alignment (UNA) is to find its corresponding node in G2 without observing any node correspondences.
In general, if both networks have n nodes, each network Gi learns a d-dimensional embedding matrix Zi ∈ R^(n×d).
Because different networks have different embedding spaces with heterogeneous characteristics, a linear transformation is learned to project one embedding space onto another. When the node correspondence is known, the orthogonal Procrustes method applies; when the transformation matrix is known, the mapping matrix is found by minimizing the squared Wasserstein distance. When neither the node correspondence nor the linear transformation matrix is known in advance, the Wasserstein-Procrustes analysis applies, i.e., the node correspondence and the corresponding linear transformation are learned and optimized jointly.
The goal of the DGI model is to maximize the local mutual information, i.e., to maximize the joint probability of patch-summary pairs. The model is therefore trained with a binary cross entropy loss:

L = (1/(n+m)) · ( Σ_{i=1..n} E[log D(h_i, s)] + Σ_{j=1..m} E[log(1 - D(h̃_j, s))] )

where D is the discriminator, D(h_i, s) is the probability score of a positive patch-summary pair, s is the whole-graph summary embedding, and h̃_j is a negative sample obtained by preserving the original adjacency matrix while generating corrupted features through a row-wise shuffling of X; m is the number of negative samples. The model maximizes this objective by updating the parameters of the patch encoder and the discriminator through gradient-descent training. A linear transformation (LA) of the adjacency matrix is employed as the input features based on security and privacy concerns. With the DGI model, both the global and the local structural features of the nodes are retained in their embeddings, which helps to effectively capture the structural similarity of nodes.
Projection matrix calculation module 2: calculates the projection matrices of the source network and the target network from their adjacency matrices, embedded structures, and hyper-parameters.
Optionally, when the projection matrix is calculated, forward projection and then backward projection restoration are performed in sequence.
Optionally, the loss function is calculated from the Sinkhorn distance.
Optionally, a cycle-consistent adversarial model based on the CycleGAN framework is used to optimize the projection matrix.
In a specific implementation, because the source network G1 and the target network G2 may represent their nodes in different embedding spaces, the application aligns the embedding spaces in order to compute the similarity between nodes of different networks, i.e., it learns a linear mapping function that projects one embedding space onto another.
The initial linear transformation is learned by iterative calculation on the Wasserstein-Procrustes problem. First, the permutation matrix of the Wasserstein-Procrustes problem is initialized by solving a classical quadratic graph matching problem with orthogonal regularization. The first component of the objective learns a permutation matrix that maximizes the structural consistency between G1 and G2, and the second component ensures that each node in G1 is assigned to only one node in G2.
In a specific implementation, the initial projection matrix is calculated from an orthogonal Procruster (Procrustes) analysis, using the following algorithm:
inputting: node embedding Z1、Z2Adjacent matrix A1、A2Of a hyperparameter λ0、μ、η
And (3) outputting: q
Initializing a permutation matrix:
Figure RE-GDA0003519421730000121
initialization of a transformation matrix: q ═ UVT
Figure RE-GDA0003519421730000122
When T is 1 → T, run
Figure RE-GDA0003519421730000123
Figure RE-GDA0003519421730000124
U∑VT=SVD(Q-ηGt);Q=UVT
Thereafter, in each iteration t, P is first calculated by solving the walerstein (Wasserstein) problem by using the simhonen (Sinkhorn) algorithm and regularization based on the current transformation matrix Qt. Then, from Wasserstein-Proklusters (Wasserstein products)
Figure RE-GDA0003519421730000125
Calculating a gradient G for QtAnd uses it to update the transformation matrix, thereby solving the probukast (Procustes) problem. The mapping initialization procedure based on the Wasserstein-Procrustes problem solution is shown in the above algorithm. Based on the algorithm, Z is1Is transformed into Z2The mapping function of the embedding space of (2) is initialized to the transformation matrix. Similarly, slave Z may also be initialized2To Z1The mapping function of (2).
The system adopts a cycle-consistent adversarial model based on the CycleGAN framework to optimize the projection matrix, so that the mapping function is encouraged to produce distribution-invariant network embeddings. The Sinkhorn distance between the mapped structural embeddings is used to guide the training of the cycle-consistent adversarial model. Given G1 and G2, the model learns two generators, F12: Z1 → Z2 and F21: Z2 → Z1, whose aim is to project one vector space into the other at the distribution level. For example, to convert G1 to G2, the generator F12 is responsible for converting the space Z1 into Z1′, with the aim of minimizing the divergence between the distributions of Z1′ and Z2.
In a specific implementation, the projection matrices and their results are adjusted by the cycle-consistent adversarial network: two mapping matrices are learned, and one forward projection followed by one backward projection restores the embedding, so that the discrepancy loss of the two projections is reduced as much as possible and the points of the two networks are aligned one to one. The loss function is calculated from the Sinkhorn distance.
The Sinkhorn distance is a useful distance measure between probability distributions. It is used here to measure the divergence between Z1′ and Z2 and between Z2′ and Z1.
Formally, the Sinkhorn distance between matrices X and Y is defined as:

d_sh(X, Y) = min_{P ∈ U_α(r,c)} ⟨P, M⟩

where ⟨·,·⟩ denotes the Frobenius dot product, M is the matrix of distances between X and Y, and U_α(r, c) is the transport polytope with an entropy constraint, defined as:

U_α(r, c) = { P ∈ R_+^(n×n) | P·1 = r, P^T·1 = c, h(P) ≥ h(r) + h(c) - α }

where h(·) is the entropy function, r and c are the sample weights in the source and target domains, and α is a hyper-parameter. Various distance measures can be used to build the distance matrix M; here the cosine distance is used:

M_ij = 1 - (x_i · y_j) / (‖x_i‖ ‖y_j‖)

where x_i and y_j are the ith and jth row vectors of X and Y. The entries of this distance matrix are evidently limited to the range [0, 2]. The Sinkhorn distance is then calculated according to the following algorithm, where λ1 is the Lagrange multiplier of the entropy constraint; ⟨·,·⟩ again denotes the inner product calculation, and r and c are set proportional to the node degrees.
Calculation of the Sinkhorn distance:

Input: M, r, c, λ1, T
Output: d_sh(X, Y)
K = exp(-λ1·M) (elementwise);
initialize v = 1;
For t = 1 → T, run:
  μ = r ./ (K·v);
  v = c ./ (K^T·μ);
d_sh(X, Y) = ⟨diag(μ)·K·diag(v), M⟩.
After the Sinkhorn distance is defined, the adversarial loss function of the generator F12 is defined as:

L_adv(F12) = d_sh(F12(Z1), Z2)

The generator F21 is used to convert Z2 to Z1, and its adversarial loss function is calculated analogously:

L_adv(F21) = d_sh(F21(Z2), Z1)

With these adversarial losses, each generator attempts to convert the embedding of one network to be similar to the embedding of the other network. To further reduce the possible mapping space and guide the model to learn a one-to-one mapping, the system jointly trains the two generators by adding a cycle-consistency constraint with a reconstruction loss. The embedding of G1 is transformed by F21 to generate the reconstructed node embedding Z1″ = F21(F12(Z1)); similarly, the reconstructed node embedding Z2″ of G2 is obtained from F12. The reconstruction loss function is then calculated as:

L_rec = ‖Z1″ - Z1‖_1 + ‖Z2″ - Z2‖_1

where ‖·‖_1 denotes the L1 distance. Finally, the complete loss function of the generators is:

L = L_adv(F12) + L_adv(F21) + λ2·L_rec

where λ2 is the relative weight of the reconstruction loss.
The linear generators are jointly optimized with the Adam optimizer by minimizing the above loss function. With the initialized mapping function, training of the cycle-consistent adversarial model is faster and does not require attribute embeddings. Under the continuous supervision of the Sinkhorn distribution distance and the cycle-consistency constraint, the model converges to an optimal solution and learns a one-to-one mapping function.
Node alignment calculation module 3: calculates the similarity of the projection matrices to obtain a similarity matrix, and matches nodes of the source network and the target network according to the similarity matrix.
Optionally, the node matching adopts collective correspondence assignment based on a greedy strategy.
Optionally, the Euclidean distances between the projection matrices of the source network and the target network are calculated to obtain a distance matrix, and the node assignment dictionary is computed from the distance matrix.
In a specific implementation, similarity is calculated over the projected embedded structures to obtain a similarity matrix, and node matching is then performed on the similarity matrix; node alignment is thus calculated from the similarity between the two networks.
In a specific implementation, the following collective correspondence assignment algorithm based on the greedy strategy is adopted:

Input: network distance matrix S
Output: node assignment dictionary A
For t = 1 → n, run:
  find the minimum entry of S and record its row and column indices i and j;
  assign the ith node of G1 to the jth node of G2, adding the pair to A;
  set the elements of the ith row and jth column of S to infinity.

In this implementation, the Euclidean distance between Z1 and Z2 is calculated to obtain the distance matrix S.
In summary, the unsupervised network alignment method and system of the present application first learn the embeddings of the different networks based on an unsupervised Deep Graph Infomax (DGI) model, which preserves useful global and local structural proximity in the embeddings. The mapping function is then initialized by iterative optimization of the Wasserstein-Procrustes problem to facilitate learning of the adversarial model. A cycle-consistent adversarial learning model is then used with the goal of minimizing the Sinkhorn distance between the mapped embedding spaces. Through the above steps, the mapping function between the different embedding spaces is learned. Finally, to efficiently learn one-to-one correspondences in large-scale networks, a collective correspondence assignment method is used.
The present application helps identify more accurate corresponding nodes by preventing the many-to-one alignments caused by the hubness problem, which disrupt alignment accuracy. As shown in FIG. 5, experiments at different noise levels demonstrate the superiority of the method on the unsupervised network alignment problem. Based on the knowledge fusion of the aligned networks, accurate prediction and prescription of the networks can be realized through a parallel learning method, making a network system more intelligent and controllable. The embedding-based performance of the present application is better than that of optimization-based baselines, because the embedding effectively captures the proximity between structural nodes. The present application is also highly robust to noise: its accuracy degrades less than that of other comparable methods. The application can be extended to many types of networks, and different types of information (e.g., relational information) can be fused into the network embedding to improve it. Furthermore, combining unsupervised domain translation with downstream network transfer learning tasks would be another direction for the present application.
Additionally, an unsupervised network alignment method described in connection with fig. 1 may be implemented by an electronic device. Fig. 3 is a block diagram of an electronic device according to an embodiment of the invention.
The electronic device may comprise a processor 61 and a memory 62 in which computer program instructions are stored.
Specifically, the processor 61 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 62 may include mass storage for data or instructions. By way of example and not limitation, memory 62 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 62 may include removable or non-removable (or fixed) media, where appropriate. The memory 62 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 62 is non-volatile memory. In particular embodiments, memory 62 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or FLASH memory, or a combination of two or more of these, where appropriate. The RAM may be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM), where the DRAM may be Fast Page Mode DRAM (FPMDRAM), Extended Data Output DRAM (EDODRAM), Synchronous DRAM (SDRAM), and the like.
The memory 62 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 61.
The processor 61 implements any of the above embodiments of the unsupervised network alignment method by reading and executing computer program instructions stored in the memory 62.
In some of these embodiments, the electronic device may also include a communication interface 63 and a bus 60. As shown in fig. 3, the processor 61, the memory 62, and the communication interface 63 are connected via a bus 60 to complete communication therebetween.
The communication interface 63 implements data communication with components such as external devices, image/data acquisition devices, databases, external storage, and image/data processing workstations.
The bus 60 includes hardware, software, or both, coupling the components of the electronic device to one another. Bus 60 includes, but is not limited to, at least one of the following: a Data Bus, an Address Bus, a Control Bus, an Expansion Bus, and a Local Bus. By way of example and not limitation, bus 60 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 60 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may perform an unsupervised network alignment method in the embodiments of the present application.
In addition, in combination with the unsupervised network alignment method in the foregoing embodiments, the embodiments of the present application may provide a computer-readable storage medium to implement. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the above-described embodiments of an unsupervised network alignment method.
The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The technical features of the embodiments described above may be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An unsupervised network alignment method, comprising:
an embedded structure calculation step of acquiring the adjacency matrices of a source network and a target network, and learning the embedded structures of the source network and the target network respectively through two graph neural networks;
a projection matrix calculation step of calculating the projection matrices of the source network and the target network from their adjacency matrices, embedded structures and hyper-parameters; and
a node alignment calculation step of calculating the similarity of the projection matrices to obtain a similarity matrix, and matching nodes of the source network and the target network according to the similarity matrix.
2. The unsupervised network alignment method of claim 1, wherein the projection matrix calculating step further comprises:
and when the projection matrix is calculated, carrying out forward projection and backward projection reduction in sequence.
3. The unsupervised network alignment method of claim 2, wherein the projection matrix calculating step further comprises:
the loss function is calculated by the sinkhorn distance.
4. The unsupervised network alignment method of claim 3, wherein the projection matrix calculating step further comprises:
and optimizing the projection matrix by adopting a cycle consistent countermeasure model based on a cycleGAN framework.
5. The unsupervised network alignment method of claim 1, wherein the embedding structure calculating step further comprises:
computing the embedded structure using a DGI machine learning model.
6. The unsupervised network alignment method of claim 5, wherein the embedding structure calculating step further comprises:
and respectively adding and averaging the embedded structure values of all nodes in the source network and the target network to generate the embedded structure of the whole graph of the source network and the target network.
7. The unsupervised network alignment method of claim 5, wherein the embedding structure calculating step further comprises:
training the DGI machine learning model using binary cross entropy loss.
8. The unsupervised network alignment method of claim 1, wherein the node alignment calculation step further comprises:
and the node matching adopts collective corresponding distribution based on a greedy strategy.
9. The unsupervised network alignment method of claim 8, wherein the node alignment calculation step further comprises:
and calculating the Euclidean distance between the projection matrixes of the source network and the target network to obtain a distance matrix, and calculating according to the distance matrix to obtain a node distribution dictionary.
10. An unsupervised network alignment system, comprising:
an embedded structure calculation module for acquiring the adjacency matrices of a source network and a target network and learning the embedded structures of the source network and the target network respectively through two graph neural networks;
a projection matrix calculation module for calculating the projection matrices of the source network and the target network from their adjacency matrices, embedded structures and hyper-parameters; and
a node alignment calculation module for calculating the similarity of the projection matrices to obtain a similarity matrix and matching nodes of the source network and the target network according to the similarity matrix.
CN202111500307.4A 2021-12-09 2021-12-09 Unsupervised network alignment method and system Pending CN114372505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111500307.4A CN114372505A (en) 2021-12-09 2021-12-09 Unsupervised network alignment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111500307.4A CN114372505A (en) 2021-12-09 2021-12-09 Unsupervised network alignment method and system

Publications (1)

Publication Number Publication Date
CN114372505A true CN114372505A (en) 2022-04-19

Family

ID=81139903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111500307.4A Pending CN114372505A (en) 2021-12-09 2021-12-09 Unsupervised network alignment method and system

Country Status (1)

Country Link
CN (1) CN114372505A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861822A (en) * 2023-02-07 2023-03-28 海豚乐智科技(成都)有限责任公司 Target local point and global structured matching method and device
CN116738201A (en) * 2023-02-17 2023-09-12 云南大学 Illegal account identification method based on graph comparison learning
CN116738201B (en) * 2023-02-17 2024-01-16 云南大学 Illegal account identification method based on graph comparison learning

Similar Documents

Publication Publication Date Title
CN108171320B (en) Image domain conversion network and conversion method based on generative countermeasure network
US10275473B2 (en) Method for learning cross-domain relations based on generative adversarial networks
Zhang et al. Neural collaborative subspace clustering
CN110163258B (en) Zero sample learning method and system based on semantic attribute attention redistribution mechanism
Xia et al. Supervised hashing for image retrieval via image representation learning
Lin et al. Supervised online hashing via hadamard codebook learning
CN107273927B (en) Unsupervised field adaptive classification method based on inter-class matching
CN114372505A (en) Unsupervised network alignment method and system
Jiang et al. When to learn what: Deep cognitive subspace clustering
CN110363068B (en) High-resolution pedestrian image generation method based on multiscale circulation generation type countermeasure network
US20220253639A1 (en) Complementary learning for multi-modal saliency detection
Amiri et al. Efficient multi-modal fusion on supergraph for scalable image annotation
CN112734049A (en) Multi-initial-value meta-learning framework and method based on domain self-adaptation
Wang et al. Generative partial multi-view clustering
Wei et al. Center-aligned domain adaptation network for image classification
Pradhyumna A survey of modern deep learning based generative adversarial networks (gans)
CN114926742A (en) Loop detection and optimization method based on second-order attention mechanism
CN115880556B (en) Multi-mode data fusion processing method, device, equipment and storage medium
Wu et al. Multi-instance learning from positive and unlabeled bags
KR20210064817A (en) Method for Transfer Learning between Different Deep Learning Models
Shi et al. Relative Entropic Optimal Transport: a (Prior-aware) Matching Perspective to (Unbalanced) Classification
CN114417251A (en) Retrieval method, device, equipment and storage medium based on hash code
Shapira et al. Cross‐Collection Map Inference by Intrinsic Alignment of Shape Spaces
CN114332469A (en) Model training method, device, equipment and storage medium
Javaheripi et al. Swann: Small-world architecture for fast convergence of neural networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230223

Address after: 266000 26F, block B, Chuangye building, high tech Zone, Qingdao, Shandong Province

Applicant after: QINGDAO ACADEMY OF INTELLIGENT INDUSTRIES

Applicant after: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.

Address before: 266000 26F, block B, Chuangye building, high tech Zone, Qingdao, Shandong Province

Applicant before: QINGDAO ACADEMY OF INTELLIGENT INDUSTRIES

TA01 Transfer of patent application right