CN115511076A - Network representation learning method, device, equipment and storage medium - Google Patents

Network representation learning method, device, equipment and storage medium

Info

Publication number
CN115511076A
Authority
CN
China
Prior art keywords
node
network
target
neural network
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211196991.6A
Other languages
Chinese (zh)
Inventor
张春会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOE Technology Group Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd filed Critical BOE Technology Group Co Ltd
Priority to CN202211196991.6A
Publication of CN115511076A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/002 Biomolecular computers, i.e. using biomolecules, proteins, cells
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Organic Chemistry (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a network representation learning method, apparatus, device and storage medium, wherein the network representation learning method uses a target neural network to perform network representation learning in a target field. The target neural network comprises N levels of first neural networks and N-1 levels of second neural networks; the output of the t-th-level first neural network is the input of the t-th-level second neural network, and the output of the (t-1)-th-level second neural network is the input of both the t-th-level first neural network and the t-th-level second neural network. The network representation learning method comprises the following steps: acquiring heterogeneous network data; inputting the heterogeneous network data into the target neural network to obtain a first network representation; determining training data based on the heterogeneous network data and the first network representation; and training a logistic regression model and the target neural network based on the training data, and, when the training of the logistic regression model and the target neural network is completed, using a second network representation output by the target neural network as the target network representation of the target field task.

Description

Network representation learning method, device, equipment and storage medium
Technical Field
The disclosure belongs to the technical field of network representation learning and graph neural networks, and particularly relates to a network representation learning method, device, equipment and storage medium.
Background
A biological network is a way of representing a biological system as a graph. The nodes in the biological network are elements in the biological system, and the edges in the biological network represent the association relations among the elements. For example, in a protein-protein interaction network, proteins constitute the nodes of the biological network, and interactions between proteins constitute the edges of the biological network. A biological network reflects the complex and intricate associations in a biological system, is of great help for understanding the biological system, and can also be used to solve various tasks in the life science field, so researching and analyzing biological networks has high application value. The key to researching a biological network is how to extract its feature information, and network representation learning has high potential in network feature extraction.
Disclosure of Invention
The present disclosure is directed to at least one of the technical problems in the prior art, and provides a network representation learning method, apparatus, device and storage medium.
In a first aspect, the technical solution adopted to solve the technical problem of the present disclosure is a network representation learning method, which uses a pre-established target neural network to learn the network representation in the target field. The target neural network comprises N levels of first neural networks and N-1 levels of second neural networks; the intermediate node representation output by the t-th-level first neural network is the input of the corresponding t-th-level second neural network; the target hidden state information output by the (t-1)-th-level second neural network is the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers;
the network representation learning method comprises the following steps:
acquiring heterogeneous network data in a target field; the heterogeneous network data comprises data of each node in the heterogeneous network;
inputting the heterogeneous network data into the target neural network, and obtaining a first network representation output by the target neural network through processing of the first neural networks at all levels and the second neural networks at all levels;
determining training data based on the heterogeneous network data and the first network representation;
and training a logistic regression model and the target neural network based on the training data, wherein when the logistic regression model and the target neural network are trained, a second network representation output by the target neural network is used as a target network representation of the target field task.
In some embodiments, the obtaining heterogeneous network data in the target domain includes:
acquiring at least one network in a target field from a preset database; the network in the target field comprises a plurality of different types of nodes;
and integrating at least one acquired network in the target field based on the type of each node in the network in each target field, building a heterogeneous network corresponding to the target field task, and acquiring heterogeneous network data.
In some embodiments, determining an intermediate node representation of the first neural network output at stage t comprises:
determining, based on the heterogeneous network data, type-level information corresponding to the type of each node in the heterogeneous network by adopting a double-layer graph attention mechanism, and determining node-level information based on the type-level information corresponding to the type of each node;
acquiring predetermined structural information of each node; the structural information comprises association relation information among the nodes;
based on the node level information and the structural information, target attention information between nodes is determined to determine an intermediate node representation of the first neural network output at level t.
In some embodiments, determining type level information corresponding to a target type includes:
determining a first node representation of each neighbor node adjacent to a particular node based on target hidden state information output by the second neural network of level (t-1);
determining a target type representation of a target type based on a first node representation of each neighbor node adjacent to a particular node, a type of each of the neighbor nodes, and an adjacency matrix of the heterogeneous network data;
determining a type-level attention score corresponding to the target type based on the target type representation and a second node representation of the particular node;
and determining type level information corresponding to the target type based on the type level attention scores respectively corresponding to the types and the type level attention score corresponding to the target type.
In some embodiments, the determining the node-level information for one target neighbor node adjacent to the particular node comprises:
determining a node-level attention score for the target neighbor node based on the first node representation of each of the neighbor nodes adjacent to the particular node, the second node representation of the particular node, and the type-level information corresponding to the type of each of the neighbor nodes;
determining node-level information for the target neighbor node based on the node-level attention scores of the neighbor nodes adjacent to the particular node and the node-level attention score of the target neighbor node.
In some embodiments, the step of determining the structural information of each node comprises:
homogenizing the heterogeneous network to obtain a homogeneous network;
and determining the structural similarity among the nodes in the homogeneous network by a preset similarity algorithm based on the third node representation of each node in the homogeneous network so as to obtain the structural level information of each node.
In some embodiments, the determining target attention information between nodes based on the node level information and the structure level information comprises:
acquiring a first fusion weight of the node-level information and a second fusion weight of the structure-level information;
determining target attention information between nodes based on the node-level information, the first fusion weight of the node-level information, the structure-level information, and the second fusion weight of the structure-level information.
In some embodiments, said determining an intermediate node representation of said first neural network output at stage t comprises:
determining node representation matrixes respectively corresponding to various types based on target hidden state information output by the (t-1) th-level second neural network;
determining intermediate node representation output by the first neural network at the t-th level based on target attention information among nodes and node representation matrixes respectively corresponding to the types; the node representation matrix includes node representations of respective nodes under the corresponding type.
In some embodiments, the second neural network is a gated recurrent unit (GRU);
determining target hidden state information output by the second neural network at the t-th stage, including:
determining update gate data and reset gate data of a t-th stage of the GRU based on intermediate node information output by the first neural network of the t-th stage and target hidden state information output by the GRU of the (t-1) th stage received by the GRU of the t-th stage;
determining candidate hidden state information of the GRU of the t-th stage based on the reset gate data, the intermediate node information output by the first neural network of the t-th stage, and the target hidden state information output by the GRU of the (t-1) th stage;
determining target hidden state information for the GRU output of the t-th stage based on the candidate hidden state information, the update gate data, and the target hidden state information for the GRU output of the (t-1) th stage.
In some embodiments, said determining training data based on said heterogeneous network data and said first network representation comprises:
screening out a preset number of node pairs from the heterogeneous network, judging whether an association relationship exists between nodes in the node pairs based on the heterogeneous network data, and setting preset labels for the node pairs with the association relationship;
based on the first network representation, tagged node pair data is determined and used as training data.
In some embodiments, the training a logistic regression model and the target neural network based on the training data and, when the training of the logistic regression model and the target neural network is completed, using a second network representation output by the target neural network as the target network representation of the target field task includes:
inputting the training data into the logistic regression model to perform link prediction to obtain a link prediction result;
constructing a weighted loss value based on the link prediction result, and training the logistic regression model and the target neural network by carrying out weighted back propagation on the weighted loss value until the weighted loss value is converged;
and when the training of the logistic regression model and the target neural network is completed, the second network representation output by the target neural network is used as the target network representation of the target field task.
In a second aspect, the present disclosure also provides a network representation learning apparatus, including: a data acquisition module, a target neural network, a sample determination module and a logistic regression model. The target neural network comprises N levels of first neural networks and N-1 levels of second neural networks; the intermediate node representation output by the t-th-level first neural network is the input of the corresponding t-th-level second neural network; the target hidden state information output by the (t-1)-th-level second neural network is the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers;
the data acquisition module is configured to acquire heterogeneous network data in a target field; the heterogeneous network data comprises data of each node in the heterogeneous network;
the target neural network is configured to receive the heterogeneous network data, and output a first network representation after being processed by the first neural networks at all levels and the second neural networks at all levels;
the sample determination module configured to determine training data based on the heterogeneous network data and the first network representation;
and the logistic regression model is configured to train the target neural network and the self based on the training data, and when the training of the target neural network and the self is completed, a second network representation output by the target neural network is used as a target network representation of the target field task.
In a third aspect, an embodiment of the present disclosure further provides a computer device, where the computer device includes: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is running, the machine-readable instructions when executed by the processor performing the steps of the network representation learning method as in any one of the above embodiments.
In a fourth aspect, the disclosed embodiments also provide a non-transitory computer-readable storage medium, where a computer program is stored on the non-transitory computer-readable storage medium, and the computer program, when executed by a processor, performs the steps of the network representation learning method according to any one of the foregoing embodiments.
Drawings
Fig. 1 is a schematic diagram of a network architecture of a target neural network provided in an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a network representation learning method according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of an exemplary network integration provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a network representation learning apparatus according to an embodiment of the disclosure;
fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Unless otherwise defined, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and the like in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. Also, the use of the terms "a," "an," or "the" and similar referents do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
Reference to "a plurality or a number" in this disclosure means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the related art, existing heterogeneous network representation learning methods mainly fall into two categories. One is methods based on random walks, for example, machine learning methods using algorithms such as metapath2vec, HERec and HIN2vec; the other is methods based on graph neural networks, mainly methods based on heterogeneous graph attention networks, such as methods combining graph attention mechanisms using HAN, HGAT, HetGNN, MAGNN and the like. The methods based on heterogeneous graph attention networks follow the message passing paradigm, i.e., a paradigm of aggregating neighbor-node information to update central-node information; they are generally set to T iterations, and information aggregation and updating are performed once for all nodes in each iteration.
However, the inventors have found that at least the following problems exist in the related art: 1) When an existing graph attention network performs information aggregation on a node, only the first-order neighborhood of the node is considered, and the structural information in the graph is not fully utilized, because the structural information of the nodes in the neighborhood is lost during information aggregation. 2) If higher-order neighborhoods are considered by deep stacking, the problem of over-smoothing arises. The reason is that the graph attention network essentially aggregates the information of neighbor nodes when performing information aggregation on a node, and for any node in the graph, the information of higher-order neighbor nodes is aggregated every time the features of the node are updated. If the order of the highest-order neighbor node is referred to as the aggregation radius of the node, it can be found that as the number of layers of the graph attention network increases, the aggregation radius of the node also increases; once a certain threshold is reached, the nodes covered by the aggregation are almost the same as the nodes of the whole graph, i.e., over-smoothing occurs. 3) In the related art, only the result of the information aggregation of the previous layer can be used as the input of the next layer to obtain a new aggregation result, and the fusion of the graph features extracted by different information aggregation layers is not considered.
Based on this, the present disclosure provides, among other things, a network representation learning method that substantially obviates one or more of the problems due to limitations and disadvantages of the related art. Specifically, the network representation learning method uses a pre-built target neural network to learn the network representation in the target field. Fig. 1 is a schematic diagram of the network architecture of a target neural network provided by an embodiment of the present disclosure. As shown in fig. 1, each of the first neural networks of the first N-1 levels (i.e., all except the first neural network of the N-th level) corresponds one-to-one with a second neural network of the first N-1 levels; the intermediate node representation output by the t-th-level first neural network is the input of the corresponding t-th-level second neural network; the target hidden state information output by the (t-1)-th-level second neural network is the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers.
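Illustratively, the cascaded data flow of fig. 1 can be sketched as follows in Python/NumPy; the placeholder functions first_nn and second_nn, the toy shapes, and the arithmetic inside them are assumptions made only to show the wiring between levels, not the actual networks of the disclosure.

```python
import numpy as np

def first_nn(level, prev_hidden):
    """Placeholder for the t-th-level first neural network (graph attention step).

    Takes the target hidden state information output by the previous-level second
    neural network (or the heterogeneous network data G when level == 1) and
    returns an intermediate node representation H_t. Identity is used here only
    to illustrate the wiring of fig. 1 (assumption)."""
    return prev_hidden

def second_nn(level, h_t, prev_hidden):
    """Placeholder for the t-th-level second neural network (GRU step).

    Takes the intermediate node representation H_t of the same level and the
    target hidden state information h_{t-1} of the previous level, and returns
    the target hidden state information h_t (toy mixing, assumption)."""
    return 0.5 * h_t + 0.5 * prev_hidden

def target_network_forward(G_features, N=4):
    """Cascade of N first neural networks and N-1 second neural networks."""
    h_prev = G_features                      # h_0: node features of G
    for t in range(1, N + 1):
        H_t = first_nn(t, h_prev)            # intermediate node representation H_t
        if t < N:                            # levels 1 .. N-1 have a paired GRU
            h_prev = second_nn(t, H_t, h_prev)
        else:                                # level N outputs the first network representation H_N
            return H_t

# toy run: 5 nodes with 8-dimensional features
H_N = target_network_forward(np.random.rand(5, 8))
print(H_N.shape)
```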
The network representation learning method provided in the embodiment of the present disclosure is described in detail below. Fig. 2 is a schematic flow diagram of the network representation learning method provided in an embodiment of the present disclosure; as shown in fig. 2, the method includes steps S11 to S14:
s11, heterogeneous network data in the target field are obtained.
The heterogeneous network data comprises data of each node in the heterogeneous network.
In this step, the target field may be, for example, the biological field, and the target field tasks may include link prediction tasks in the biological field, such as prediction of comorbid associations between disease nodes, mining of disease-related genes, and mining of potential associations between diseases and miRNAs. Alternatively, the target field task may also include a link prediction task in another field, for example, a task of classification prediction of nodes in a heterogeneous network, classification prediction of the heterogeneous network, or association prediction of nodes in the heterogeneous network, which is not specifically limited or exhaustively listed in this embodiment of the present disclosure.
For convenience of understanding, the following description takes a heterogeneous network integrated from various existing biological networks in the biological field as an example.
The heterogeneous network data comprise the data of each node in the heterogeneous network, mainly including the content information and structure information of the nodes, wherein the content information comprises the feature vector of each node, and the structure information comprises the association relation information among the nodes.
And S12, inputting the heterogeneous network data into the target neural network, and obtaining a first network representation output by the target neural network through the processing of the first neural networks at all levels and the second neural networks at all levels.
As shown in fig. 1, the heterogeneous network data G are input into the target neural network; specifically, the heterogeneous network data are input into the first-level first neural network and the first-level second neural network, respectively. The first-level first neural network performs logic processing based on the heterogeneous network data G and outputs the intermediate node representation H_1 of the heterogeneous network data G; thereafter, the intermediate node representation H_1 serves as the input of the first-level second neural network, which performs logic processing based on the intermediate node representation H_1 and outputs the target hidden state information h_1. By analogy, the t-th-level first neural network performs logic processing based on the target hidden state information h_{t-1} and outputs the intermediate node representation H_t; the t-th-level second neural network performs logic processing based on the intermediate node representation H_t and the target hidden state information h_{t-1} output by the (t-1)-th-level second neural network, and outputs the target hidden state information h_t. Finally, the N-th-level first neural network performs logic processing based on the target hidden state information h_{N-1} and outputs the first network representation H_N. The first network representation H_N is computed according to the formulas described below and includes a node representation of each node.
And S13, determining training data based on the heterogeneous network data and the first network representation.
In this step, partial node pairs are screened out from the heterogeneous network data G. A given node pair P_(i,j) is labeled 1 or 0 according to whether an edge exists between its nodes, so as to obtain the label of the corresponding node pair, and, based on the first network representation H_N, the labeled node-pair data are used as training data.
And S14, training the logistic regression model and the target neural network based on the training data, and when the logistic regression model and the target neural network are trained, using a second network representation output by the target neural network as a target network representation of the target field task.
Specifically, the training data are used as the input of the logistic regression model, the loss of the whole model composed of the logistic regression model and the target neural network can be calculated using a cross-entropy loss function, and, based on the calculated loss, the parameters are updated through back propagation to obtain an optimal result.
It should be noted that the process from S12 to S14 is iterative: the logistic regression model and the target neural network are trained based on the training data; then the learnable parameters in the logistic regression model and the target neural network are updated through back propagation based on the calculated loss; and then a new round of the cycle is performed, i.e., based on the logistic regression model and the target neural network after the parameter update, S12 is executed to obtain a new first network representation, S13 is executed to obtain new training data, and S14 is executed based on the new training data to continue training the logistic regression model and the target neural network. This process is repeated until the training of the logistic regression model and the target neural network is completed.
The embodiment of the present disclosure applies the second neural network at the graph level, sets first neural networks and second neural networks in one-to-one correspondence, takes the intermediate node representation output by the t-th-level first neural network as the input of the corresponding t-th-level second neural network, and takes the target hidden state information output by the (t-1)-th-level second neural network as the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively. In this way, the fusion of multi-level graph features (i.e., the intermediate node representations output by the first neural networks of the respective levels) is realized, the node representation of the whole graph is updated using the fused multi-level graph features, the advantages of graph features of different levels complement each other, and the shortcoming of the related art that the fusion of graph features of different levels is not considered is remedied. Meanwhile, the graph features obtained at lower levels are retained by the second neural networks and passed to the higher-level graph neural networks, so that the over-smoothing problem of the graph neural network can be alleviated to a certain extent.
The first neural network in the embodiment of the present disclosure may be a Heterogeneous Graph Attention network (ISHGAT) that introduces Structural Information. The second neural network may be a Gated Recurrent Unit (GRU).
Each of the above steps S11 to S14 will be described in detail.
For step S11, specifically, the network in at least one target domain may be acquired from a preset database. The network in the target domain comprises a plurality of different types of nodes. And integrating the acquired networks in at least one target field based on the types of all nodes in the network in each target field to build a heterogeneous network corresponding to the target field task and obtain heterogeneous network data.
Taking the target field as the biological field as an example, the preset database can be a public database containing various biological networks, such as the Pharmacogenomics Knowledge Base (PharmGKB), the Human Protein Reference Database (HPRD), the database of experimentally verified miRNA-target interactions (MTI), i.e., the miRTarBase database, and the miR2Disease database.
The biological networks can include at least four networks, such as a disease-gene association network, a gene-gene interaction network, a gene-micro ribonucleic acid (miRNA) association network, and a miRNA-disease association network.
Specifically, the disease-gene association network can be obtained from the PharmGKB database; a protein-protein interaction network can be obtained from the HPRD database, and since the nodes in the protein-protein interaction network essentially correspond to genes, the protein-protein interaction network can be converted into a gene-gene interaction network; the gene-miRNA association network can be obtained from the miRTarBase database; and the miRNA-disease association network can be obtained from the miR2Disease database. The four biological networks include three types of nodes, namely disease, gene and miRNA. The four biological networks are then integrated by taking identical nodes of the same type as links, to form a heterogeneous network.
Exemplarily, fig. 3 is a schematic diagram of an exemplary network integration provided in an embodiment of the present disclosure, as shown in fig. 3, where a represents a disease, B represents a gene, and C represents miRNA, and a disease type may include a plurality of different nodes, for example, a node A1, a node A2, and a node A3; the gene type may comprise a plurality of different nodes B1 and B2; miRNA types may contain multiple different nodes C1 and C2. The same nodes in the disease-gene association network, the gene-gene interaction network, the gene-miRNA association network and the miRNA-disease association network are integrated into one node, so that the connection among different biological networks is realized.
The biological networks include three types of nodes, i.e., disease, gene, and miRNA. In one possible implementation, the embodiments of the present disclosure may embed the word describing a disease in a medical encyclopedia into a vector as the feature vector of the disease node. The sequence information of a gene is encoded into a vector, with the k-mer as the basic unit, as the feature vector of the gene node. Similarly, the sequence information of a miRNA is encoded into a vector, with the k-mer as the basic unit, as the feature vector of the miRNA node. The feature vector of each node in the integrated heterogeneous network is taken as the feature vector of that node in the heterogeneous network data G.
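Illustratively, the integration of the source networks into one heterogeneous network can be sketched as follows; the edge lists, the (type, id) node keys and the adjacency-dictionary format are assumptions used only to show how identical nodes of the same type link the source networks.

```python
from collections import defaultdict

# toy edge lists standing in for the four source networks;
# A = disease, B = gene, C = miRNA (as in fig. 3); all values are assumptions
disease_gene  = [(("A", "A1"), ("B", "B1")), (("A", "A2"), ("B", "B2"))]
gene_gene     = [(("B", "B1"), ("B", "B2"))]
gene_mirna    = [(("B", "B2"), ("C", "C1"))]
mirna_disease = [(("C", "C1"), ("A", "A3"))]

def integrate(*edge_lists):
    """Merge typed edge lists into one heterogeneous adjacency structure.

    Nodes are identified by (type, id); the same (type, id) appearing in
    different source networks is merged into a single node, which links the
    source networks together."""
    adj = defaultdict(set)
    for edges in edge_lists:
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj

hetero = integrate(disease_gene, gene_gene, gene_mirna, mirna_disease)
print(sorted(hetero.keys()))          # all merged nodes
print(sorted(hetero[("B", "B2")]))    # gene B2 connects disease, gene and miRNA networks
```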
For step S12, given a specific node in the heterogeneous network data G, denoted as specific node v, adjacent nodes of different types may have different effects on the specific node v, and adjacent nodes of the same type may also have different importance; it is therefore desirable to capture the different importance at both the node level and the type level. The embodiment of the present disclosure adopts a double-layer graph attention mechanism, i.e., type-level attention and node-level attention, to calculate the final attention score of the first neural network, i.e., the target attention information θ_ij.
In a specific implementation, the determination of the intermediate node representation output by one level of the multi-level first neural network, for example the t-th level, is described as an example. In the processing of the first-level first neural network, since the first-level first neural network has no upper-level second neural network, its input data are the heterogeneous network data G. The processing principle of each level of the first neural network is the same, so for the processing of the first-level first neural network, reference may be made to the following description of the processing of the t-th-level first neural network, and repeated details are not repeated.
Taking the processing procedure of the t-th-stage first neural network as an example, the input data of the t-th-stage first neural network is the output data of the (t-1) -th-stage second neural network, and refer to the data transmission procedure shown in fig. 1 specifically. The method for determining the intermediate node representation of the output of the t-th-stage first neural network specifically comprises the following steps S12-1 to S12-3:
S12-1, based on the heterogeneous network data, determining type-level information corresponding to the type of each node in the heterogeneous network by adopting a double-layer graph attention mechanism, and determining the node-level information of each node based on the type-level information corresponding to the type of each node.
In this step, the t-th-level first neural network receives the target hidden state information h_{t-1} output by the (t-1)-th-level second neural network. It should be noted that the input of each first neural network and the input of each second neural network are both obtained by logic processing based on the input of the target neural network; therefore, the target hidden state information h_{t-1} output by the (t-1)-th-level second neural network is obtained based on the heterogeneous network data. The t-th-level first neural network receives the target hidden state information h_{t-1} and adopts a double-layer graph attention mechanism, in which one layer is used to learn the weights of neighbor nodes of different types, i.e., to determine type-level information, and the other layer is used to capture the importance of different neighbor nodes, i.e., to determine node-level information based at least on the determined type-level information.
The type-level information characterizes the influence of the type of the neighbor node u adjacent to the specific node v on the specific node v. The node-level information characterizes the effect of a neighboring node u adjacent to the particular node v on the particular node.
The nodes in the heterogeneous network include a number of different types, denoted as E. Determining the type-level information corresponding to one target type e specifically includes steps S201 to S204:
S201, determining the first node representation of each neighbor node adjacent to the specific node based on the target hidden state information output by the (t-1)-th-level second neural network.
In this step, the target hidden state information h_{t-1} includes the node representation of each node. For a given specific node v, the neighbor nodes m adjacent to the specific node v are determined and form a neighbor node set N_m. The target hidden state information h_{t-1} includes the first node representation of each neighbor node m, denoted h_m.
S202, determining target type representation of the target type based on first node representation of each neighbor node adjacent to the specific node, the type of each neighbor node and an adjacency matrix of heterogeneous network data.
Based on the type of each neighbor node m of the specific node v, the neighbor nodes whose type is the target type e are selected from the neighbor nodes of the specific node v to form the neighbor node set N_v^e of type e. The first node representation of a neighbor node u in N_v^e is denoted h_u. The target type representation of the target type e is the weighted sum of the first node representations h_u of the type-e neighbor nodes u of the specific node v. The target type representation h_e of the target type e is determined according to formula one:

h_e = Σ_{u∈N_v^e} Â_{vu} · h_u    (formula one)

where Â is the normalized adjacency matrix obtained from A′ and the degree matrix M, A′ = A + I denotes the self-connected adjacency matrix including the specific node v, A denotes the adjacency matrix of the heterogeneous network data G, I is the identity matrix, and M represents the degree matrix of all nodes.
S203, determining a type level attention score corresponding to the target type based on the target type representation and the second node representation of the specific node.
The target hidden state information h_{t-1} includes the node representations of the respective nodes, in particular the second node representation h_v of the specific node v. The type-level attention score a_e corresponding to the target type e is determined according to formula two:

a_e = σ(μ_e · [h_v || h_e])    (formula two)

where || denotes the concatenation of two representation vectors; μ_e is the learnable parameter corresponding to type-level attention, i.e., the attention vector parameter of the target type e; and σ denotes the activation function. Formula two essentially computes, via the learnable parameter μ_e and the activation function σ, the similarity between the two representation vectors (the second node representation h_v and the target type representation h_e).
S204, determining type level information corresponding to the target type based on the type level attention scores corresponding to the types respectively and the type level attention scores corresponding to the target type.
The type-level attention scores a_{e′} corresponding to the respective types e′ ∈ E, and the type-level attention score a_e corresponding to the target type e, are obtained as above. In a specific implementation, the type-level attention scores of all types may be normalized with the normalized exponential function softmax to determine the type-level information corresponding to each type. Formula three determines the type-level information α_e corresponding to the target type e; the type-level information corresponding to the other types is determined in the same way:

α_e = exp(a_e) / Σ_{e′∈E} exp(a_{e′})    (formula three)
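Illustratively, formulas one to three can be sketched as follows in Python/NumPy; the random feature values, the choice of LeakyReLU as the activation σ, and the uniform toy adjacency weights are assumptions, not a reference implementation.

```python
import numpy as np

def leaky_relu(x, slope=0.2):          # assumed activation function sigma
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(0)
d = 8
h_v = rng.random(d)                                    # second node representation of specific node v
neighbors = {"m1": "gene", "m2": "gene", "m3": "miRNA"}  # neighbor id -> type (toy)
h = {m: rng.random(d) for m in neighbors}              # first node representations h_m
a_hat = {m: 1.0 / len(neighbors) for m in neighbors}   # toy normalized adjacency weights (assumed uniform)

types = sorted(set(neighbors.values()))
mu = {e: rng.random(2 * d) for e in types}             # type-level attention parameters mu_e

# formula one: target type representation h_e (weighted sum of type-e neighbors)
h_e = {e: sum(a_hat[m] * h[m] for m in neighbors if neighbors[m] == e) for e in types}
# formula two: type-level attention score a_e
a_e = {e: leaky_relu(mu[e] @ np.concatenate([h_v, h_e[e]])) for e in types}
# formula three: softmax over types -> type-level information alpha_e
exp_a = {e: np.exp(a_e[e]) for e in types}
alpha = {e: exp_a[e] / sum(exp_a.values()) for e in types}
print(alpha)
```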
Determining the node-level information of one target neighbor node adjacent to the specific node v specifically includes steps S301 to S302:
S301, determining the node-level attention score of the target neighbor node based on the first node representations of the neighbor nodes adjacent to the specific node, the second node representation of the specific node, and the type-level information corresponding to the types of the neighbor nodes.
The target neighbor node is any one of the neighbor nodes adjacent to the specific node. The embodiment of the present disclosure takes determining the node-level information of one of the neighbor nodes adjacent to the specific node as an example; the node-level information of each of the other neighbor nodes adjacent to the specific node is determined in the same way as in steps S301 to S302, and the repeated process is not described again.
Specifically, given a specific node v of type e, the first node representation h_m of each neighbor node m adjacent to the specific node v may be determined from S201, and the neighbor nodes form the neighbor node set N_m. The type-level information corresponding to the type of the target neighbor node is determined from the type-level information corresponding to the types of the respective neighbor nodes. The node-level attention score of the target neighbor node is then determined based on the type-level information corresponding to the type of the target neighbor node, the first node representation of the target neighbor node, and the second node representation of the specific node.
For a target neighbor node m′ adjacent to the specific node, the node-level attention score b_{vm′} of the target neighbor node m′ may be determined according to formula four:

b_{vm′} = σ(γ · α_{e″} [h_v || h_{m′}])    (formula four)

where h_v denotes the second node representation of the specific node v; h_{m′} denotes the first node representation of the target neighbor node m′; || denotes the concatenation of two representation vectors; α_{e″} is the type-level information corresponding to the type e″ of the target neighbor node m′; γ is the learnable parameter corresponding to node-level attention, i.e., the attention vector parameter; and σ is the activation function. This process essentially computes, via the learnable parameters γ and μ_e and the function σ, the similarity between the two nodes.
S302, node-level information of the target neighbor node is determined based on the node-level attention scores of the neighbor nodes adjacent to the specific node and the node-level attention score of the target neighbor node.
The neighbor node set N_m composed of the neighbor nodes m adjacent to the specific node v is known. In a specific implementation, the node-level attention scores of the neighbor nodes may be normalized with the softmax function to determine the node-level information corresponding to each neighbor node.
Taking the determination of the node-level information of the target neighbor node m′ as an example, formula five determines the node-level information β_{vm′} of the target neighbor node m′; the node-level information corresponding to the other neighbor nodes is determined in the same way:

β_{vm′} = exp(b_{vm′}) / Σ_{m∈N_m} exp(b_{vm})    (formula five)
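Illustratively, formulas four and five can be sketched as follows; the random features, the LeakyReLU activation and the toy type-level information values are again assumptions.

```python
import numpy as np

def leaky_relu(x, slope=0.2):          # assumed activation function sigma
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(1)
d = 8
h_v = rng.random(d)                                    # second node representation of specific node v
neighbors = {"m1": "gene", "m2": "gene", "m3": "miRNA"}  # neighbor id -> type (toy)
h = {m: rng.random(d) for m in neighbors}              # first node representations h_m
alpha = {"gene": 0.7, "miRNA": 0.3}                    # type-level information (toy values, assumption)
gamma = rng.random(2 * d)                              # node-level attention parameter gamma

# formula four: node-level attention score b_{vm'}
b = {m: leaky_relu(alpha[neighbors[m]] * (gamma @ np.concatenate([h_v, h[m]])))
     for m in neighbors}
# formula five: softmax over the neighbor set -> node-level information beta_{vm'}
exp_b = {m: np.exp(b[m]) for m in neighbors}
beta = {m: exp_b[m] / sum(exp_b.values()) for m in neighbors}
print(beta)
```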
In step S12-1 above, the content information of each node is obtained based on the double-layer graph attention mechanism. The embodiment of the present disclosure further introduces structure-level information of the nodes on the basis of the determined content information of the nodes, see step S12-2.
S12-2, acquiring predetermined structure level information of each node; the structure level information includes association relationship information between the nodes.
Specifically, the heterogeneous network is first homogenized to obtain a homogeneous network. Then, based on the third node representation of each node in the homogeneous network, the structural similarity between the nodes in the homogeneous network is determined with a preset similarity algorithm, so as to obtain the structure-level information of each node.
In one possible implementation, the heterogeneous network is homogenized to obtain a homogeneous network; then a representation containing the structural information of each node, i.e., the third node representation, can be obtained on the homogeneous network using the node2vec model. The node2vec model has two parameters, p and q, and the tendency of the random walk can be controlled by controlling p and q. In a specific implementation, the larger p is, the less likely the walk is to return to the node visited in the previous step, so the walk is more likely to travel farther; the smaller p is, the more likely the walk is to return to the previous node, so it tends to explore around the starting point. The larger q is, the more likely the walk is to explore around the previous node, and the smaller q is, the more likely the walk is to explore nodes farther away from the previous node. Therefore, the node2vec model can be made to capture the structural similarity between nodes over a wider area. Then, the similarity of the third node representations (i.e., the representations containing the structural information) of the nodes can be calculated through a cosine similarity function, so as to obtain the structural similarity matrix S, i.e., the structure-level information of each node. Each element s_ij in the matrix represents the similarity of the structural information between node i and node j.
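Illustratively, once structure-aware embeddings are available (for example from a node2vec run on the homogeneous network, which is not reproduced here), the structure-level information S can be sketched as a cosine-similarity matrix; the random embedding matrix below is an assumed stand-in.

```python
import numpy as np

def structural_similarity(embeddings):
    """Cosine similarity between the third node representations of all nodes.

    embeddings: (num_nodes, dim) matrix of structure-aware node embeddings,
    e.g. the output of a node2vec run on the homogenized network (assumed input).
    Returns S with S[i, j] = structural similarity between node i and node j."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)   # avoid division by zero
    return unit @ unit.T

# toy example: 5 nodes, 16-dimensional node2vec-style embeddings (random stand-in)
S = structural_similarity(np.random.rand(5, 16))
print(S.round(2))
```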
And S12-3, determining target attention information among the nodes based on the node level information and the structure level information so as to determine intermediate node representation of the output of the t-th level first neural network.
The target attention information between nodes is determined by, specifically, performing a weighted summation of the node-level information β_ij, which represents the similarity of the content information, and the structure-level information s_ij, which represents the similarity of the structural information, and taking the weighted summation result as the final target attention information θ_ij between node i and node j.

In one possible implementation, a first fusion weight λ of the node-level information and a second fusion weight (1-λ) of the structure-level information are obtained; the target attention information θ_ij between the nodes is determined based on the node-level information β_ij, the first fusion weight λ of the node-level information, the structure-level information s_ij, and the second fusion weight (1-λ) of the structure-level information, see formula six:

θ_ij = λ·β_ij + (1-λ)·s_ij    (formula six)
Here, the first fusion weight and the second fusion weight may be obtained based on prediction model learning, or may also be obtained based on experience of an actual application scenario, and a specific fusion weight value is not limited in this embodiment of the present disclosure.
An intermediate node representation of the first neural network output is determined based at least on the target attention information between the nodes determined in S12-3. The determination of the intermediate node representation output by the t-th-level first neural network is described in detail below as an example. Specifically, the node representation matrices respectively corresponding to the types are determined based on the target hidden state information output by the (t-1)-th-level second neural network. Here, a node representation matrix includes the node representations of the respective nodes under the corresponding type. The target hidden state information h_{t-1} includes the node representation of each node; based on it, the node representation matrix H_{t-1}^{e′} corresponding to each type e′ ∈ E to which nodes belong can be obtained, which contains the node representations (i.e., feature vectors) of the nodes of type e′, one row of the matrix being the feature vector of one node of type e′. Thereafter, the intermediate node representation H_t output by the t-th-level first neural network is determined based on the target attention information θ_ij between the nodes and the node representation matrices corresponding to the respective types; the intermediate node representation H_t is obtained by aggregating the node representation matrices of the corresponding types using learnable transformation matrices corresponding to the different types, see formula seven:

H_t = Σ_{e′∈E} Θ^{e′} · H_{t-1}^{e′} · W_t^{e′}    (formula seven)

where Θ^{e′} is the attention matrix composed of the target attention information θ_ij between nodes of type e′, and W_t^{e′} is the learnable transformation matrix corresponding to type e′. Note that when t = 1, the initial value H_0^{e′} is the node representation matrix, i.e., the node feature vector matrix, of the nodes of type e′ in the heterogeneous network data G, containing the feature vectors of those nodes.
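Illustratively, the fusion of formula six and the per-type aggregation of formula seven can be sketched as follows; the random matrices, the fusion weight λ = 0.6, the type masks and the absence of an activation function are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
num_nodes, d_in, d_out, lam = 6, 8, 4, 0.6
types = ["disease", "gene", "miRNA"]

beta = rng.random((num_nodes, num_nodes))      # node-level information beta_ij (toy)
S = rng.random((num_nodes, num_nodes))         # structure-level information s_ij (toy)
theta = lam * beta + (1 - lam) * S             # formula six: target attention information theta_ij

# per-type node representation matrices H_{t-1}^{e'} and transformation matrices W_t^{e'} (assumed random)
H_prev = {e: rng.random((num_nodes, d_in)) for e in types}
W = {e: rng.random((d_in, d_out)) for e in types}
# per-type attention matrices Theta^{e'}: here simply masked copies of theta (assumption)
type_mask = {e: rng.integers(0, 2, size=(num_nodes, num_nodes)) for e in types}
Theta = {e: theta * type_mask[e] for e in types}

# formula seven: aggregate the typed representations into the intermediate node representation H_t
H_t = sum(Theta[e] @ H_prev[e] @ W[e] for e in types)
print(H_t.shape)   # (num_nodes, d_out)
```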
The embodiment of the disclosure adopts a double-layer graph attention mechanism to determine the content information of the node, and introduces the structure information of the node on the basis of the content information of the node, that is, adds the similarity of the structure information on the basis of the similarity of the content information as the final target attention information. In the embodiment of the present disclosure, the first neural network may be a heterogeneous graph attention network HGAT, that is, a graph neural network, to which structural information is added. The heterogeneous graph attention network added with the structural information is used as each level of graph neural network in the multi-level graph neural network to aggregate node information of each layer, and the problem that node structural information in a neighborhood is lost in the related technology is solved.
In the embodiment of the present disclosure, the first neural network is a heterogeneous graph attention network to which structural information is added, i.e., a graph neural network, and the second neural network is a gated recurrent unit (GRU). In the embodiment of the present disclosure, GRU units are introduced between the graph neural networks of adjacent levels to form the multi-level graph neural network (i.e., the target neural network), so that the fusion of graph features of different levels is realized. A GRU unit receives the intermediate node information output by the graph neural network of the level corresponding to the GRU unit and the target hidden state information output by the GRU unit of the previous level, and outputs the target hidden state information of the current level. The specific execution process is as follows, steps S401 to S403:
S401, determining the update gate data and the reset gate data of the t-th-level GRU based on the intermediate node information output by the t-th-level first neural network and the target hidden state information output by the (t-1)-th-level GRU, which are received by the t-th-level GRU.
The calculation of the update gate u_t and the reset gate r_t in the t-th-level GRU unit is given by formula eight and formula nine:

u_t = ρ(W_u[H_t, h_{t-1}] + b_u)    (formula eight)

r_t = ρ(W_r[H_t, h_{t-1}] + b_r)    (formula nine)

where u_t denotes the update gate data, which controls how much of the information of the (t-1)-th level and the information of the t-th level is to be passed on to the future, for example to the subsequent levels (i.e., the (t+1)-th to N-th levels); r_t denotes the reset gate data, which controls how much past information is to be forgotten, for example the information in the previous level; [ ] denotes the concatenation of two vectors; W_u is the learnable parameter of the update-gate calculation in the GRU unit; W_r is the learnable parameter of the reset-gate calculation in the GRU unit; H_t is the intermediate node representation output by the first neural network of the t-th level; h_{t-1} is the target hidden state information output by the (t-1)-th-level GRU; ρ denotes the sigmoid function, by which data can be transformed into a value in the range 0-1 to serve as a gating signal; b_u is the bias of the update-gate calculation; and b_r is the bias of the reset-gate calculation.
Here, the t-stage GRU unit obtains the control states of the update gate and the reset gate by acquiring the intermediate node information output from the t-stage first neural network and the target hidden state information output from the (t-1) -stage GRU.
Note that when t = 1, the initial value h_0 is the heterogeneous network data G, which includes the node representations of the respective nodes.
S402, determining candidate hidden state information of the t-th stage GRU based on reset gate data, intermediate node information output by the t-th stage first neural network and target hidden state information output by the (t-1) th stage GRU.
The candidate hidden state information c_t of the t-th-level GRU is given by formula ten:

c_t = tanh(W_c[H_t, (r_t × h_{t-1})] + b_c)    (formula ten)

where c_t denotes the candidate hidden state information, which contains the intermediate node representation H_t input from the t-th-level first neural network and the retained part of the target hidden state information h_{t-1} input from the (t-1)-th-level second neural network; tanh is the hyperbolic tangent function; × denotes the (element-wise) product of two vectors; W_c is the learnable parameter of the GRU unit used when computing the candidate hidden state information; and b_c is the bias used when computing the candidate hidden state information.
S403, determining target hidden state information output by the t-th stage GRU based on the candidate hidden state information, the updating gate data and the target hidden state information output by the (t-1) -th stage GRU.
The target hidden state information h_t output by the t-th-level GRU is calculated according to formula eleven:

h_t = (1 - u_t) × h_{t-1} + u_t × c_t    (formula eleven)

Each level of the first neural network and each level of the second neural network in the target neural network is processed according to the procedure of the above steps. Finally, the entire target neural network can be expressed as a function (the formula is given as an image in the original publication) that takes the normalized adjacency matrix and the initial feature vector matrix X of all nodes as the inputs of the target neural network model; μ_e and γ are the learnable parameters corresponding to type-level and node-level attention; W_t denotes the learnable transformation matrix parameters of the first neural network at each level; W_u, W_r, W_c are the learnable parameters of the GRU units.
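For illustration only, the following is a minimal PyTorch-style sketch of the per-level GRU fusion described in steps S401 to S403 (formulas eight to eleven). The class name, tensor shapes, and the use of nn.Linear to hold W_u, W_r, W_c with their biases are assumptions made for the sketch, not the patent's implementation.

```python
import torch
import torch.nn as nn

class LevelGRUFusion(nn.Module):
    """Sketch of the per-level GRU that fuses graph features across levels
    (formulas eight to eleven). Names and shapes are illustrative only."""

    def __init__(self, dim):
        super().__init__()
        # W_u, W_r, W_c act on the concatenation [H_t, h_{t-1}]
        self.W_u = nn.Linear(2 * dim, dim)  # update gate parameters (with bias b_u)
        self.W_r = nn.Linear(2 * dim, dim)  # reset gate parameters (with bias b_r)
        self.W_c = nn.Linear(2 * dim, dim)  # candidate state parameters (with bias b_c)

    def forward(self, H_t, h_prev):
        # formulas eight and nine: update and reset gates from [H_t, h_{t-1}]
        u_t = torch.sigmoid(self.W_u(torch.cat([H_t, h_prev], dim=-1)))
        r_t = torch.sigmoid(self.W_r(torch.cat([H_t, h_prev], dim=-1)))
        # formula ten: candidate hidden state, reset gate applied element-wise
        c_t = torch.tanh(self.W_c(torch.cat([H_t, r_t * h_prev], dim=-1)))
        # formula eleven: interpolate between the previous state and the candidate
        return (1 - u_t) * h_prev + u_t * c_t
```

Stacking N-1 such units between the N graph attention levels, with h_0 taken from the initial node features, reproduces the level-wise fusion described above.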
For step S13, specifically, a preset number of node pairs may be screened from the heterogeneous network, and based on the heterogeneous network data, whether an association relationship exists between nodes in the node pairs is determined, and a preset label is set for the node pairs having the association relationship. Based on the first network representation, tagged node pair data is determined and used as training data.
After the first network representation H_N output by the target neural network is obtained, a certain number of node pairs are randomly extracted from the heterogeneous network data G. For a given node pair P_(i,j), the pair is labeled 1 or 0 according to whether an edge exists between its two nodes, so as to obtain the label of the corresponding node pair. Based on the first network representation H_N, the node representations of the labeled node pairs are taken as the training data, and the training data is input into the logistic regression model to train the logistic regression model and the target neural network.
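As an illustration only, the following sketch shows one way such labeled node-pair training data could be assembled; the function name, the edge container, and the uniform random sampling are assumptions, not part of the disclosure.

```python
import random
import torch

def sample_labeled_pairs(edges, num_nodes, num_pairs):
    """Randomly sample node pairs P_(i,j) and label them 1/0 according to
    whether an edge exists between the two nodes in G (a sketch)."""
    pair_idx, labels = [], []
    for _ in range(num_pairs):
        i, j = random.sample(range(num_nodes), 2)
        pair_idx.append((i, j))
        labels.append(1.0 if (i, j) in edges or (j, i) in edges else 0.0)
    return torch.tensor(pair_idx), torch.tensor(labels)
```

The node representations of the sampled pairs are then looked up in H_N and fed to the logistic regression model as the training data.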
The whole model composed of the logistic regression model and the target neural network can be expressed as a mapping from the node-pair representations to the prediction result Z (the formula is given as an image in the original publication), where ω represents the learnable parameters of the logistic regression model and Z represents the prediction result of the entire model.
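A minimal sketch of such a logistic-regression link predictor on top of the node-pair representations follows; since the exact formula is only given as an image in the publication, the concatenation-based pair feature is an assumption.

```python
import torch
import torch.nn as nn

class LinkPredictor(nn.Module):
    """Logistic regression over a node-pair representation; omega corresponds
    to the learnable weights of the single linear layer (a sketch)."""

    def __init__(self, pair_dim):
        super().__init__()
        self.omega = nn.Linear(pair_dim, 1)

    def forward(self, pair_repr):
        # Z: predicted probability that an edge exists between the two nodes
        return torch.sigmoid(self.omega(pair_repr)).squeeze(-1)
```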
For step S14 (model training), specifically: the training data is input into the logistic regression model for link prediction to obtain a link prediction result Z; a weighted loss value is constructed based on the link prediction result, and the logistic regression model and the target neural network are trained by back-propagating the weighted loss value until the weighted loss value converges; when the training of the logistic regression model and the target neural network is completed, the second network representation output by the target neural network is used as the target network representation of the target field task.
Illustratively, the training may use a cross-entropy loss function Loss to construct the weighted loss value based on the link prediction result Z, for example in the standard cross-entropy form:

Loss = -(1/n) Σ_l Σ_k t_{lk} · log(Z_{lk})

wherein n is the number of training samples; t_{lk} is a sign function whose value is 1 if the true type of the l-th sample is k and 0 otherwise; Z_{lk} represents the output of the logistic regression model, i.e., the probability that the l-th sample belongs to type k.
Here, the whole model composed of the logistic regression model and the target neural network constructs the weighted loss value Loss based on the calculated link prediction result Z, and updates the parameters μ_e, γ, W_t, W_u, W_r, W_c and ω through back propagation until the weighted loss value converges, so as to obtain the optimal model output result Z. At this time, the second network representation output by the target neural network is the target network representation of the target field task.
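A hedged end-to-end training sketch combining the pieces above is given below; the optimizer choice, the binary cross-entropy over link labels, and the full-batch update are assumptions made for illustration.

```python
import torch

def train(model, predictor, graph, pair_idx, labels, epochs=100, lr=1e-3):
    """model: target neural network producing node representations H_N;
    predictor: logistic-regression link predictor (parameters omega);
    pair_idx: (num_pairs, 2) tensor of sampled node indices;
    labels:   0/1 edge-existence labels for those pairs."""
    params = list(model.parameters()) + list(predictor.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    loss_fn = torch.nn.BCELoss()  # cross-entropy over the two link classes
    for _ in range(epochs):
        optimizer.zero_grad()
        H_N = model(graph)                                    # first network representation
        pair_repr = torch.cat([H_N[pair_idx[:, 0]],
                               H_N[pair_idx[:, 1]]], dim=-1)  # node-pair features
        Z = predictor(pair_repr)                              # link prediction result
        loss = loss_fn(Z, labels)                             # weighted loss value
        loss.backward()    # gradients reach mu_e, gamma, W_t, W_u, W_r, W_c, omega
        optimizer.step()
    return model, predictor
```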
For example, taking the target field task as a link prediction task in the biological field, biological network representation learning is performed first. Specifically, the biological network representation learning method includes: first, acquiring data and constructing a heterogeneous biological network; then, designing a multi-level graph neural network model, namely the target neural network (a multi-level heterogeneous graph attention network with structural information plus multi-level GRU units); and then training the target neural network and the logistic regression model with the training data. When the logistic regression model and the target neural network are trained, the second network representation output by the target neural network is used as the target network representation of the target field task. After the target network representation of the target field task is obtained, it may be applied to the target field task.
Based on the same inventive concept, the embodiment of the present disclosure further provides a network representation learning device corresponding to the network representation learning method. Since the principle by which the network representation learning device in the embodiment of the present disclosure solves the problem is similar to that of the network representation learning method in the embodiment of the present disclosure, the implementation of the device may refer to the implementation of the method, and repeated parts are not described again.
Fig. 4 is a schematic diagram of a network representation learning device provided in an embodiment of the present disclosure. A model composed of a target neural network 42 and a logistic regression model 44 is integrated in the network representation learning device. As shown in Fig. 4, the network representation learning device includes a data acquisition module 41, the target neural network 42, a sample determination module 43, and the logistic regression model 44. The target neural network 42 includes N levels of first neural networks and N-1 levels of second neural networks; the intermediate node representation output by the t-th-level first neural network is the input of the corresponding t-th-level second neural network; the target hidden state information output by the (t-1)-th-level second neural network is the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers.
The data acquisition module 41 is configured to acquire heterogeneous network data in a target field; the heterogeneous network data comprises data of each node in the heterogeneous network. The target neural network 42 is configured to receive the heterogeneous network data and output a first network representation through the processing of the first neural networks at each level and the second neural networks at each level. The sample determination module 43 is configured to determine training data based on the heterogeneous network data and the first network representation. The logistic regression model 44 is configured to train itself and the target neural network 42 based on the training data; when the training of the logistic regression model 44 and the target neural network 42 is completed, the second network representation output by the target neural network 42 is used as the target network representation of the target field task.
The network representation learning device provided by the embodiment of the disclosure is used for network representation learning in a target field, and the learned target network representation may be applied to target field tasks. In the framework of the network model integrated in the network representation learning device, the second neural networks are applied at the graph level and arranged in correspondence with the first neural networks: the intermediate node representation output by the t-th-level first neural network is the input of the corresponding t-th-level second neural network, and the target hidden state information output by the (t-1)-th-level second neural network is the input of the t-th-level first neural network and the input of the t-th-level second neural network, respectively. In this way, multi-level graph features (namely, the intermediate node representations output by the first neural networks at each level) are fused, the node representation of the whole graph is updated with the fused multi-level graph features, the advantages of graph features at different levels complement each other, and the deficiency that the fusion of graph features at different levels is not considered in the related art is remedied. Meanwhile, since the graph features obtained at the lower levels are retained by the second neural networks up to the higher-level graph neural networks, the over-smoothing problem of the graph neural network can be alleviated to a certain extent.
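For concreteness, the following is a minimal sketch of how the N graph-attention levels and the N-1 GRU units could be wired together in a forward pass; the module names, the shared feature dimension across levels, and the use of the previous hidden state as the node input of the next graph level are assumptions based on the description above, not the patent's implementation.

```python
import torch.nn as nn

class TargetNetwork(nn.Module):
    """Sketch: N first neural networks (heterogeneous graph attention levels
    with structural information) interleaved with N-1 GRU fusion units."""

    def __init__(self, gnn_levels, gru_levels):
        super().__init__()
        assert len(gnn_levels) == len(gru_levels) + 1
        self.gnn_levels = nn.ModuleList(gnn_levels)   # N first neural networks
        self.gru_levels = nn.ModuleList(gru_levels)   # N-1 second neural networks

    def forward(self, adj_norm, X):
        h = X  # h_0: initial node feature matrix of the heterogeneous network
        for t, gnn in enumerate(self.gnn_levels):
            H_t = gnn(adj_norm, h)                 # intermediate node representation
            if t < len(self.gru_levels):
                h = self.gru_levels[t](H_t, h)     # target hidden state of this level
            else:
                h = H_t                            # last level: first network representation
        return h
```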
In some embodiments, the data acquisition module 41 is specifically configured to obtain at least one network in the target field from a preset database, where the network in the target field comprises a plurality of different types of nodes, and to integrate the acquired networks in the at least one target field, based on the types of the nodes in each network, so as to build a heterogeneous network corresponding to the target field task and obtain the heterogeneous network data. For the specific implementation process of the data acquisition module 41, reference may be made to the above description of step S11, and repeated details are not described again.
In some embodiments, taking the determination of the intermediate node representation output by the t-th-level first neural network in the target neural network 42 as an example, the t-th-level first neural network is specifically configured to: determine, based on the heterogeneous network data and using a double-layer graph attention mechanism, the type-level information corresponding to the type of each node in the heterogeneous network, and determine node-level information based on the type-level information corresponding to the type of each node; acquire the predetermined structural information of each node, where the structural information comprises association relation information among the nodes; and determine target attention information between nodes based on the node-level information and the structural information, so as to determine the intermediate node representation output by the t-th-level first neural network. For the specific implementation process of the t-th-level first neural network, reference may be made to the above description of steps S12-1 to S12-3, and repeated descriptions are omitted.
Optionally, the t-th-level first neural network is configured to determine the type-level information corresponding to a target type by: determining, based on the target hidden state information output by the (t-1)-th-level second neural network, first node representations of the respective neighbor nodes adjacent to a specific node; determining a target type representation of the target type based on the first node representations of the neighbor nodes adjacent to the specific node, the type of each neighbor node, and the adjacency matrix of the heterogeneous network data; determining a type-level attention score corresponding to the target type based on the target type representation and a second node representation of the specific node; and determining the type-level information corresponding to the target type based on the type-level attention scores corresponding to the respective types and the type-level attention score corresponding to the target type.
Optionally, the t-th-level first neural network is configured to determine the node-level information of a target neighbor node adjacent to the specific node by: determining a node-level attention score of the target neighbor node based on the first node representations of the neighbor nodes adjacent to the specific node, the second node representation of the specific node, and the type-level information corresponding to the type of each neighbor node; and determining the node-level information of the target neighbor node based on the node-level attention scores of the respective neighbor nodes adjacent to the specific node and the node-level attention score of the target neighbor node.
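The following is a rough, heavily simplified sketch of the type-level followed by node-level attention scoring described in the two paragraphs above. The exact scoring functions, the parameters μ_e and γ, and the adjacency-based type representation are not spelled out in this section, so the dot-product-and-softmax form below is an assumption for illustration only.

```python
import torch
import torch.nn.functional as F

def two_level_attention(h_node, neighbor_reprs, neighbor_types, type_reprs):
    """Sketch of double-layer (type-level then node-level) attention.
    h_node:         representation of the specific node, shape (dim,)
    neighbor_reprs: representations of its neighbors, shape (num_nb, dim)
    neighbor_types: type index of each neighbor, shape (num_nb,)
    type_reprs:     one representation per type, shape (num_types, dim)"""
    # type-level attention: score each type against the specific node
    type_scores = type_reprs @ h_node                       # (num_types,)
    type_att = F.softmax(type_scores, dim=0)
    # node-level attention: score each neighbor, weighted by its type's attention
    node_scores = (neighbor_reprs @ h_node) * type_att[neighbor_types]
    node_att = F.softmax(node_scores, dim=0)                # (num_nb,)
    # aggregate neighbor information with the node-level attention weights
    return (node_att.unsqueeze(-1) * neighbor_reprs).sum(dim=0)
```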
Optionally, the network representation learning device further includes a network homogenization module, where the network homogenization module is configured to perform homogenization processing on the heterogeneous network to obtain a homogeneous network, and to determine, based on the third node representation of each node in the homogeneous network, the structural similarity between nodes in the homogeneous network through a preset similarity algorithm, so as to obtain the structure-level information of each node. The t-th-level first neural network is configured to determine the target attention information between nodes by: acquiring a first fusion weight of the node-level information and a second fusion weight of the structure-level information; and determining the target attention information between nodes based on the node-level information, the first fusion weight of the node-level information, the structure-level information, and the second fusion weight of the structure-level information.
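A small illustrative sketch of the weighted fusion of node-level and structure-level information described above; treating the two fusion weights as learnable scalars is an assumption, since the section does not fix their form.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse node-level information with structure-level information using two
    learnable fusion weights (a sketch; the disclosure does not fix this form)."""

    def __init__(self):
        super().__init__()
        self.w_node = nn.Parameter(torch.tensor(1.0))    # first fusion weight
        self.w_struct = nn.Parameter(torch.tensor(1.0))  # second fusion weight

    def forward(self, node_level_info, structure_level_info):
        # target attention information between nodes
        return self.w_node * node_level_info + self.w_struct * structure_level_info
```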
Optionally, the t-th-level first neural network is configured to output the intermediate node representation by: determining, based on the target hidden state information output by the (t-1)-th-level second neural network, node representation matrices respectively corresponding to the types; and determining the intermediate node representation output by the t-th-level first neural network based on the target attention information between nodes and the node representation matrices respectively corresponding to the types; the node representation matrix includes the node representations of the respective nodes under the corresponding type.
Optionally, the second neural network is a gated recurrent unit (GRU). The t-th-level GRU unit is configured to output the target hidden state information by: determining update gate data and reset gate data based on the received intermediate node representation output by the t-th-level first neural network and the target hidden state information output by the (t-1)-th-level GRU; determining candidate hidden state information based on the reset gate data, the intermediate node representation output by the t-th-level first neural network, and the target hidden state information output by the (t-1)-th-level GRU; and determining the target hidden state information based on the candidate hidden state information, the update gate data, and the target hidden state information output by the (t-1)-th-level GRU.
In some embodiments, the sample determining module 43 is specifically configured to screen a preset number of node pairs from the heterogeneous network, determine whether an association relationship exists between nodes in the node pairs based on heterogeneous network data, and set a preset label for the node pairs having the association relationship; based on the first network representation, tagged node pair data is determined and used as training data. For a specific implementation process of the sample determining module 43, reference may be made to the above description for step S13, and repeated descriptions are omitted.
In some embodiments, the logistic regression model 44 is configured to perform link prediction based on the received training data to obtain a link prediction result; construct a weighted loss value based on the link prediction result, and train itself and the target neural network 42 by back-propagating the weighted loss value until the weighted loss value converges. When the training of the logistic regression model 44 and the target neural network 42 is completed, the second network representation output by the target neural network 42 serves as the target network representation of the target field task. For the specific implementation process of the logistic regression model 44, reference may be made to the above description of step S14, and repeated descriptions are omitted.
The embodiment of the disclosure also provides a computer device. Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure. As shown in Fig. 5, the computer device according to the embodiment of the present disclosure includes: one or more processors 501, a memory 502, and one or more I/O interfaces 503. The memory 502 stores one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the network representation learning method of any of the above embodiments; the one or more I/O interfaces 503 are coupled between the processors and the memory and are configured to enable information interaction between the processors and the memory.
The processor 501 is a device with data processing capability, including but not limited to a central processing unit (CPU) and the like; the memory 502 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and FLASH memory (FLASH); the I/O interface (read/write interface) 503 is connected between the processor 501 and the memory 502 and can realize information interaction between the processor 501 and the memory 502, including but not limited to a data bus (Bus) and the like.
In some embodiments, the processor 501, memory 502, and I/O interface 503 are connected to each other, and thus to other components of the computing device, by a bus 504.
According to an embodiment of the present disclosure, there is also provided a non-transitory computer-readable medium. The non-transitory computer-readable medium has stored thereon a computer program, wherein the program, when executed by a processor, implements the steps of the network representation learning method of any of the above embodiments.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The above-described functions defined in the system of the present disclosure are performed when the computer program is executed by a Central Processing Unit (CPU).
It should be noted that the non-transitory computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any non-transitory computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a non-transitory computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is to be understood that the above embodiments are merely exemplary embodiments that are employed to illustrate the principles of the present disclosure, and that the present disclosure is not limited thereto. It will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the disclosure, and these are to be considered as the scope of the disclosure.

Claims (14)

1. A network representation learning method, characterized in that a pre-built target neural network is used for learning a network representation in a target field; the target neural network comprises N levels of first neural networks and N-1 levels of second neural networks; wherein an intermediate node representation output by the first neural network of the t-th level is the input of the corresponding second neural network of the t-th level; target hidden state information output by the second neural network of the (t-1)-th level is the input of the first neural network of the t-th level and the input of the second neural network of the t-th level, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers;
the network representation learning method comprises the following steps:
acquiring heterogeneous network data in a target field; the heterogeneous network data comprises data of each node in the heterogeneous network;
inputting the heterogeneous network data into the target neural network, and obtaining a first network representation output by the target neural network through processing of the first neural networks at all levels and the second neural networks at all levels;
determining training data based on the heterogeneous network data and the first network representation;
and training a logistic regression model and the target neural network based on the training data, and taking a second network representation output by the target neural network as a target network representation of the target field task when the logistic regression model and the target neural network are trained.
2. The method according to claim 1, wherein the acquiring heterogeneous network data in a target domain comprises:
acquiring at least one network in a target field from a preset database; the network in the target field comprises a plurality of different types of nodes;
and integrating at least one acquired network in the target field based on the type of each node in the network in each target field, building a heterogeneous network corresponding to the target field task, and acquiring heterogeneous network data.
3. The network representation learning method of claim 1, wherein determining an intermediate node representation of the first neural network output at level t comprises:
determining type-level information corresponding to the type of each node in the heterogeneous network by adopting a double-layer graph attention mechanism based on the heterogeneous network data, and determining node-level information based on the type-level information corresponding to the type of each node;
acquiring predetermined structural information of each node; the structure information comprises incidence relation information among all nodes;
based on the node level information and the structural information, target attention information between nodes is determined to determine an intermediate node representation of the first neural network output at level t.
4. The network representation learning method of claim 3, wherein the determining type level information corresponding to a target type comprises:
determining a first node representation of each neighbor node adjacent to a particular node based on target hidden state information output by the second neural network of level (t-1);
determining a target type representation of a target type based on a first node representation of each neighbor node adjacent to a particular node, a type of each of the neighbor nodes, and an adjacency matrix of the heterogeneous network data;
determining a type-level attention score corresponding to the target type based on the target type representation and a second node representation of the particular node;
and determining type level information corresponding to the target type based on the type level attention scores respectively corresponding to the types and the type level attention score corresponding to the target type.
5. The network representation learning method of claim 4, wherein for determining the node-level information of one target neighbor node adjacent to the particular node, comprising:
determining a node-level attention score for the target neighbor node based on a first node representation of each of the neighbor nodes that are adjacent to the particular node, a second node representation of the particular node, and type-level information corresponding to a type of each of the neighbor nodes;
determining node-level information for the target neighbor node based on the node-level attention scores of the neighbor nodes adjacent to the particular node and the node-level attention score of the target neighbor node.
6. The network representation learning method according to claim 3, wherein the step of determining the structural information of each node comprises:
carrying out homogenization processing on the heterogeneous network to obtain a homogeneous network;
and determining the structural similarity between the nodes in the homogeneous network by a preset similarity algorithm based on the third node representation of each node in the homogeneous network so as to obtain the structural level information of each node.
7. The method of claim 3, wherein determining target attention information between nodes based on the node level information and the structure level information comprises:
acquiring a first fusion weight of the node-level information and a second fusion weight of the structure-level information;
determining target attention information between nodes based on the node-level information, the first fusion weight of the node-level information, the structure-level information, and the second fusion weight of the structure-level information.
8. The method of claim 3, wherein said determining an intermediate node representation of said first neural network output at said t-th stage comprises:
determining node representation matrixes respectively corresponding to various types based on target hidden state information output by the (t-1) th-level second neural network;
determining intermediate node representation output by the first neural network at the t-th level based on target attention information among nodes and node representation matrixes respectively corresponding to the types; the node representation matrix includes node representations of respective nodes under the corresponding type.
9. The network representation learning method of claim 1, wherein the second neural network is a gated recurrent unit (GRU);
determining target hidden state information output by the second neural network at the t-th stage, including:
determining update gate data and reset gate data of a t-th stage of the GRU based on intermediate node information output by the first neural network of the t-th stage and target hidden state information output by the GRU of the (t-1) th stage received by the GRU of the t-th stage;
determining candidate hidden state information of the GRU of a t-th stage based on the reset gate data, the intermediate node information output by the first neural network of the t-th stage, and the target hidden state information output by the GRU of the (t-1) th stage;
determining target hidden state information for the GRU output of the t-th stage based on the candidate hidden state information, the update gate data, and the target hidden state information for the GRU output of the (t-1) -th stage.
10. The network representation learning method of claim 1, wherein the determining training data based on the heterogeneous network data and the first network representation comprises:
screening out a preset number of node pairs from the heterogeneous network, judging whether an association relationship exists between nodes in the node pairs based on the heterogeneous network data, and setting preset labels for the node pairs with the association relationship;
based on the first network representation, tagged node pair data is determined and used as training data.
11. The method according to claim 1, wherein the training a logistic regression model and the target neural network based on the training data, and, when the training of the logistic regression model and the target neural network is completed, using a second network representation output by the target neural network as a target network representation of the target domain task, comprises:
inputting the training data into the logistic regression model to perform link prediction to obtain a link prediction result;
constructing a weighted loss value based on the link prediction result, and training the logistic regression model and the target neural network by performing weighted back propagation on the weighted loss value until the weighted loss value is converged;
and when the training of the logistic regression model and the target neural network is completed, the second network representation output by the target neural network is used as the target network representation of the target field task.
12. A network representation learning apparatus, characterized by comprising: a data acquisition module, a target neural network, a sample determination module and a logistic regression model; the target neural network comprises N levels of first neural networks and N-1 levels of second neural networks; wherein an intermediate node representation output by the first neural network of the t-th level is the input of the corresponding second neural network of the t-th level; target hidden state information output by the second neural network of the (t-1)-th level is the input of the first neural network of the t-th level and the input of the second neural network of the t-th level, respectively; N ≥ 2, 2 ≤ t ≤ N-1, and N and t are integers;
the data acquisition module is configured to acquire heterogeneous network data in a target field; the heterogeneous network data comprises data of each node in the heterogeneous network;
the target neural network is configured to receive the heterogeneous network data, and output a first network representation through processing of the first neural networks at all levels and the second neural networks at all levels;
the sample determination module configured to determine training data based on the heterogeneous network data and the first network representation;
and the logistic regression model is configured to train the target neural network and the self based on the training data, and when the training of the target neural network and the self is completed, a second network representation output by the target neural network is used as a target network representation of the target field task.
13. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is run, the machine-readable instructions when executed by the processor performing the steps of the network representation learning method of any one of claims 1 to 11.
14. A non-transitory computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, performs the steps of the network representation learning method according to any one of claims 1 to 11.
CN202211196991.6A 2022-09-28 2022-09-28 Network representation learning method, device, equipment and storage medium Pending CN115511076A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211196991.6A CN115511076A (en) 2022-09-28 2022-09-28 Network representation learning method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211196991.6A CN115511076A (en) 2022-09-28 2022-09-28 Network representation learning method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115511076A true CN115511076A (en) 2022-12-23

Family

ID=84507929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211196991.6A Pending CN115511076A (en) 2022-09-28 2022-09-28 Network representation learning method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115511076A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129992A (en) * 2023-04-17 2023-05-16 之江实验室 Gene regulation network construction method and system based on graphic neural network


Similar Documents

Publication Publication Date Title
Koumakis Deep learning models in genomics; are we there yet?
Kriegeskorte et al. Neural network models and deep learning
Albaradei et al. Machine learning and deep learning methods that use omics data for metastasis prediction
Zhou et al. A survey on evolutionary construction of deep neural networks
US20190279088A1 (en) Training method, apparatus, chip, and system for neural network model
WO2022012407A1 (en) Neural network training method and related device
CN112364880B (en) Omics data processing method, device, equipment and medium based on graph neural network
US20190042911A1 (en) System and method for learning the structure of deep convolutional neural networks
WO2022068623A1 (en) Model training method and related device
Ceci et al. Semi-supervised multi-view learning for gene network reconstruction
CN110633786A (en) Techniques for determining artificial neural network topology
Peng et al. Hierarchical Harris hawks optimizer for feature selection
US11967436B2 (en) Methods and apparatus for making biological predictions using a trained multi-modal statistical model
US20210406686A1 (en) Method and system for balanced-weight sparse convolution processing
Zhou et al. A priori trust inference with context-aware stereotypical deep learning
WO2019178291A1 (en) Methods for data segmentation and identification
Böck et al. Hub-centered gene network reconstruction using automatic relevance determination
WO2023051369A1 (en) Neural network acquisition method, data processing method and related device
CN116129992A (en) Gene regulation network construction method and system based on graphic neural network
Li et al. Improved elephant herding optimization using opposition-based learning and K-means clustering to solve numerical optimization problems
CN115511076A (en) Network representation learning method, device, equipment and storage medium
Manoochehri et al. Graph convolutional networks for predicting drug-protein interactions
KR20220107940A (en) Method for measuring lesion of medical image
CN114420201A (en) Method for predicting interaction of drug targets by efficient fusion of multi-source data
Sharma et al. Recent advancement and challenges in deep learning, big data in bioinformatics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination