CN113515519A - Method, device and equipment for training graph structure estimation model and storage medium - Google Patents


Info

Publication number
CN113515519A
Authority
CN
China
Prior art keywords: graph, candidate, estimation, node, probability
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011574363.8A
Other languages
Chinese (zh)
Inventor
王啸
王睿嘉
牟帅
石川
肖万鹏
鞠奇
Current Assignee
Tencent Technology Shenzhen Co Ltd
Beijing University of Posts and Telecommunications
Original Assignee
Tencent Technology Shenzhen Co Ltd
Beijing University of Posts and Telecommunications
Application filed by Tencent Technology (Shenzhen) Co., Ltd. and Beijing University of Posts and Telecommunications
Priority to CN202011574363.8A
Publication of CN113515519A
Legal status: Pending

Classifications

    • G06F16/2228 — Information retrieval; indexing; data structures therefor; indexing structures
    • G06F16/284 — Databases characterised by their database models; relational databases
    • G06F16/285 — Clustering or classification
    • G06N3/045 — Neural networks; architecture; combinations of networks
    • G06N3/08 — Neural networks; learning methods

Abstract

The embodiment of the invention discloses a method, an apparatus, a device, and a storage medium for training a graph structure estimation model. The method comprises: acquiring an initial graph and label information corresponding to the initial graph, wherein the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes in the initial graph; calling a graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information. Adopting the embodiment of the invention can improve the accuracy of the graph structure estimation model.

Description

Method, device and equipment for training graph structure estimation model and storage medium
Technical Field
The present application relates to the field of graph processing, and in particular, to a method, an apparatus, a device, and a storage medium for training a graph structure estimation model.
Background
From chemistry and bioinformatics research to image and social network analysis, graphs are ubiquitous. The graph is the most direct tool for describing a community relationship chain: it is composed of nodes and edges, where nodes represent objects in the community and edges represent how closely two objects are related. Given the prevalence of graphs, it is particularly important to learn an effective representation of a graph and apply it to downstream tasks. Recently, graph processing models for graph representation learning, such as the Graph Neural Network (GNN) model and the Graph Convolutional Network (GCN) model, have attracted much attention. Taking the graph neural network GNN as an example, such models roughly follow a recursive message passing mechanism, i.e., each node aggregates information from its neighborhood and passes the result on to its neighbors.
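As a concrete illustration of the recursive message passing mechanism just described (a minimal sketch, not the patent's implementation), one aggregation round can be written as a mean over each node's neighborhood followed by a learned transform:

```python
import numpy as np

def message_passing_layer(adj, features, weight):
    """One round of recursive message passing: every node mean-pools
    the features of its neighborhood (including itself, via a
    self-loop), then applies a linear transform and a ReLU."""
    adj_hat = adj + np.eye(adj.shape[0])               # add self-loops
    deg_inv = 1.0 / adj_hat.sum(axis=1, keepdims=True)
    aggregated = deg_inv * (adj_hat @ features)        # mean over neighbors
    return np.maximum(0.0, aggregated @ weight)        # ReLU non-linearity
```

Stacking several such layers lets information propagate over multi-hop neighborhoods, which is the behavior shared across the GNN family.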
The graph processing model currently in use is usually trained on graph training samples, under the general assumption that the graph structure of each training sample is correct and conforms to the model properties of the graph processing model. However, graph training samples are generally extracted from complex interactive systems in practical applications, and because such systems contain errors, the samples may contain missing, meaningless, or even wrong edges. This causes the graph training samples to mismatch the properties of the GNN and thereby degrades the accuracy of the GNN model. Therefore, in the field of graph processing, how to train a model for graph processing so as to improve its accuracy has become a hot research issue.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for training a graph structure estimation model, which can improve the accuracy of the graph structure estimation model.
In one aspect, an embodiment of the present invention provides a method for training a graph structure estimation model, including:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph;
calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph;
calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph;
and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In one aspect, an embodiment of the present invention further provides a training device for a graph structure estimation model, including:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph;
the processing unit is used for calling a graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph;
the processing unit is further configured to invoke a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph;
the processing unit is further configured to optimize the graph prediction model based on prediction information corresponding to the estimation graph and the label information.
In one aspect, an embodiment of the present invention provides a training apparatus for a graph structure estimation model, where the training apparatus includes: a processor adapted to implement one or more computer programs; and a computer storage medium storing one or more computer programs adapted to be loaded by the processor and to perform:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In one aspect, an embodiment of the present invention provides a computer storage medium, where a computer program is stored, and when executed by a processor, the computer program is configured to perform:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In one aspect, an embodiment of the present invention provides a computer program product or a computer program, where the computer program product includes a computer program, and the computer program is stored in a computer storage medium; a processor of the model processing device reads the computer program from the computer storage medium, and the processor executes the computer program to cause the model processing device to execute:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In the embodiment of the invention, a new model for processing a graph, namely a graph structure estimation model is provided, and the graph structure estimation model consists of a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, a graph prediction model in the graph structure estimation model can be used for carrying out prediction processing on an initial graph to obtain observation information corresponding to the initial graph, wherein the observation information comprises prediction information corresponding to the initial graph; then, the calling graph estimator carries out estimation processing based on the label information and the observation information to obtain an estimation graph; further, calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Through the above process, the graph prediction model is optimized not only based on the initial graph and its corresponding label information, but also based on the estimation graph. The estimation graph is obtained by the graph estimator from the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is the initial graph as observed from the perspective of the graph prediction model, so it matches the properties of the graph prediction model, and hence of the graph structure estimation model, better than the initial graph does. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thus the accuracy of the graph structure estimation model.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a social network diagram provided by an embodiment of the present invention;
FIG. 2 is a diagram of a graph structure estimation model provided by an embodiment of the invention;
FIG. 3 is a flowchart illustrating a method for training a graph structure estimation model according to an embodiment of the present invention;
FIG. 4a is a schematic illustration of an initial diagram provided by an embodiment of the present invention;
FIG. 4b is a diagram of an adjacency matrix corresponding to an initial graph according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating another method for training a graph structure estimation model according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a method for determining an estimated adjacency matrix according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating the training of a graph structure estimation model according to an embodiment of the present invention;
FIG. 8a is a box-type graph of the predicted values of the GCN model and the GEN model for each node according to the embodiment of the present invention;
FIG. 8b is a graph showing the variation of true-positive probability and false-positive probability of a GEN model in two different data sets according to an embodiment of the present invention;
FIG. 9a is a graph illustrating accuracy curves of a GEN model and other graph processing models according to an embodiment of the present invention;
FIG. 9b is an initial view and an estimated view of a visualization provided by an embodiment of the invention;
FIG. 9c is a probability matrix of social intervals of an initial graph and an estimated graph according to an embodiment of the present invention;
FIG. 9d is a graph of a first number and a relationship between node pairs provided by embodiments of the present invention;
FIG. 9e is a normalized histogram of edge confidence for different communities over a training data set, a validation data set, and a test data set used to train a GEN model according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a training apparatus for a graph structure estimation model according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a training apparatus for a graph structure estimation model according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The graph is the most direct tool for describing a community relationship chain and is composed of nodes and edges, where nodes represent objects in the community and edges represent how closely two objects are related. In a narrow sense, a community has certain interactive relationships and a common cultural bond: a group of people related to each other in a certain field, together with their area of interaction, forms a community; for example, a karate club can be regarded as a community, and so can a company. In a broad sense, a community may include social networks, biological networks, and infrastructure networks such as energy, traffic, the Internet, and communications. Graphs include directed graphs, in which each edge has a direction, and undirected graphs, in which edges are undirected.
For example, referring to fig. 1, a diagram of a social network provided by an embodiment of the present invention may be a directed graph. Objects of the social network may include users, schools, and companies, which serve as nodes of the graph; if there is an association between two objects, there is an edge between the two objects, such as a friend relationship between user 101 and user 102, then there is an edge 100 between user 101 and user 102; as another example, user 101 is working at company 103, and there is an edge 104 between user 101 and company 103. By analogy, a diagram of a social network as shown in FIG. 1 results.
In today's society, because graphs are ubiquitous, processing them efficiently and accurately is particularly important to downstream tasks. With the continuous development of deep learning, graph processing models have emerged. The graph processing model in the prior art refers to a model constructed based on a Graph Neural Network (GNN). Over the past few years, graph neural networks have had great success in solving graph machine learning problems, and most current graph neural network models fall into two types: graph processing models based on spectral methods and graph processing models based on spatial methods.
Graph processing models based on the spectral method learn node representations using spectral graph theory. For example, some studies propose a spectral-domain convolutional network extension on the graph using Fourier bases; other studies use graph convolutions based on Chebyshev polynomials to avoid the computationally intensive Laplacian eigendecomposition. A typical spectral-method graph neural network is the graph convolution network GCN, which further simplifies the Chebyshev-polynomial graph convolution by using its first-order approximation.
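The first-order simplification mentioned for the GCN has a well-known closed form, H' = σ(D̃^(-1/2) Ã D̃^(-1/2) H W) with Ã = A + I. A small sketch (symbols follow Kipf and Welling's formulation, not this patent) is:

```python
import numpy as np

def gcn_layer(adj, h, w):
    """First-order GCN propagation: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).
    adj is the adjacency matrix, h the node features, w the weights."""
    a_tilde = adj + np.eye(adj.shape[0])           # A + I (self-loops)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_tilde @ d_inv_sqrt     # symmetric normalisation
    return np.maximum(0.0, a_norm @ h @ w)         # ReLU activation
```

The symmetric normalisation is what distinguishes this first-order rule from the plain mean aggregation of generic message passing.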
Graph processing models based on the spatial method define graph convolution directly in the spatial domain to gather and transform local information. For example, some studies learn to aggregate neighbor information by sampling neighbors and aggregating their features; some assign different edge weights during aggregation according to node features; others perform importance sampling at each layer, sampling a fixed number of nodes, in order to improve efficiency.
All of the above graph processing models are trained on a large number of graph training samples and their corresponding label information, and each training sample is assumed to be correct during training. However, graph training samples all come from complex interactive systems, and in practical applications such systems usually contain uncertainty or error. For example, in graphs describing protein interactions, traditional experimental error is a main source of graph error, and data loss is another. As another example, the Internet graph is determined by examining routing tables or tracing routing paths, yet a set of routing tables and traced routing paths yields only a subset of the edges. This invalidates the above assumption: the training graphs used for training are not necessarily correct.
Training a graph processing model on incorrect graph training samples may limit the model's representation capability and thereby reduce its accuracy in practical applications. A typical example is that the performance of graph processing models can degrade greatly on graphs with weak homophily (homophily being the tendency of nodes within the same community to connect to each other). In short, missing, meaningless, and even erroneous edges are prevalent in graph training samples; they mismatch the properties of the graph processing model and thereby reduce its accuracy.
Based on this, the inventors found it desirable to explore graph training samples suited to graph processing models. However, efficiently learning such graphs is currently technically challenging. Much of the network-science literature has shown that graph generation may be subject to certain principled constraints, such as configuration models. By taking these principles into account, the learned graph can be driven, at the root, to maintain a regular global structure and to be more robust to noise in the actual observations. However, most existing methods parameterize each edge of the graph individually and consider neither the underlying generation mechanism of the graph nor its global structure; the learned graph is therefore less tolerant of noise and sparsity. In addition, learning a graph from a single information source inevitably introduces bias and uncertainty. A reasonable assumption is that if an edge is present across multiple measurements, the confidence that the edge exists should be greater.
In summary, the embodiment of the present invention provides a new graph processing model, referred to as a graph structure estimation model. FIG. 2 is a schematic structural diagram of the graph structure estimation model provided in an embodiment of the present invention. The graph processing model 200 shown in FIG. 2 may also be referred to as a graph structure estimation neural network (GEN) and includes a graph prediction model 201 and a graph estimator 202. The graph prediction model 201 is used to predict the category to which each node belongs in any graph input into it, and may be a model constructed based on a graph neural network, for example based on the graph convolution network GCN. The graph estimator 202 is configured to perform estimation according to the observation information input into it to obtain an estimation graph; this input is derived during the graph prediction model's prediction process on the input graph.
Based on the new graph structure estimation model shown in FIG. 2, an embodiment of the present invention provides a model processing scheme for training the graph structure estimation model 200 shown in FIG. 2. Specifically, the graph prediction model 201 in the graph structure estimation model 200 is called to perform prediction processing on an initial graph (which may be referred to as a graph training sample) to obtain the observation information that the graph prediction model 201 observes on the initial graph; the graph estimator 202 in the graph structure estimation model 200 is then called to perform estimation according to the observation information and the label information corresponding to the initial graph to obtain an estimation graph; further, the estimation graph is input into the graph prediction model 201 for prediction to obtain prediction information, and the graph prediction model 201 is then optimized based on the prediction information and the label information. When a training-end instruction is detected, the optimized graph prediction model and the graph estimator are combined into the optimized graph structure estimation model.
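The training procedure just described can be summarised as an alternating loop. The sketch below is purely illustrative: the `predict`/`update` and `estimate` interfaces are hypothetical stand-ins for the graph prediction model 201 and graph estimator 202, whose internals the scheme does not fix at this point.

```python
def train_gen(predictor, estimator, initial_graph, labels, epochs):
    """Alternating optimisation sketch: predict on the initial graph to
    get observation information, estimate a new graph from it together
    with the labels, predict on the estimated graph, then update the
    predictor against the labels."""
    for _ in range(epochs):
        observations = predictor.predict(initial_graph)
        estimated_graph = estimator.estimate(observations, labels)
        predictions = predictor.predict(estimated_graph)
        predictor.update(predictions, labels)
    return predictor, estimator
```

On detecting the training-end condition, the loop exits and the (optimized) predictor and estimator together form the trained model.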
Subsequently, when a graph to be recognized is input into the optimized graph structure estimation model, the graph prediction model 201 in the graph structure estimation model is called to perform prediction processing on the graph to be recognized, obtaining observation information; the graph estimator 202 in the graph structure estimation model is then called to perform estimation processing based on the observation information to obtain an estimation graph; finally, the estimation graph is subjected to prediction processing using the graph prediction model, and the processing result is output.
It can be seen that, when the new graph structure estimation model is trained using the model processing scheme of the embodiment of the present invention, the graph prediction model is optimized not only based on the initial graph and its corresponding label information, but also based on the estimation graph. The estimation graph is obtained by the graph estimator from the observation information produced when the graph prediction model predicts the initial graph; in other words, it is the initial graph as observed from the perspective of the graph prediction model, so it matches the properties of the graph prediction model, and hence of the graph structure estimation model, better than the initial graph does. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thus the accuracy of the graph structure estimation model.
Based on the new graph structure estimation model and the model processing scheme, the embodiment of the invention provides a model processing method. Referring to fig. 3, which is a schematic flowchart of a model processing method according to an embodiment of the present invention, the model processing method shown in fig. 3 may be executed by a model processing device, and specifically may be executed by a processor of the model processing device. The model processing device may refer to a terminal or a server, where the terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like, but is not limited thereto; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and a big data and artificial intelligence platform. The model processing method shown in fig. 3 may include the steps of:
step S301, acquiring an initial graph and label information corresponding to the initial graph.
In one embodiment, the initial graph may be any graph, and the initial graph may include a plurality of nodes. Optionally, assume that the initial graph can be represented as G = (V, ε, X), where V = {v1, v2, ..., vN} represents the set of nodes comprised by the initial graph, which is assumed to comprise N nodes; ε represents the set of edges in the initial graph; and X represents the node feature matrix of the initial graph, which can be represented as X = [x1, x2, ..., xN]^T, where xi refers to the feature of node vi, with i being 1 or more and N or less.
Optionally, the label information corresponding to the initial graph is used to indicate the category to which the target node in the initial graph belongs, and the target node may refer to any one or more of the plurality of nodes included in the initial graph. For example, if the target nodes in the initial graph are represented as VL = {v1, v2, ..., vl}, the label information corresponding to v1 may be represented as y1, the label information corresponding to v2 may be represented as y2, and, by analogy, the label information corresponding to the initial graph can be represented as YL = {y1, y2, ..., yl}.
As can be seen from the foregoing, edges in the initial graph are used to describe relationships between nodes. In addition, the relationships between the nodes in the initial graph can also be represented by an adjacency matrix corresponding to the initial graph, where each row and each column of the adjacency matrix corresponds to a node in the initial graph, and the matrix element at row vi, column vj represents whether an edge exists between node vi and node vj, with j greater than or equal to 1 and less than or equal to N. Thus, the initial graph can be reconstructed from its adjacency matrix; therefore, in the graph processing field, any graph can also be represented by the adjacency matrix corresponding to that graph.
For example, assume that if there is an edge between two nodes, the matrix element at the intersection of the two nodes is 1; if there is no edge between two nodes, the matrix element at the intersection of the two nodes is 0. Referring to fig. 4a, which is a schematic diagram of an initial graph provided by the embodiment of the present invention, and assuming that the initial graph is an undirected graph, the adjacency matrix corresponding to the initial graph may be as shown at 401 in fig. 4b. As can be seen from fig. 4a, there is an edge between node v0 and node v1; therefore, in the adjacency matrix shown in fig. 4b, the matrix element at the intersection 402 of node v0 and node v1 is 1. There is no edge between node v1 and node v5, so in the adjacency matrix shown in fig. 4b, the matrix element at the intersection 403 of node v1 and node v5 is 0.
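The adjacency-matrix construction described above can be sketched as follows. This is a minimal NumPy sketch; the edge list below is hypothetical (only the v0–v1 edge and the absence of a v1–v5 edge are stated in the text, the rest of fig. 4a is not recoverable):

```python
import numpy as np

def build_adjacency(num_nodes, edges):
    """Build the symmetric adjacency matrix of an undirected graph.

    edges: iterable of (i, j) node-index pairs; A[i, j] = 1 iff an
    edge exists between node v_i and node v_j, and 0 otherwise.
    """
    A = np.zeros((num_nodes, num_nodes), dtype=int)
    for i, j in edges:
        A[i, j] = 1
        A[j, i] = 1  # undirected graph: the matrix is symmetric
    return A

# Hypothetical edge list consistent with the text: v0-v1 connected, v1-v5 not.
A = build_adjacency(6, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)])
```

Because the graph is undirected, the matrix is symmetric, and a graph can be passed to downstream processing either as an edge list or as this matrix interchangeably.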
Step S302, a graph prediction model included in the graph structure estimation model is called to carry out prediction processing on the initial graph, so that observation information corresponding to the initial graph is obtained, and the observation information includes prediction information corresponding to the initial graph.
As can be seen from the foregoing, the graph structure estimation model may be composed of a graph prediction model and a graph estimator. The graph prediction model is configured to perform prediction processing on an initial graph input into it to obtain prediction information corresponding to the initial graph, where the prediction information is used to represent the category to which each node in the initial graph belongs; in addition, the graph prediction model generates a neighbor graph set corresponding to the initial graph in the course of predicting the initial graph, and the neighbor graph set and the prediction information together form the observation information that the graph prediction model produces for the initial graph. The graph estimator is used to perform estimation processing based on the observation information that the graph prediction model produces for the initial graph, so as to obtain an estimation graph corresponding to the initial graph.
Alternatively, the graph prediction model may be constructed based on a graph neural network, such as a graph convolutional neural network (GCN) or another graph neural network (GNN). In the embodiment of the invention, the graph prediction model is preferably constructed based on the graph convolutional neural network GCN.
in one embodiment, the graph prediction model may include a plurality of convolutional layers, each convolutional layer for performing convolutional processing on data input to the convolutional layer, the output of a previous convolutional layer being input to a subsequent convolutional layer. In the embodiment of the invention, the working principle of the graph prediction model is explained by taking the graph prediction model comprising two convolution layers as an example, and the number of the convolution layers of the graph prediction model can be set according to actual requirements in practical application.
Specifically, the graph prediction model includes a first convolutional layer and a second convolutional layer, and invoking the graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain the observation information corresponding to the initial graph includes:

acquiring the node feature matrix of the initial graph, inputting the node feature matrix into the first convolutional layer, and performing a convolution operation based on a first weight parameter corresponding to the first convolutional layer to obtain a first node representation matrix; inputting the first node representation matrix into the second convolutional layer to perform a convolution operation based on a second weight parameter corresponding to the second convolutional layer to obtain a second node representation matrix, and normalizing the second node representation matrix to obtain the prediction information corresponding to the initial graph; and constructing a first neighbor graph based on the first node representation matrix, constructing a second neighbor graph based on the second node representation matrix, and grouping the first neighbor graph and the second neighbor graph into the neighbor graph set.
The following is specifically described: the graph prediction model of the embodiment of the invention is constructed based on a GCN network. The GCN follows a neighborhood aggregation strategy and iteratively updates the node representation matrix by aggregating the node representations of neighbors. Formally, the aggregation rule of the k-th convolutional layer of the GCN can be shown as formula (1):

H^(k) = σ(D̂^(-1/2) Â D̂^(-1/2) H^(k-1) W^(k))    (1)

In formula (1), Â = A + I_N represents the adjacency matrix A of the initial graph with self-loops added, so that D̂^(-1/2) Â D̂^(-1/2) is the normalized adjacency matrix of the initial graph; D̂ represents the degree matrix corresponding to the initial graph, whose elements on the diagonal are the row sums of the adjacency matrix Â, as shown in formula (2):

D̂_ii = Σ_j Â_ij    (2)
In formula (1), W^(k) represents the weight matrix of the k-th convolutional layer, where the value of k is greater than or equal to 1 and less than or equal to the number of convolutional layers included in the graph prediction model. Each convolutional layer in the graph prediction model corresponds to a weight matrix, and the weight matrices of all convolutional layers together form the model parameters of the graph prediction model. Optionally, assuming that the weight matrix corresponding to the first convolutional layer is represented as W^(1) and the weight matrix corresponding to the second convolutional layer is represented as W^(2), the model parameters of the graph prediction model can be expressed as: Θ = (W^(1), W^(2)).

In formula (1), σ represents the activation function, H^(k-1) represents the node representation matrix output by the (k-1)-th convolutional layer, and H^(k) represents the node representation matrix output by the k-th convolutional layer. As a specific example, the node feature matrix X corresponding to the initial graph may also be represented in the form of a node representation matrix, namely H^(0) = X. The prediction information corresponding to the initial graph is obtained by normalizing the node representation matrix output by the last convolutional layer; if the graph prediction model has only one convolutional layer, the prediction information corresponding to the initial graph can be represented as Z = H^(1).
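Formulas (1) and (2) can be sketched as a two-layer forward pass. This is a minimal NumPy sketch under the usual GCN conventions (self-loops added to A, symmetric normalization, ReLU as σ for the hidden layer, row-wise softmax as the final normalization producing Z); the matrix shapes are illustrative, not taken from the source:

```python
import numpy as np

def gcn_forward(A, X, W1, W2):
    """Two-layer GCN: H(k) = sigma(D^-1/2 A_hat D^-1/2 H(k-1) W(k)), formula (1)."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops: A_hat = A + I
    d = A_hat.sum(axis=1)                        # formula (2): degrees = row sums
    D_inv_sqrt = np.diag(d ** -0.5)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt     # normalized adjacency matrix
    H1 = np.maximum(A_norm @ X @ W1, 0)          # first layer, sigma = ReLU
    H2 = A_norm @ H1 @ W2                        # second layer
    expH = np.exp(H2 - H2.max(axis=1, keepdims=True))
    Z = expH / expH.sum(axis=1, keepdims=True)   # softmax -> prediction information Z
    return H1, H2, Z

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
X = rng.normal(size=(3, 4))                      # node feature matrix H(0) = X
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 2))
H1, H2, Z = gcn_forward(A, X, W1, W2)
```

H1 and H2 are the first and second node representation matrices from which the neighbor graphs are later built, and each row of Z is a probability distribution over node categories.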
After the node representation matrix corresponding to the first convolutional layer and the node representation matrix corresponding to the second convolutional layer are obtained based on the above description, a neighbor graph set is further constructed based on each node representation matrix. Specifically, assume that the node representation matrix corresponding to each convolutional layer and the node feature matrix of the initial graph form a node representation set H = {H^(0), H^(1), H^(2)}. As can be seen from the foregoing, H^(0) = X denotes the node feature matrix corresponding to the initial graph. The adjacency matrix of the k-nearest-neighbor (kNN) graph constructed based on H^(0) is denoted as O^(0), the adjacency matrix of the neighbor graph constructed based on the first node representation matrix is denoted as O^(1), and by analogy, the neighbor graph set constructed based on the node representation set is represented as {O^(0), O^(1), O^(2)}.
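The neighbor-graph construction can be sketched as follows. This is a minimal sketch assuming cosine similarity and a symmetrized k-nearest-neighbor rule; the patent does not fix the similarity measure, so this is one plausible choice:

```python
import numpy as np

def knn_graph(H, k):
    """Adjacency matrix O of the kNN graph built from a node representation matrix H."""
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    S = Hn @ Hn.T                                # cosine similarity between nodes
    np.fill_diagonal(S, -np.inf)                 # exclude self-edges
    O = np.zeros_like(S, dtype=int)
    for i in range(H.shape[0]):
        for j in np.argsort(S[i])[-k:]:          # k most similar nodes to v_i
            O[i, j] = O[j, i] = 1                # symmetrize the neighbor graph
    return O

# One neighbor graph per representation matrix in {H(0)=X, H(1), H(2)}:
rng = np.random.default_rng(1)
reps = [rng.normal(size=(5, 3)) for _ in range(3)]
neighbor_set = [knn_graph(H, k=2) for H in reps]  # {O(0), O(1), O(2)}
```

Each O^(i) is a symmetric 0/1 matrix with no self-edges, and each node is connected to at least its k most similar neighbors.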
It should be understood that the initial graph itself is also an important external observation of the underlying graph structure; therefore, in the embodiment of the present invention, the initial graph and the above-mentioned neighbor graph set together serve as the observation graph set for the initial graph. Further, the observation graph set corresponding to the initial graph and the prediction information corresponding to the initial graph are input into the graph estimator as the observation information.
In an embodiment, the graph prediction model included in the graph structure estimation model in step S302 may be a model that has been trained in advance to convergence; in that case, after the initial graph is obtained, the graph prediction model does not need to be trained, and step S302 is performed directly. Alternatively, in order to improve the degree of adaptation between the graph prediction model and the initial graph, even if the graph prediction model has reached convergence before step S302 begins, the embodiment of the present invention may further optimize the graph prediction model based on the initial graph and the label information corresponding to the initial graph.
In other embodiments, if the graph prediction model is a model that is not trained to converge before performing step S302, the model processing device in the embodiment of the present invention needs to train and optimize the graph prediction model based on the initial graph and the label information corresponding to the initial graph.
In an embodiment, if it is required to optimize the graph prediction model based on the initial graph and the label information corresponding to the initial graph, after the graph prediction model performs the prediction processing on the initial graph for the first time to obtain the neighbor graph set, the method may further include: acquiring the prediction information corresponding to the target node from the prediction information corresponding to the initial graph, and determining the difference information between the prediction information corresponding to the target node and the tag information; constructing a loss function corresponding to the graph prediction model according to the difference information;
updating the first weight parameter and the second weight parameter in the direction of decreasing the value of the loss function; and updating the prediction information corresponding to the initial graph and the neighbor graph set, respectively, based on the updated first weight parameter and the updated second weight parameter.
The loss function corresponding to the graph prediction model may be a cross-entropy function. If the initial graph is represented by its corresponding adjacency matrix A, the node feature matrix corresponding to the initial graph is X, and the label information corresponding to the initial graph is denoted Y_L, the loss function corresponding to the graph prediction model can be expressed as shown in formula (3):

L = Σ_{vi ∈ V_L} ℓ(f(X, A)_i, y_i)    (3)

In formula (3), f(X, A) is the mapping function learned by the graph prediction model, f(X, A)_i is the prediction information corresponding to node vi, and ℓ(·, ·) is used to measure the difference between the prediction information of any node and the label information of that node.
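Formula (3) can be sketched as follows. This is a minimal sketch assuming ℓ is the cross-entropy between a softmax prediction row and the node's integer class label, summed over the labeled nodes V_L only:

```python
import numpy as np

def labeled_cross_entropy(Z, labels, labeled_idx):
    """Formula (3): sum of cross-entropy losses over the labeled nodes V_L.

    Z: (N, C) prediction information, each row a probability distribution.
    labels: (N,) integer class of each node; only labeled_idx entries are used.
    """
    loss = 0.0
    for i in labeled_idx:
        loss += -np.log(Z[i, labels[i]] + 1e-12)  # ell(f(X, A)_i, y_i)
    return loss

Z = np.array([[0.9, 0.1],   # node v1, labeled class 0
              [0.2, 0.8],   # node v2, labeled class 1
              [0.5, 0.5]])  # node v3, unlabeled -> excluded from the sum
labels = np.array([0, 1, 0])
loss = labeled_cross_entropy(Z, labels, labeled_idx=[0, 1])
```

Minimizing this quantity with respect to the weight parameters W^(1) and W^(2) is what "updating in the direction of decreasing the value of the loss function" refers to.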
In summary, if in the embodiment of the present invention the graph prediction model does not need to be optimized based on the initial graph and the label information corresponding to the initial graph, the observation information in step S302 is obtained by a single prediction pass of the graph prediction model; that is, the initial graph is input into the graph prediction model for processing, and the observation information corresponding to the initial graph can be output directly. If the model processing device does need to optimize the graph prediction model based on the initial graph and the label information corresponding to the initial graph, the observation information in step S302 is obtained through multiple prediction passes of the graph prediction model: the model parameters used in each prediction pass are optimized based on the prediction information obtained in the previous pass and the label information corresponding to the initial graph, and in each pass, the prediction information corresponding to the initial graph and the neighbor graph set are updated. When the graph prediction model reaches convergence, the prediction information and the neighbor graph set obtained in the last prediction pass are written into the observation information.
Step S303, a graph estimator included in the graph structure estimation model is called to perform estimation processing based on the label information and the observation information to obtain an estimation graph, and the graph prediction model is called to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph.
After the observation information corresponding to the initial graph is obtained through step S302, the problem to be solved is: based on these observations, what is the estimation graph that matches the graph prediction model? These observations reveal, from different perspectives, the estimation graph that matches the graph prediction model, but they may be unreliable or incomplete, and there is no a priori knowledge with which to determine the accuracy of any single observation. In this case, it is not easy to answer the above question directly, but it is relatively easy to answer the opposite question: assuming that an estimation graph matching the graph prediction model has already been generated for the initial graph, the model processing device can calculate the probability that the estimation graph maps to the above-described observation information. With this in hand, the relationship can be inverted according to Bayesian inference to calculate the posterior distribution of the graph structure, thereby achieving the purpose of obtaining an estimation graph matched with the graph prediction model.
That is to say, after the observation information corresponding to the initial graph is obtained in the embodiment of the present invention, the graph estimator first estimates a plurality of candidate graphs corresponding to the initial graph; it then determines, for each candidate graph, the probability that the observation information exists if that candidate graph is taken as the estimation graph matched with the graph prediction model; and it then performs graph estimation based on the generation probability corresponding to each candidate graph and the probability, for each candidate graph, that the observation information exists, so as to obtain the estimation graph matched with the graph prediction model.
Specifically, the graph estimator may include a structure submodel and an observation submodel, where generating the plurality of candidate graphs corresponding to the initial graph is performed by invoking the structure submodel, and calculating the probability that the observation information exists when any candidate graph is used as the estimation graph matched with the graph prediction model is performed by invoking the observation submodel. Specifically, invoking the graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph includes:

invoking the structure submodel to generate N candidate graphs based on the label information and the prediction information in the observation information, and determining the generation probability corresponding to each candidate graph based on a first parameter of the structure submodel and the adjacency matrix corresponding to each of the N candidate graphs; invoking the observation submodel to calculate the observation-information existence probability corresponding to each candidate graph based on a second parameter of the observation submodel, the observation information, and the adjacency matrix corresponding to each candidate graph, where the observation-information existence probability corresponding to candidate graph m is used to represent the probability that the observation information exists when candidate graph m is taken as the estimation graph, and m is greater than or equal to 1 and less than or equal to N; and performing graph estimation based on the generation probability corresponding to each candidate graph and the observation-information existence probability corresponding to each candidate graph to obtain an estimated adjacency matrix, and generating the estimation graph according to the estimated adjacency matrix.
In one embodiment, the graph estimator may be trained when constructing the graph structure estimation model, and in the embodiment of the present invention, the graph estimator does not need to be trained again; or, in order to obtain an estimated map that is more matched with the map prediction model in the embodiment of the present invention, optimization processing may also be performed on the map estimator in the process of performing estimation processing based on the label information and the observation information. This section is described in detail in the following examples.
Step S304, the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information.
After the estimation graph is obtained, the graph prediction model is called to perform prediction processing on the estimation graph to obtain the prediction information corresponding to the estimation graph, and the graph prediction model is then optimized based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph. It should be understood that the estimation graph is estimated based on the observation information that the graph prediction model produces for the initial graph; compared with the initial graph, the estimation graph is better matched with the characteristics of the graph prediction model, so optimizing the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model.
In one embodiment, optimizing the graph prediction model based on the prediction information and the label information corresponding to the estimation graph may include: acquiring the prediction information corresponding to the target node from the prediction information corresponding to the estimation graph; determining difference information between the prediction information and the label information corresponding to the target node; constructing a loss function corresponding to the graph prediction model according to the difference information; and updating the first weight parameter and the second weight parameter of the graph prediction model according to the direction of reducing the value of the loss function so as to optimize the graph prediction model.
It should be understood that the above steps S301 to S304 only illustrate the first training round of the graph structure estimation model. If, after step S304, a training-ending event has not been detected, the observation information corresponding to the estimation graph is obtained by invoking the graph prediction model to perform prediction processing on the estimation graph; the graph estimator is invoked to perform estimation processing based on the label information and the observation information corresponding to the estimation graph to obtain a new estimation graph; the optimized graph prediction model is invoked to perform prediction processing on the new estimation graph to obtain prediction information corresponding to the new estimation graph; and the optimized graph prediction model is then optimized again based on the prediction information corresponding to the new estimation graph and the label information. The training-ending event may mean that the training process has been performed for a preset duration, or that an instruction or operation for ending training has been received.
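The alternating procedure of steps S301 to S304 and the follow-up rounds can be sketched at a high level. The class and method names below are placeholders standing in for the components described above, not APIs from the source; the stubs only demonstrate the control flow:

```python
class StubPredictor:
    """Placeholder for the graph prediction model (illustrative only)."""
    def predict(self, graph):
        # Would run the GCN forward pass; returns observation information
        # (prediction information Z plus the neighbor graph set).
        return {"graph": graph, "Z": "predictions"}

    def optimize(self, predictions, labels):
        pass  # would take a gradient step on the cross-entropy loss

class StubEstimator:
    """Placeholder for the graph estimator (structure + observation submodels)."""
    def estimate(self, labels, observations):
        # Would run Bayesian graph estimation; here it passes the graph through.
        return observations["graph"]

def train(initial_graph, labels, predictor, estimator, rounds=3):
    graph = initial_graph
    for _ in range(rounds):                       # until a training-ending event
        obs = predictor.predict(graph)            # step S302: observation info
        graph = estimator.estimate(labels, obs)   # step S303: new estimation graph
        preds = predictor.predict(graph)
        predictor.optimize(preds, labels)         # step S304: optimize predictor
    return graph

final_graph = train("G0", ["y1"], StubPredictor(), StubEstimator())
```

Each round observes the previous round's estimation graph, so the predictor and the estimated structure are refined jointly.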
In the embodiment of the invention, a new graph processing model, referred to as a graph structure estimation model, is provided, and the graph structure estimation model consists of a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, the graph prediction model in the graph structure estimation model can be used to perform prediction processing on an initial graph to obtain observation information corresponding to the initial graph, where the observation information includes prediction information corresponding to the initial graph; the graph estimator is then called to perform estimation processing based on the label information and the observation information to obtain an estimation graph; further, the graph prediction model is called to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Through the above process, the graph prediction model is optimized not only based on the initial graph and the label information corresponding to the initial graph, but also based on the estimation graph. The estimation graph is obtained by the graph estimator through estimation processing of the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is derived from the perspective of the graph prediction model, meaning that, compared with the initial graph, the estimation graph is better matched with the properties of the graph prediction model and thus more consistent with the graph structure estimation model. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and in turn the accuracy of the graph structure estimation model.
Based on the above model processing method, an embodiment of the present invention provides another model processing method, and referring to fig. 5, a schematic flow diagram of another model processing method provided in an embodiment of the present invention is provided. The model processing method shown in fig. 5 may be executed by a model processing apparatus, and specifically may be executed by a processor of the model processing apparatus. The model processing device may refer to a terminal or a server, where the terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like, but is not limited thereto; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and a big data and artificial intelligence platform. The model processing method shown in fig. 5 may include the steps of:
step S501, an initial graph and label information corresponding to the initial graph are obtained.
Step S502, calling a graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph, wherein the observation information includes prediction information corresponding to the initial graph.
In an embodiment, some possible implementations included in step S501 and step S502 may refer to descriptions of related steps in the embodiment of fig. 3, and are not described herein again.
Step S503, calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph.
In an embodiment, as can be seen from the embodiment shown in fig. 3, invoking the graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph includes the following steps:
invoking the structure submodel to generate N candidate graphs based on the label information and the prediction information in the observation information, and determining the generation probability corresponding to each candidate graph based on a first parameter of the structure submodel and the adjacency matrix corresponding to each of the N candidate graphs;

invoking the observation submodel to calculate the observation-information existence probability corresponding to each candidate graph based on a second parameter of the observation submodel, the observation information, and the adjacency matrix corresponding to each candidate graph; wherein the observation-information existence probability corresponding to candidate graph m is used to represent the probability that the observation information exists when candidate graph m is taken as the estimation graph, and m is greater than or equal to 1 and less than or equal to N;

performing graph estimation based on the generation probability corresponding to each candidate graph and the observation-information existence probability corresponding to each candidate graph to obtain an estimated adjacency matrix, and generating the estimation graph according to the estimated adjacency matrix.
As can be seen from the foregoing description, the graph prediction model in the embodiment of the present invention is constructed based on the GCN network. In consideration of the local smoothing characteristic of the GCN network, a stochastic block model (SBM) may be used as the structure submodel; the stochastic block model is widely used for community detection and is suitable for modeling graphs with a relatively strong community structure. The intra-community and inter-community parameter values in the stochastic block model SBM can constrain the homogeneity of the estimation graph.
Optionally, assuming that the N candidate graphs include a first candidate graph, the first candidate graph may be any one of the N candidate graphs, and a specific implementation of the first step is described below taking the first candidate graph as an example. The generation probability corresponding to the first candidate graph is determined based on the first parameter of the structure submodel and the adjacency matrix corresponding to the first candidate graph, and can be expressed as shown in formula (4):

P(G | Z, Y_L, Ω) = Π_{i<j} Ω_{ci,cj}^{G_ij} (1 − Ω_{ci,cj})^{(1 − G_ij)}    (4)

where ci denotes the community (category) of node vi, taken from the label information Y_L for a labeled node and from the prediction information Z otherwise, as specified by formula (5). In formula (4), P(G | Z, Y_L, Ω) may represent the generation probability of candidate graph m, where it is assumed that candidate graph m is the first candidate graph; G is the adjacency matrix corresponding to the first candidate graph; Z represents the prediction information included in the observation information; Y_L represents the label information corresponding to the initial graph; and Ω denotes the first parameter of the structure submodel, which assumes that the probability of an edge existing between two nodes depends only on their communities. For example, the probability that an edge exists between node vi of community ci and node vj of community cj is expressed as Ω_{ci,cj}.
The generation probability corresponding to each candidate graph can be obtained according to the above formula (4) and formula (5).
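Formula (4) can be evaluated numerically as follows. This is a minimal sketch in which each node's community ci is given directly (per the text, it would be taken from the label for labeled nodes and from Z otherwise); log-space is used to avoid underflow on larger graphs:

```python
import numpy as np

def sbm_log_prob(G, communities, Omega):
    """Log of formula (4): sum over node pairs i < j of
    G_ij * log(Omega_{ci,cj}) + (1 - G_ij) * log(1 - Omega_{ci,cj})."""
    n = G.shape[0]
    logp = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = Omega[communities[i], communities[j]]  # edge prob between communities
            logp += G[i, j] * np.log(w) + (1 - G[i, j]) * np.log(1 - w)
    return logp

# Two communities: edges dense within a community (0.8), sparse between (0.1).
Omega = np.array([[0.8, 0.1],
                  [0.1, 0.8]])
G = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])        # candidate graph: one edge, inside community 0
c = [0, 0, 1]                    # community of each node
lp = sbm_log_prob(G, c, Omega)
```

A candidate graph whose edges agree with the community structure encoded in Ω receives a higher generation probability, which is how the intra/inter-community parameter values constrain the homogeneity of the estimation graph.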
After the N candidate graphs and the generation probability corresponding to each of the N candidate graphs are obtained, the next step is to introduce an observation submodel to describe how each candidate graph is mapped to the observation information. Taking the first candidate graph among the N candidate graphs as an example, the implementation of the second step is described in detail below. In the following description it is assumed that the observations of edges are independent, identically distributed Bernoulli random variables, conditioned on the presence or absence of the corresponding edge in the candidate graph. Specifically, invoking the observation submodel to calculate the observation-information existence probability corresponding to each candidate graph based on the second parameter of the observation submodel, the observation information, and the adjacency matrix corresponding to each candidate graph includes the following steps:
S51: acquiring the total amount of data included in the observation information, and determining observation probability parameters according to the second parameter;
As can be seen from the foregoing, the observation information includes the prediction information corresponding to the initial graph, the neighbor graph set, and the initial graph. It should be understood that the initial graph includes a plurality of nodes and the prediction information corresponding to the initial graph is used to indicate the category to which each node in the initial graph belongs; in essence, therefore, the prediction information may include one predicted value per node, indicating the category to which that node belongs. The neighbor graph set includes a target neighbor graph obtained from the node feature matrix corresponding to the initial graph and a plurality of neighbor graphs obtained from the node representation matrix corresponding to each convolutional layer of the graph prediction model.

The total amount of data included in the observation information refers to the total count of the above items. For example, if the initial graph includes 5 nodes and the graph prediction model includes 2 convolutional layers, then the prediction information includes 5 predicted values, and the neighbor graph set includes 1 target neighbor graph plus 2 neighbor graphs obtained from the node representation matrices; the total amount of data included in the observation information is then: 5 predicted values + 3 neighbor graphs + 1 initial graph = 9.
In one embodiment, the observation probability parameters determined according to the second parameter may include four types: a first type of parameter, a second type of parameter, a third type of parameter, and a fourth type of parameter. The first type of parameter refers to the probability that, when an edge exists between node i and node j, a data item in the observation set indicates that an edge exists between node i and node j; the second type of parameter refers to the probability that, when an edge exists between node i and node j, a data item in the observation set indicates that no edge exists between node i and node j; the third type of parameter refers to the probability that, when no edge exists between node i and node j, a data item in the observation set indicates that an edge exists between node i and node j; and the fourth type of parameter refers to the probability that, when no edge exists between node i and node j, a data item in the observation set indicates that no edge exists between node i and node j.
Colloquially, assuming that the second parameter of the observation submodel includes α and β, each data in the observation information is parameterized by these two parameters. For example, true-positive corresponds to the first type of parameter, whose value is α, i.e., the probability that any data observes an edge that really exists in any candidate graph; false-positive corresponds to the third type of parameter, whose value is β, i.e., the probability that any data observes an edge that does not really exist in any candidate graph; false-negative corresponds to the second type of parameter, whose value is 1−α, i.e., the probability that any data fails to observe an edge that really exists in any candidate graph; and true-negative corresponds to the fourth type of parameter, whose value is 1−β, i.e., the probability that any data does not observe an edge that really does not exist in any candidate graph.
S52: acquiring a first quantity of data in the observation information indicating that an edge exists between node i and node j, and determining, according to the first quantity and the total quantity, a second quantity of data in the observation set indicating that no edge exists between node i and node j; node i and node j are any two nodes included in the first candidate graph, and i is smaller than j.
S53: calculating the edge correlation probability between node i and node j according to the observation probability parameter, the first quantity and the second quantity.
The edge correlation probability between node i and node j is obtained by combining two probabilities: one is the probability that, if the edge between node i and node j really exists in the first candidate graph, the first quantity of the data included in the observation information indicate an edge between node i and node j; the other is the probability that, if the edge between node i and node j really does not exist in the first candidate graph, the first quantity of the data included in the observation information indicate an edge between node i and node j.
For example, assume that the total amount of data included in the observation information is M, and the first amount of data in the observation information indicating that an edge exists between node i and node j is denoted as E_ij; then the second amount of data in the observation information indicating that no edge exists between node i and node j is M − E_ij. The observation probability parameter, the first quantity and the second quantity are substituted into the operation formula of the inter-node edge correlation probability, so as to obtain the edge correlation probability between node i and node j, which can be expressed as formula (6):
$$P\big(E_{ij} \mid G_{ij}, \alpha, \beta\big) = \left[\alpha^{E_{ij}}(1-\alpha)^{M-E_{ij}}\right]^{G_{ij}} \left[\beta^{E_{ij}}(1-\beta)^{M-E_{ij}}\right]^{1-G_{ij}} \quad (6)$$

In formula (6), G represents the adjacency matrix corresponding to the first candidate graph, and G_ij is a matrix element of that adjacency matrix whose value depends on whether there is an edge between node i and node j in the first candidate graph: the value is 1 if there is an edge between node i and node j, and 0 otherwise. In formula (6), the term α^{E_ij}(1−α)^{M−E_ij} indicates the probability that, if the edge between node i and node j really exists in the first candidate graph, a first number E_ij of the data included in the observation information indicate an edge between node i and node j; the term β^{E_ij}(1−β)^{M−E_ij} indicates the probability that, if the edge between node i and node j really does not exist in the first candidate graph, a first number E_ij of the data included in the observation information indicate an edge between node i and node j.
S54: multiplying the edge correlation probabilities between every two nodes in the first candidate graph to obtain the observation information existence probability corresponding to the first candidate graph.
According to the method, the edge correlation probability between every two nodes in the first candidate graph can be calculated, and further, the edge correlation probability between every two nodes in the first candidate graph is multiplied to obtain the existence probability of the observation information corresponding to the first candidate graph. As can be seen from the foregoing, the observation information existence probability corresponding to the first candidate graph is used to indicate the probability of existence of observation information when the first candidate graph is used as the estimation graph.
Optionally, the edge correlation probability between every two nodes in the first candidate graph is multiplied to obtain the existence probability of the observation information corresponding to the first candidate graph, which can be represented by the following formula (7):
$$P\big(O \mid G, \alpha, \beta\big) = \prod_{i<j} \left[\alpha^{E_{ij}}(1-\alpha)^{M-E_{ij}}\right]^{G_{ij}} \left[\beta^{E_{ij}}(1-\beta)^{M-E_{ij}}\right]^{1-G_{ij}} \quad (7)$$
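As an illustration, formulas (6) and (7) can be sketched in plain Python; `E` is assumed to hold the per-pair counts E_ij, `M` is the total amount of observation data, and `G` is a candidate adjacency matrix — all names here are illustrative, not taken from the patent.

```python
def edge_correlation_prob(E_ij, M, G_ij, alpha, beta):
    """Per-pair term of formula (6): probability of E_ij of the M observation
    data indicating an edge, given G_ij (1 = edge exists, 0 = no edge)."""
    if G_ij == 1:
        return alpha ** E_ij * (1 - alpha) ** (M - E_ij)
    return beta ** E_ij * (1 - beta) ** (M - E_ij)

def observation_existence_prob(E, M, G, alpha, beta):
    """Formula (7): product of the per-pair terms over all node pairs i < j."""
    n = len(G)
    p = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            p *= edge_correlation_prob(E[i][j], M, G[i][j], alpha, beta)
    return p
```

For a two-node candidate graph the product in formula (7) reduces to the single per-pair term of formula (6).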
in one embodiment, the existence probability of the observation information corresponding to each candidate map in the plurality of candidate maps may be obtained through steps S51-S54, and further, the estimated adjacency matrix may be obtained through the above-mentioned step of performing map estimation processing according to the generation probability corresponding to each candidate map and the existence probability of the observation information corresponding to each candidate map, and the estimated map may be generated according to the estimated adjacency matrix.
In a specific implementation, assuming that N candidate maps include a second candidate map in addition to the first candidate map, the performing map estimation based on the generation probability corresponding to each candidate map and the observation information existence probability corresponding to each candidate map to obtain an estimated adjacency matrix includes:
acquiring the probability of the first parameter and the probability of the second parameter; inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the first candidate graph and the observation information existence probability corresponding to the first candidate graph into a candidate probability calculation rule for operation to obtain a first candidate probability of the first candidate graph as an estimation graph; inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the second candidate graph and the observation information existence probability corresponding to the second candidate graph into the candidate probability calculation rule for operation to obtain a second candidate probability of the second candidate graph as an estimation graph; and generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability and the adjacency matrix corresponding to the second candidate graph.
In brief, the first candidate probability of the first candidate graph serving as the estimation graph and the second candidate probability of the second candidate graph serving as the estimation graph are calculated first, and then an estimated adjacency matrix is generated based on the first candidate probability, the second candidate probability, and the adjacency matrices corresponding to the respective candidate graphs.
Wherein, the probability of the first parameter and the probability of the second parameter are determined according to the first parameter and the second parameter, respectively, the probability of the first parameter can be expressed as P (Ω), and since the second parameter includes two of α and β, the probability of the second parameter also includes two, respectively expressed as P (α) and P (β). The candidate probability calculation rule may be expressed as shown in the following formula (8):
$$P\big(G, \Omega, \alpha, \beta \mid O\big) = \frac{P\big(O \mid G, \alpha, \beta\big)\, P\big(G \mid \Omega\big)\, P(\Omega)\, P(\alpha)\, P(\beta)}{P(O)} \quad (8)$$

In formula (8), G represents the adjacency matrix corresponding to any candidate graph, P(G|Ω) represents the generation probability corresponding to that candidate graph, and P(O|G,α,β) represents the observation information existence probability corresponding to that candidate graph. Substituting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the first candidate graph and the observation information existence probability corresponding to the first candidate graph into formula (8) yields the first candidate probability corresponding to the first candidate graph; similarly, substituting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the second candidate graph and the observation information existence probability corresponding to the second candidate graph into formula (8) yields the second candidate probability corresponding to the second candidate graph.
Optionally, after the first candidate probability of the first candidate graph as the estimation graph and the second candidate probability of the second candidate graph as the estimation graph are obtained by the above formula (8), a probability distribution, denoted q(G), may be generated from the first candidate probability and the second candidate probability. The probability distribution includes the first candidate probability and the second candidate probability, and if the N candidate graphs include only the first candidate graph and the second candidate graph, the sum of the two candidate probabilities in q(G) is equal to 1.
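Because P(Ω), P(α), P(β) and P(O) in formula (8) are shared by every candidate graph, the candidate probabilities in q(G) can be obtained, up to that common factor, by normalizing each candidate's product of generation probability and observation information existence probability. A minimal sketch with illustrative names:

```python
def candidate_distribution(gen_probs, obs_probs):
    """Normalize per-candidate generation x observation-existence products
    into the distribution q(G) of formula (8); one entry per candidate graph."""
    scores = [g * o for g, o in zip(gen_probs, obs_probs)]
    total = sum(scores)
    return [s / total for s in scores]
```

With two candidate graphs, the two returned probabilities sum to 1, as noted above.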
Further, an estimated adjacency matrix is generated based on the first candidate probability, the adjacency matrix corresponding to the first candidate map, the second candidate probability, and the adjacency matrix corresponding to the second candidate map.
In a specific implementation, it is assumed that the adjacency matrix corresponding to the first candidate graph includes M rows and M columns, and then the adjacency matrix corresponding to the first candidate graph includes M by M matrix elements, and similarly, the adjacency matrix corresponding to the second candidate graph also includes M rows and M columns, and then the adjacency matrix corresponding to the second candidate graph also includes M by M matrix elements; estimating that the adjacency matrix comprises M rows and M columns, and M times M target matrix elements; the generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph includes:
acquiring a first matrix element at the ith row and jth column position in the adjacency matrix corresponding to the first candidate graph, and acquiring a second matrix element at the ith row and jth column position in the adjacency matrix corresponding to the second candidate graph; multiplying the first matrix element by the first candidate probability to obtain a first operation result, and multiplying the second matrix element by the second candidate probability to obtain a second operation result; and adding the first operation result and the second operation result, the sum serving as the target matrix element at the ith row and jth column position in the estimated adjacency matrix.
The above process of calculating the target matrix element can be expressed by equation (9):
$$Q_{ij} = \sum_{G \in \mathcal{G}} q(G)\, G_{ij} \quad (9)$$

In formula (9), \(\mathcal{G}\) represents the set of adjacency matrices corresponding to the N candidate graphs, G_ij represents the matrix element located at the ith row and jth column in any one adjacency matrix, and Q_ij represents the target matrix element located at the ith row and jth column in the estimated adjacency matrix.
For example, assume that the adjacency matrix corresponding to the first candidate graph is shown as 600 in fig. 6, the adjacency matrix corresponding to the second candidate graph is shown as 601 in fig. 6, and the estimated adjacency matrix is shown as 602 in fig. 6. The three adjacency matrices all have 5 rows and 5 columns and thus include 5 × 5 = 25 matrix elements. Since the first candidate graph and the second candidate graph are known, each matrix element in the adjacency matrices 600 and 601 is known, but each target matrix element in the estimated adjacency matrix is unknown and needs to be derived from the adjacency matrices of the two candidate graphs. Specifically, the matrix element 1 at the first-row, second-column position in the adjacency matrix 600 is multiplied by the first candidate probability corresponding to the first candidate graph; assuming the first candidate probability is 0.6, the multiplication result is 0.6. Then, the matrix element 0 at the first-row, second-column position in the adjacency matrix 601 is multiplied by the second candidate probability corresponding to the second candidate graph; assuming the second candidate probability is 0.4, the multiplication result is 0. Thus 0.6 + 0 = 0.6 is taken as the target matrix element at the first-row, second-column position in the estimated adjacency matrix. By analogy, each target matrix element in the estimated adjacency matrix can be obtained.
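As a check on the worked example above, formula (9) can be computed with NumPy; the 2×2 matrices below are illustrative stand-ins for the 5×5 matrices of fig. 6.

```python
import numpy as np

# Stand-ins for the known candidate adjacency matrices (fig. 6 uses 5 x 5).
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])  # first candidate graph, probability 0.6
A2 = np.array([[0.0, 0.0], [0.0, 0.0]])  # second candidate graph, probability 0.4

def estimated_adjacency(adjs, probs):
    """Formula (9): Q_ij = sum over candidates of q(G) * G_ij, element-wise."""
    return sum(p * A for p, A in zip(probs, adjs))

Q = estimated_adjacency([A1, A2], [0.6, 0.4])
# First-row, second-column element: 0.6 * 1 + 0.4 * 0 = 0.6
```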
After the estimated adjacency matrix is obtained through the above steps, an estimated map can be generated according to the estimated adjacency matrix.
Step S504, the graph prediction model is called to carry out prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, and the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information.
In one embodiment, in the process of generating the estimated adjacency matrix in step S503, the first parameter of the structure submodel and the second parameter of the observation submodel are assumed to have been optimized. In practical applications, in order to make the graph estimator and the graph prediction model match better, the first parameter and the second parameter may also be optimized during the generation of the estimated adjacency matrix, and the estimated adjacency matrix is then updated according to the optimized first parameter and the optimized second parameter.
In a specific implementation, before generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph, the method further includes: optimizing the first parameter and the second parameter according to the first candidate probability and the second candidate probability; and updating the first candidate probability and the second candidate probability respectively by adopting the first parameter after optimization processing and the second parameter after optimization processing. Thus, the estimated adjacency matrix finally generated in step S503 is obtained based on the updated first candidate probability and the updated second candidate probability.
In one embodiment, the optimizing the first parameter and the second parameter according to the first candidate probability and the second candidate probability includes:
adding the first candidate probability and the second candidate probability to obtain a joint posterior probability expression of the first parameter and the second parameter; transforming the joint posterior probability expression by using a preset inequality; performing a derivation operation on the transformed joint posterior probability with respect to the first parameter to obtain an optimization equation corresponding to the first parameter, and performing a derivation operation on the transformed joint posterior probability with respect to the second parameter to obtain an optimization equation corresponding to the second parameter; and solving the optimization equation corresponding to the first parameter to obtain the optimized first parameter, and solving the optimization equation corresponding to the second parameter to obtain the optimized second parameter.
The expression of the joint posterior probability of the first parameter and the second parameter obtained by adding the first candidate probability and the second candidate probability can be expressed by the following formula (10):
$$P\big(\Omega, \alpha, \beta \mid O\big) = \sum_{G \in \mathcal{G}} P\big(G, \Omega, \alpha, \beta \mid O\big) \quad (10)$$

In formula (10), \(\mathcal{G}\) represents the set of adjacency matrices corresponding to each of the N candidate graphs, and each summand P(G, Ω, α, β | O) represents the candidate probability corresponding to one candidate graph among the N candidate graphs.
Then, the joint posterior probability expression is transformed by using a preset inequality, such as Jensen's inequality; that is, formula (10) is transformed to obtain formula (11):

$$\log P\big(\Omega, \alpha, \beta \mid O\big) = \log \sum_{G \in \mathcal{G}} P\big(G, \Omega, \alpha, \beta \mid O\big) \ge \sum_{G \in \mathcal{G}} q(G) \log \frac{P\big(G, \Omega, \alpha, \beta \mid O\big)}{q(G)} \quad (11)$$
Further, formula (11) is differentiated with respect to the first parameter to obtain the optimization equation corresponding to the first parameter, as shown in the following formula (12):

$$\sum_{G \in \mathcal{G}} q(G) \sum_{i<j} \delta_{g_i r}\, \delta_{g_j s} \left[\frac{G_{ij}}{\Omega_{rs}} - \frac{1-G_{ij}}{1-\Omega_{rs}}\right] = 0 \quad (12)$$
Formula (11) is likewise differentiated with respect to the second parameter to obtain the optimization equations corresponding to the second parameter, as shown in the following formulas (13) and (14):

$$\sum_{G \in \mathcal{G}} q(G) \sum_{i<j} G_{ij} \left[\frac{E_{ij}}{\alpha} - \frac{M-E_{ij}}{1-\alpha}\right] = 0 \quad (13)$$

$$\sum_{G \in \mathcal{G}} q(G) \sum_{i<j} \big(1-G_{ij}\big) \left[\frac{E_{ij}}{\beta} - \frac{M-E_{ij}}{1-\beta}\right] = 0 \quad (14)$$
solving equation (12) may result in the optimized first parameter, and solving equations (13) and (14) may result in the optimized second parameter.
For computational convenience, the order of summation of equation (12) can be exchanged, and the following rule is found to exist:
$$\hat{\Omega}_{rs} = \frac{1}{n_{rs}} \sum_{i<j} \delta_{g_i r}\, \delta_{g_j s}\, Q_{ij} \quad (15)$$

wherein

$$n_{rs} = \sum_{i<j} \delta_{g_i r}\, \delta_{g_j s}$$

In formula (15), r and s represent two communities, and the interpretation of formula (15) is that the probability Ω_rs of an edge existing between community r and community s is the average probability that an edge exists over all node pairs spanning the two communities.
Similar calculations are also performed for equations (13) and (14), respectively, resulting in equations (16) and (17):
$$\hat{\alpha} = \frac{\sum_{i<j} Q_{ij}\, E_{ij}}{M \sum_{i<j} Q_{ij}} \quad (16)$$

$$\hat{\beta} = \frac{\sum_{i<j} \big(1-Q_{ij}\big) E_{ij}}{M \sum_{i<j} \big(1-Q_{ij}\big)} \quad (17)$$
the first parameter after the optimization processing and the second parameter after the optimization processing can be obtained through the formulas (15) to (17). Further, the estimated adjacency matrix is updated based on the optimized first parameter and the optimized second parameter. In a specific implementation, updating the estimated adjacency matrix based on the optimized first parameter and the optimized second parameter may be understood as substituting the above equations (15) - (17) into equation (9) for determining the estimated adjacency matrix, rewriting the equation for determining the estimated adjacency matrix, and expressing the updated estimated adjacency matrix as the following equation (18):
$$Q_{ij} = \frac{\Omega_{g_i g_j}\, \alpha^{E_{ij}} (1-\alpha)^{M-E_{ij}}}{\Omega_{g_i g_j}\, \alpha^{E_{ij}} (1-\alpha)^{M-E_{ij}} + \big(1-\Omega_{g_i g_j}\big)\, \beta^{E_{ij}} (1-\beta)^{M-E_{ij}}} \quad (18)$$
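Assuming the marginal edge probabilities Q_ij summarize the candidate distribution, the parameter updates of formulas (15)-(18) can be sketched as follows; the variable names (Q for the estimated adjacency, E for per-pair observation counts, g for community labels, K for the number of communities) are illustrative, and this is a sketch rather than the patent's reference implementation.

```python
import numpy as np

def em_step(Q, E, M, g, K):
    """One update of Omega, alpha, beta and Q following formulas (15)-(18)."""
    n = Q.shape[0]
    iu = np.triu_indices(n, k=1)
    q, e = Q[iu], E[iu]
    alpha = (q * e).sum() / (M * q.sum())              # formula (16)
    beta = ((1 - q) * e).sum() / (M * (1 - q).sum())   # formula (17)
    # Formula (15): average Q_ij over node pairs between communities r and s.
    Omega = np.zeros((K, K))
    pairs = np.zeros((K, K))
    for i, j in zip(*iu):
        r, s = g[i], g[j]
        Omega[r, s] += Q[i, j]; pairs[r, s] += 1
        if r != s:
            Omega[s, r] += Q[i, j]; pairs[s, r] += 1
    Omega = np.divide(Omega, pairs, out=np.zeros_like(Omega), where=pairs > 0)
    # Formula (18): updated posterior edge probability per node pair.
    newQ = np.zeros_like(Q)
    for i, j in zip(*iu):
        w = Omega[g[i], g[j]]
        a = w * alpha ** E[i, j] * (1 - alpha) ** (M - E[i, j])
        b = (1 - w) * beta ** E[i, j] * (1 - beta) ** (M - E[i, j])
        newQ[i, j] = newQ[j, i] = a / (a + b) if a + b > 0 else 0.0
    return alpha, beta, Omega, newQ

# Tiny illustrative run: 3 nodes, one community, M = 2 observations per pair.
Qd = np.array([[0.0, 0.9, 0.1], [0.9, 0.0, 0.5], [0.1, 0.5, 0.0]])
Ed = np.array([[0, 2, 0], [2, 0, 1], [0, 1, 0]])
alpha, beta, Omega, Q_new = em_step(Qd, Ed, M=2, g=[0, 0, 0], K=1)
```

A pair that is both currently likely and frequently observed (Q_01 = 0.9, E_01 = 2) has its posterior edge probability pushed higher by formula (18).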
In one embodiment, during the above optimization of the first parameter and the second parameter, the expression of the probability distribution q(G) over the candidate graphs is continuously updated. The right side of the inequality in the above formula (11) is maximized when equality holds, and when equality holds, the expression of q(G) can be expressed as shown in formula (19):

$$q(G) = \frac{P\big(G, \Omega, \alpha, \beta \mid O\big)}{\sum_{G' \in \mathcal{G}} P\big(G', \Omega, \alpha, \beta \mid O\big)} \quad (19)$$
Substituting formulas (4) and (7) into formula (19) and eliminating the constants in the fractions, the expression of q(G) can be transformed as shown in formula (20):

$$q(G) = \frac{\prod_{i<j} \left[\Omega_{g_i g_j}\, \alpha^{E_{ij}} (1-\alpha)^{M-E_{ij}}\right]^{G_{ij}} \left[\big(1-\Omega_{g_i g_j}\big)\, \beta^{E_{ij}} (1-\beta)^{M-E_{ij}}\right]^{1-G_{ij}}}{\sum_{G' \in \mathcal{G}} \prod_{i<j} \left[\Omega_{g_i g_j}\, \alpha^{E_{ij}} (1-\alpha)^{M-E_{ij}}\right]^{G'_{ij}} \left[\big(1-\Omega_{g_i g_j}\big)\, \beta^{E_{ij}} (1-\beta)^{M-E_{ij}}\right]^{1-G'_{ij}}} \quad (20)$$
Then, formula (20) is rewritten based on formula (18), and finally the expression of q(G) is obtained as shown in formula (21):

$$q(G) = \prod_{i<j} Q_{ij}^{G_{ij}} \big(1-Q_{ij}\big)^{1-G_{ij}} \quad (21)$$
in an embodiment, after the estimated adjacency matrix is updated, an estimated graph may be generated according to the updated estimated adjacency matrix, and further, a graph prediction model is invoked to perform prediction processing on the estimated graph to obtain prediction information corresponding to the estimated graph, and the graph prediction model is optimized based on the prediction information corresponding to the estimated graph and the label information.
Optionally, the target matrix elements in the estimated adjacency matrix obtained by the above steps lie in the interval [0, 1], but a fully connected graph is not only computationally expensive but also of little significance for most applications. Therefore, when generating the estimation graph from the estimated adjacency matrix, the estimated adjacency matrix may first be sparsified to extract a target adjacency matrix from it; then, the estimation graph is generated from the target adjacency matrix.
In a specific implementation, the thinning the estimated adjacency matrix and extracting a target adjacency matrix from the estimated adjacency matrix includes: traversing each target matrix element in the estimated adjacency matrix, and replacing matrix elements smaller than or equal to a threshold value in the estimated adjacency matrix with zeros to obtain the target adjacency matrix.
For example, assuming that the threshold is set to ε, extracting the target adjacency matrix from the estimated adjacency matrix may be represented by the following formula (22):

$$S_{ij} = \begin{cases} Q_{ij}, & Q_{ij} > \epsilon \\ 0, & Q_{ij} \le \epsilon \end{cases} \quad (22)$$
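As an illustration, the sparsification of formula (22) is a one-liner with NumPy:

```python
import numpy as np

def sparsify(Q, eps):
    """Formula (22): keep Q_ij only where it exceeds the threshold epsilon."""
    return np.where(Q > eps, Q, 0.0)

S = sparsify(np.array([[0.0, 0.6], [0.2, 0.9]]), 0.3)
# Elements less than or equal to 0.3 are replaced with zero.
```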
and step S505, if the training ending event is not detected, acquiring observation information corresponding to the estimation graph in the process of predicting the estimation graph by using the prediction model of the call graph.
Step S506: calling the graph estimator to perform estimation processing based on the label information and the observation information corresponding to the estimation graph to obtain a new estimation graph, and calling the optimized graph prediction model to perform prediction processing on the new estimation graph to obtain prediction information corresponding to the new estimation graph.
Step S507: updating the optimized graph prediction model based on the prediction information and the label information corresponding to the new estimation graph.
Some possible implementations included in steps S505 to S507 in one embodiment may refer to descriptions of related steps in fig. 3, and are not described herein again.
In one embodiment, training of the graph structure estimation model is stopped if a training-ending event is detected. At this point, if the model processing device obtains a graph to be processed, the graph prediction model in the optimized graph structure estimation model can be called to perform prediction processing on the graph to be processed, so as to obtain target observation information corresponding to the graph to be processed; then, the graph estimator in the graph structure estimation model is called to perform estimation processing based on the target observation information to obtain a target estimation graph; finally, the graph prediction model is called to process the target estimation graph to obtain a prediction result corresponding to the target estimation graph, the prediction result being used to indicate the category of each node in the graph to be processed.
The graph structure estimation model trained by the embodiment of the invention can be applied to any application based on the graph structure, such as friend recommendation, advertisement recommendation and the like.
The embodiment of the invention provides a graph structure estimation model which comprises a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, a graph prediction model in the graph structure estimation model can be used for carrying out prediction processing on an initial graph to obtain observation information corresponding to the initial graph, wherein the observation information comprises prediction information corresponding to the initial graph; then, the calling graph estimator carries out estimation processing based on the label information and the observation information to obtain an estimation graph; further, calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Further, if the training ending event is not detected, acquiring observation information corresponding to the estimation graph obtained in the process of predicting the estimation graph by using the graph prediction model; and calling the graph estimator to perform estimation processing based on the label information and observation information corresponding to the estimation graph to obtain a new estimation graph, and calling the optimized graph prediction model to perform prediction processing on the new estimation graph to obtain prediction information corresponding to the new estimation graph. The optimized graph prediction model is updated based on the prediction information and label information corresponding to the new estimation graph. And continuously repeating the steps until the training ending event is detected, and stopping training.
Through the above process, the graph prediction model is optimized not only based on the initial graph and its label information, but also based on the estimation graph. The estimation graph is obtained by the graph estimator estimating from the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is observed from the perspective of the graph prediction model; that is, the estimation graph matches the properties of the graph prediction model better than the initial graph does, and is thus more consistent with the graph structure estimation model. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thereby the accuracy of the graph structure estimation model.
In order to more vividly illustrate the training process of the graph structure estimation model in the embodiment of the present invention, based on the method embodiments of fig. 3 and fig. 5, a training schematic diagram of the graph structure estimation model provided in the embodiment of the present invention is shown in fig. 7, and a training algorithm flow is shown in table 1. The training algorithm shown in table 1 is applied to the graph structure estimation model shown in fig. 7 to complete the training of the graph structure estimation model.
The training process of the graph structure estimation model in the embodiment of the present invention, summarized according to the training algorithm shown in table 1, can be roughly described as follows: the node feature matrix X of the initial graph, the adjacency matrix A, the label information (labels), the number of neighbors of each neighbor graph (k in kNN), the tolerance λ, the threshold ε and the iteration threshold τ are input into the graph structure estimation model. If the current training round i is smaller than or equal to the iteration threshold τ, the weight parameters θ of each convolutional layer of the graph prediction model are first updated until the graph prediction model converges; then, the initial graph is predicted based on the converged graph prediction model to obtain observation information corresponding to the initial graph; next, the first parameter Ω of the structure submodel and the second parameters α and β of the observation submodel in the graph estimator are updated through steps 5, 6 and 7, and the estimated adjacency matrix Q is calculated according to the updated first parameter and the updated second parameters; further, the estimated adjacency matrix Q is sparsified to obtain the target adjacency matrix S^(i), and an estimation graph is generated according to the target adjacency matrix S^(i). The above training process is repeated until the current training round i is greater than the iteration threshold τ, at which point the training of the graph structure estimation model is completed.
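For orientation only, the control flow of the table 1 algorithm can be sketched in Python; `fit_gnn`, `observe` and `estimate` are hypothetical stand-ins for the three components described above (GCN training, observation collection, and graph estimation), not names used by the patent.

```python
import numpy as np

def train_gse(X, A, labels, tau, eps, fit_gnn, observe, estimate):
    """Skeleton of the table 1 loop: alternate GCN training, observation
    collection, graph estimation, and sparsification for tau rounds."""
    S = A                                  # current structure estimate
    theta = None
    for _ in range(tau):                   # rounds i = 1 .. tau
        theta = fit_gnn(S, X, labels)      # update GCN weights to convergence
        O = observe(S, X, theta)           # predictions Z + kNN graphs + A
        Q = estimate(O, labels)            # update Omega, alpha, beta -> Q
        S = np.where(Q > eps, Q, 0.0)      # target adjacency S^(i), formula (22)
    return S, theta

# Dummy components just to exercise the control flow:
S, theta = train_gse(
    X=None, A=np.zeros((2, 2)), labels=None, tau=3, eps=0.3,
    fit_gnn=lambda S, X, y: "theta",
    observe=lambda S, X, th: "observations",
    estimate=lambda O, y: np.array([[0.0, 0.6], [0.6, 0.0]]),
)
```

Passing the three stages as callables keeps the skeleton runnable while leaving their internals (described in the steps above) abstract.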
TABLE 1
[Table 1, presented as an image in the original document, lists the training algorithm of the graph structure estimation model.]
In fig. 7, the graph structure estimation model 700 includes a graph prediction model 701, which may include l convolutional layers, and a graph estimator 702, which includes a structure submodel 7021 and an observation submodel 7022. In the embodiment of the present invention, the initial graph is represented by its corresponding adjacency matrix A. The node feature matrix S corresponding to the initial graph is input into the graph prediction model, and each of the l convolutional layers included in the graph prediction model performs convolution processing, the input of each convolutional layer being the output of the previous convolutional layer. Each convolutional layer obtains the node representation matrix of that layer after convolution processing; as shown in fig. 7, the node representation matrix obtained after convolution by the first convolutional layer is denoted H^(1), the node representation matrix obtained after convolution by the second convolutional layer is denoted H^(2), and so on, the node representation matrix obtained after convolution by the l-th convolutional layer is denoted H^(l). As can be seen from the foregoing, the node representation corresponding to the node feature matrix of the initial graph can be denoted H^(0).
Further, a neighbor graph is generated according to the node representation matrix corresponding to each convolutional layer, and the adjacency matrix corresponding to each neighbor graph is obtained. As shown in fig. 7, the neighbor graph obtained from the node representation matrix H^(1) corresponding to the first convolutional layer is 703, and the adjacency matrix corresponding to the neighbor graph 703 is denoted O^(1); the neighbor graph obtained from the node representation matrix H^(2) corresponding to the second convolutional layer is 704, and the adjacency matrix corresponding to the neighbor graph 704 is denoted O^(2); the neighbor graph constructed from the node representation matrix H^(l) corresponding to the l-th convolutional layer is 705, and the adjacency matrix corresponding to the neighbor graph 705 is denoted O^(l). The target neighbor graph constructed according to the node feature matrix S corresponding to the initial graph is 706, and the adjacency matrix corresponding to the target neighbor graph 706 is denoted O^(0). The neighbor graph set and the initial graph are combined into an observation graph set obtained by observing the initial graph.
Further, the initial graph A and the neighbor graphs form the observation graph set, which may be represented as:
O = {A, O^(0), O^(1), …, O^(l)}
next, the model processing device normalizes the output of the last convolutional layer, i.e., the l-th convolutional layer, to obtain the prediction information corresponding to the initial graph, denoted Z. The prediction information Z and the observation graph set together form one piece of observation information. The observation information and the label information corresponding to the initial graph are input into the graph estimator 702: the structure submodel 7021 in the graph estimator 702 generates a plurality of candidate graphs based on the label information and obtains the generation probability of each candidate graph, the observation submodel 7022 determines the existence probability of the observation information corresponding to each candidate graph, and an estimated adjacency matrix Q is obtained from the observation information existence probability and the generation probability of each candidate graph. Finally, an estimation graph is generated from the estimated adjacency matrix Q and input into the graph prediction model 701 to continue the next round of training.
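The iterative flow described above (graph prediction model, then graph estimator, then a new estimation graph fed back for the next round of training) can be sketched as follows. Here `predict` and `estimate` are hypothetical stand-ins for the graph prediction model 701 and the graph estimator 702; the function names and signatures are illustrative, not part of the claimed implementation.

```python
def train_gen(A0, S, labels, predict, estimate, n_iterations=3):
    """One-loop sketch of Fig. 7: predict on the current graph, then
    re-estimate the graph from the resulting observation information.

    predict:  (adjacency, features) -> (prediction info Z, observation set)
    estimate: (observation set, Z, labels) -> estimated adjacency for the
              next round of training
    Both callables are hypothetical placeholders.
    """
    A = A0
    history = []
    for _ in range(n_iterations):
        Z, observations = predict(A, S)        # graph prediction model (701)
        A = estimate(observations, Z, labels)  # graph estimator (702)
        history.append(A)
    return A, history
```

In the full model, `predict` would also be optimized against the label information after each round, as described in the embodiments of FIG. 3 and FIG. 5.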
In addition, in order to verify the advantages of the graph structure estimation model in the embodiment of the present invention in practical application, the graph structure estimation model in the embodiment of the present invention is compared with a graph processing model constructed based on GNN in the prior art in terms of performance of node classification in a graph.
In a specific implementation, the data used for the performance comparison comes from the 6 datasets shown in table 2:
TABLE 2
[Table 2 appears as an image in the original filing.]
In table 2, Cora, Citeseer and Pubmed are citation network datasets. In these networks, nodes represent papers and edges represent citation relationships. Node features are bag-of-words representations of papers, while labels are academic areas. Chameleon and Squirrel are page-to-page networks in Wikipedia on a specific topic. In these datasets, the nodes are web pages and the edges represent hyperlinks. The node features are informative nouns in the page, and the labels correspond to the monthly traffic of the page. The Actor dataset is the actor-only induced subgraph of a film-director-actor-writer network. Each node represents an actor and the edges represent collaborations. The node features are keywords in Wikipedia and the labels are actor types.
The 6 datasets can be divided into homophilous graphs (Cora, Citeseer and Pubmed) and heterophilous graphs (Chameleon, Squirrel and Actor) according to the homogeneity of each dataset. For the challenging datasets Chameleon, Squirrel and Actor, the supervision was set as a typical semi-supervised training/validation/test partition in this experiment.
During testing, the graph structure estimation model (GEN) of the embodiments of the present invention was compared with three classes of representative GNNs: three spectral-based graph processing models (SGC, GCN, and ChebNet), three spatial-based graph processing models (GAT, APPNP, and GraphSAGE), and three graph-structure-learning-based graph processing models (LDS, Pro-GNN, and Geom-GCN). In the embodiment of the present invention, for convenience of description, SGC, GAT, and LDS are selected for comparison with the GEN of the embodiment of the present invention.
In a specific experimental comparison process, the graph structure estimation model provided by the embodiment of the invention is realized based on a deep learning library PyTorch. All experimental comparisons were performed on a Linux server with a Graphics Processing Unit (GPU) and a Central Processing Unit (CPU).
In the graph structure estimation model in the embodiment of the invention, the graph prediction model is based on GCN and is trained for 200 epochs using the Adam optimizer. The initial learning rate is 0.01 and the weight decay is 5e-4. ReLU is set as the activation function of the graph prediction model, and a dropout rate of 0.5 is applied to further prevent overfitting. For the hyperparameter grid search, the embedding dimension d is selected from {8, 16, 32, 64, 128}, the k of kNN is tuned from 3 to 15, the tolerance λ is searched in {0.1, 0.01, 0.001}, and the threshold ε is tuned in {0.1, 0.2, …, 0.9}. The iteration-count threshold τ is fixed at 50, and the model with the highest validation-set accuracy is selected for testing.
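The search spaces listed above can be enumerated with a plain grid search. The following sketch only encodes the values stated in the text; the dictionary keys and helper name are illustrative assumptions.

```python
import itertools

# Hyperparameter search spaces stated above; key names are illustrative.
search_space = {
    "embedding_dim": [8, 16, 32, 64, 128],
    "knn_k": list(range(3, 16)),                             # k of kNN, tuned from 3 to 15
    "tolerance": [0.1, 0.01, 0.001],                         # λ
    "threshold": [round(0.1 * i, 1) for i in range(1, 10)],  # ε in {0.1, ..., 0.9}
}

# Fixed training settings stated above.
fixed = {"optimizer": "Adam", "lr": 0.01, "weight_decay": 5e-4,
         "dropout": 0.5, "epochs": 200, "iteration_threshold": 50}

def grid(space):
    """Yield every hyperparameter combination in the search space."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))
```

Each configuration would then be trained for 200 epochs and the one with the highest validation accuracy retained, as described above.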
In the test process, SGC, GCN, GAT, APPNP and GraphSAGE as implemented by PyTorch Geometric were used. For the remaining benchmark methods ChebNet, LDS, Pro-GNN, and Geom-GCN, their corresponding source codes were used. Hyperparameter search was performed for all models on the validation set. For fairness, the graph structure estimation model proposed in the embodiments of the present invention and the other graph processing models share the same search space for the common hyperparameters, including embedding dimension, initial learning rate, weight decay, and dropout rate.
The node classification performance of the GEN model compared with the other graph processing models can be shown in table 3 below. Table 3 shows the classification performance of GEN, SGC, GAT and LDS on three datasets; the models not shown (GCN, ChebNet, APPNP, GraphSAGE, Pro-GNN and Geom-GCN) also perform worse than GEN on these three datasets. Similarly, on the datasets other than these 3, the classification performance of each comparison graph processing model is inferior to that of GEN. For convenience of description, the embodiments of the present invention do not list all results in table form.
TABLE 3
[Table 3 appears as an image in the original filing.]
From the comparison results, the GEN model consistently outperformed the other models on the 3 datasets shown, and on the other 3 datasets, although not shown in the table, the performance of the GEN model was also superior to the other models, especially when labels are fewer and homogeneity is lower. This shows that the GEN model of the embodiment of the present invention can improve node classification performance in a robust manner. As labels and homogeneity decrease, the performance of the baseline GNNs drops rapidly, and the performance improvement of the GEN model becomes more obvious. These phenomena match the expectations of the embodiments of the present invention: a noisy or sparse graph structure may prevent GNNs from aggregating valid information, and the estimation graph of the embodiments of the present invention can alleviate this problem. The consistent advantage of GEN over the other models means that GEN is able to estimate an appropriate graph structure. Compared with the other models, the performance improvement of the GEN model in the embodiment of the invention shows that explicitly constraining the community structure and fully utilizing multi-aspect information makes it possible to learn a better graph structure and more reliable GCN parameters.
For the reported results, 20, 5 and 10 labels per class were set during the test, respectively. The mean and standard deviation of 10 random independent experiments are reported in table 3.
To further understand the improvement in performance of the GEN model over the other GNN-based graph processing models, embodiments of the present invention analyze the change in prediction confidence. Specifically, for each training-set and test-set node v_i in the Cora and Chameleon datasets, the prediction confidence of the true class y_i corresponding to the node is selected from the final prediction information of the GCN and GEN models and plotted as the box plot shown in figure 8a.
In the box plot shown in FIG. 8a, the boxes represent the 25-75% range; the median is represented by the bold dashed line, shown at 800 in FIG. 8a; the solid black filled circles represent the mean values, as at 801 in FIG. 8a. The whiskers extend to the minimum and maximum data points that are not outliers. For a more intuitive representation, for each box, the prediction confidences of 50 nodes are shown with dotted gray filled circles, as at 802 in FIG. 8a. It can be seen from FIG. 8a that for the Cora dataset, and for the more challenging Chameleon dataset, the prediction confidence obtained on the estimation graph is much higher than on the initial graph. Presumably, injecting multi-order neighbor similarities into graph structure learning enables the learned graph to connect more node pairs with similar information, thereby making the classification distribution more peaked and providing the GCN with the ability to correct misclassified nodes.
To understand the iterative estimation process of GEN, the embodiment of the present invention plots the variation curves of the true-positive probability α and the false-positive probability β on the Cora and Chameleon datasets, as shown in fig. 8b: 81 denotes the curve of the true-positive probability α on the Cora dataset, 82 denotes the curve of the false-positive probability β on the Cora dataset, and, on the Chameleon dataset, 83 denotes the curve of the true-positive probability α and 84 the curve of the false-positive probability β. In fig. 8b, the horizontal axis represents the alternating expectation and maximization steps of the EM algorithm, with the dashed lines separating the different iteration rounds of GEN. The tolerance λ is fixed to 0.001 in fig. 8b. As can be seen from fig. 8b, the true-positive probability α gradually increases during the iteration process, meaning that the observation information built by the graph prediction model becomes more and more accurate.
The embodiment of the invention further compares the convergence speed of the GEN model and other graph processing models. Referring to fig. 9a, accuracy curves of the GEN model and other graph processing models are provided according to an embodiment of the present invention. In fig. 9a, 90A represents the accuracy curve of the LDS model on the Cora dataset, and 90B represents the accuracy curve of the LDS model on the Chameleon dataset; 91A represents the accuracy curve of the Pro-GNN model on the Cora dataset, and 91B represents the accuracy curve of the Pro-GNN model on the Chameleon dataset; 92A represents the accuracy curve of the GEN model on the Cora dataset, and 92B represents the accuracy curve of the GEN model on the Chameleon dataset. In fig. 9a, the abscissa is the number of iteration rounds and the ordinate is the accuracy. As can be seen in fig. 9a, GEN has a faster convergence speed and better validation-set accuracy on both datasets, demonstrating the efficiency and effectiveness of GEN. In addition, for the Chameleon dataset, the validation-set accuracy of LDS and Pro-GNN fluctuates greatly, while GEN steadily improves its accuracy, which again confirms the robustness brought by considering the graph generation principle and multi-aspect information.
The foregoing comparisons demonstrate the effectiveness of the GEN model on real datasets; the mechanism of the GEN model and the nature of the estimation graph are described below.
In order to better explore the mechanism of GEN, the estimation graph is studied in the embodiment of the present invention using an attributed stochastic block model (SBM), which is widely used as a benchmark in graph clustering methods. For better visualization and analysis, the SBM version in the embodiment of the present invention has 5 communities, each community having 20 nodes. A symmetric probability matrix is randomly initialized to generate edges, with the diagonal element largest in its row in most cases to ensure a certain degree of homogeneity. The 8-dimensional node features are generated using a multivariate normal distribution, where nodes in different communities share a random covariance matrix but have different means according to their own community. For the training/validation/test partition, 5 nodes per class are used for training, 5 nodes for validation, and the rest as the test set.
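A toy generator matching the description above (5 communities of 20 nodes, a symmetric edge-probability matrix with a dominant diagonal, and 8-dimensional Gaussian features with a shared covariance and per-community means) might be sketched as follows. The exact probability ranges and scales are assumptions for illustration, not values from the text.

```python
import numpy as np

def generate_attributed_sbm(n_communities=5, nodes_per_community=20,
                            feat_dim=8, seed=0):
    """Toy attributed SBM sketch; probability ranges are assumed."""
    rng = np.random.default_rng(seed)
    n = n_communities * nodes_per_community
    labels = np.repeat(np.arange(n_communities), nodes_per_community)
    # Symmetric edge-probability matrix with the diagonal made dominant.
    P = rng.uniform(0.02, 0.1, (n_communities, n_communities))
    P = (P + P.T) / 2
    np.fill_diagonal(P, P.max() + rng.uniform(0.2, 0.4, n_communities))
    # Sample an undirected, loop-free adjacency matrix from P.
    probs = P[labels[:, None], labels[None, :]]
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(int)
    # Features: one random covariance shared by all communities,
    # but a different mean per community.
    M = rng.normal(size=(feat_dim, feat_dim))
    cov = M @ M.T / feat_dim
    means = rng.normal(scale=2.0, size=(n_communities, feat_dim))
    X = np.vstack([rng.multivariate_normal(means[c], cov,
                                           size=nodes_per_community)
                   for c in range(n_communities)])
    return A, X, labels
```

This sketch always makes the diagonal dominant, whereas the text requires it only "in most cases"; the simplification does not affect the illustration.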
In order to visually inspect the graph structure change caused by GEN, the initial graph and the estimation graph are visualized at 911 and 913 in fig. 9b using the Gephi tool. At the same time, local details of the graphs are enlarged and a particular node is selected to highlight the change in its neighborhood, at 912 and 914 of FIG. 9b. As can be seen at 911 in fig. 9b, the initial graph is relatively confused and there are many connections between communities, which is reflected more clearly by the neighborhood of the selected node. In this case, the classification accuracy of the graph prediction model GCN is only 60%. As can be seen at 913 in fig. 9b, after applying the GEN model of the embodiment of the present invention, the community structure of the estimation graph is clear: many erroneous edges are removed and strongly connected relationships are preserved, while the classification accuracy of GEN is improved to 84%.
To quantify the community structure change before and after estimation, the inter-community probability matrix of the initial graph and that of the estimation graph can be calculated and plotted as heat maps. Referring to fig. 9c, a schematic diagram of the heat maps of the probability matrices is shown. In fig. 9c, 901 denotes the inter-community probability matrix of the initial graph, and 902 denotes that of the estimation graph. As can be seen from fig. 9c, some off-diagonal color blocks in 901 are darker than the diagonal blocks, which indicates that the initial graph cannot maintain good homogeneity, and the high-probability connections between communities may make the graph prediction model difficult to optimize. For 902, however, the difference between the diagonal elements and the off-diagonal elements is enlarged, the former being significantly larger than the latter, which explains why the classification performance of the graph processing model GEN in the embodiment of the present invention is improved.
Recall that in the embodiments of FIG. 3 and FIG. 5, the estimated adjacency matrix Q in GEN is the posterior probability of the graph structure, where Q_ij indicates the confidence that the edge exists. To investigate the significance of the estimated edges, the relationship between the edge confidence Q_ij and a first quantity E_ij of data in the observation information indicating that an edge exists between node i and node j can be examined. Assuming that the total amount of data included in one piece of observation information is 3, E_ij ranges from 0 to 3. For each value of E_ij, the number of corresponding node pairs (one node pair comprising two nodes) is accumulated and the average edge confidence of these node pairs is calculated in fig. 9d. In FIG. 9d, 010 denotes the number of node pairs for each E_ij, and 011 denotes the average edge confidence for each E_ij. As can be seen in FIG. 9d, most node pairs have E_ij equal to 0, because the graph is sparse and most node pairs are unlikely to be observed as connected. From the relationship between the first quantity E_ij and the average edge confidence, it can be seen that observing only zero or one edge means Q_ij is low, so a single observation may be due to an error. However, there is a relatively sharp transition between E_ij = 1 and E_ij = 2, indicating that two or more observations of the same edge result in a larger Q_ij, which reflects a higher confidence that the edge exists in the optimal graph.
To show the edge confidence distribution, the edges can be divided into two groups: edges within the same community and edges between different communities. Normalized histograms of these edge confidences on the training set, validation set and test set of the trained GEN model are shown in fig. 9e. In fig. 9e, dotted white filled rectangles represent edges within the same community, and solid black filled rectangles represent edges between different communities. It can be observed from fig. 9e that the edge confidences of the same community concentrate in the last region (greater than 0.9), while those of different communities are biased towards the first region (less than 0.1). This indicates that GEN captures useful edge confidences, i.e., the edge confidence within the same community is higher.
In the above description, the GEN model is applied to static graphs; the GEN model can also be applied to dynamic graphs in the future. Intuitively, observation information can be built on different time slices, so GEN can infer the graph structure based on a series of historical interactions.
Based on the embodiment of the training method of the graph structure estimation model, the embodiment of the invention provides a training device of the graph structure estimation model. Referring to fig. 10, a schematic structural diagram of a training apparatus for a graph structure estimation model according to an embodiment of the present invention is provided. The training device of fig. 10 may operate as follows:
an obtaining unit 1001, configured to obtain an initial graph and tag information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph;
a processing unit 1002, configured to call a graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph, so as to obtain observation information corresponding to the initial graph;
the processing unit 1002 is further configured to invoke a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimated graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph;
the processing unit 1002 is further configured to optimize the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In one embodiment, the obtaining unit 1001 is further configured to: if the training ending event is not detected, acquiring observation information corresponding to the estimation graph obtained in the process of calling the graph prediction model to perform prediction processing on the estimation graph;
the processing unit 1002 is further configured to invoke the graph estimator to perform estimation processing based on the tag information and observation information corresponding to the estimated graph to obtain a new estimated graph, and invoke an optimized graph prediction model to perform prediction processing on the new estimated graph to obtain prediction information corresponding to the new estimated graph, where the prediction information corresponding to the new estimated graph is used to indicate a category to which each node in the new estimated graph belongs;
and updating the optimized graph prediction model based on the prediction information corresponding to the new estimation graph and the label information.
In one embodiment, the observation information includes prediction information corresponding to the initial graph and an observation graph set corresponding to the initial graph, the prediction information corresponding to the initial graph is used to indicate a category to which each node in the initial graph belongs, and the observation graph set includes the initial graph and a neighbor graph set corresponding to the initial graph; the graph prediction model includes a first convolution layer and a second convolution layer, and the processing unit 1002 executes the following steps when calling the graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain the observation information corresponding to the initial graph:
acquiring a node characteristic matrix of the initial graph, inputting the node characteristic matrix into the first convolutional layer, and performing convolution operation based on a first weight parameter corresponding to the first convolutional layer to obtain a node representation matrix corresponding to the first convolutional layer; inputting the node representation matrix corresponding to the first convolutional layer into the second convolutional layer to perform convolution operation based on a second weight parameter corresponding to the second convolutional layer to obtain a node representation matrix corresponding to the second convolutional layer;
normalizing the node expression matrix corresponding to the second convolution layer to obtain the prediction information corresponding to the initial graph; constructing a first neighbor graph based on the node representation matrix corresponding to the first convolutional layer, constructing a second neighbor graph based on the node representation matrix corresponding to the second convolutional layer, and constructing a target neighbor graph based on the node feature matrix of the initial graph; forming the first neighbor graph, the second neighbor graph, and the target neighbor graph into the set of neighbor graphs.
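The two-convolutional-layer flow above (a node representation matrix per layer, normalization of the second layer's output into prediction information, and neighbor graphs built from each representation) could be sketched as follows. The symmetric normalization and cosine-similarity kNN used here are common choices assumed for illustration, not details fixed by the text.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def gcn_forward(A, S, W1, W2):
    """Two-layer GCN: H1 = ReLU(Â S W1), H2 = Â H1 W2, Z = softmax(H2)."""
    A_norm = normalize_adj(A)
    H1 = np.maximum(A_norm @ S @ W1, 0.0)          # first-layer representation
    H2 = A_norm @ H1 @ W2                          # second-layer representation
    expH = np.exp(H2 - H2.max(axis=1, keepdims=True))
    Z = expH / expH.sum(axis=1, keepdims=True)     # normalized prediction info
    return H1, H2, Z

def knn_graph(H, k):
    """kNN neighbor graph from node representations (cosine similarity)."""
    norm = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)                 # exclude self-loops
    O = np.zeros_like(sim, dtype=int)
    idx = np.argsort(-sim, axis=1)[:, :k]          # k most similar nodes
    O[np.repeat(np.arange(H.shape[0]), k), idx.ravel()] = 1
    return np.maximum(O, O.T)                      # symmetrize
```

Under this sketch, `knn_graph(S, k)`, `knn_graph(H1, k)` and `knn_graph(H2, k)` would yield the target neighbor graph and the first and second neighbor graphs, respectively.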
In one embodiment, the graph estimator includes a structure sub-model and an observation sub-model, and when the graph estimator included in the graph structure estimation model is called to perform estimation processing based on the label information and the observation information to obtain an estimated graph, the processing unit 1002 performs the following steps:
calling the structure submodel to generate N candidate graphs based on the label information and the prediction information corresponding to the initial graph in the observation information corresponding to the initial graph, wherein N is an integer greater than or equal to 1; determining the generation probability corresponding to the corresponding candidate graph based on the first parameter of the structure submodel and the adjacency matrix corresponding to each candidate graph in the N candidate graphs;
calling the observation submodel to calculate the existence probability of the observation information corresponding to the corresponding candidate graph based on the second parameter of the observation submodel, the observation information corresponding to the initial graph and the adjacency matrix corresponding to each candidate graph; wherein, the observation information existence probability corresponding to the candidate map m is used for representing the probability of existence of the observation information when the candidate map m is taken as the estimation map, and m is greater than or equal to 1 and less than or equal to N; and carrying out graph estimation based on the generation probability corresponding to each candidate graph and the observation information existence probability corresponding to each candidate graph to obtain an estimated adjacency matrix, and generating the estimated graph according to the estimated adjacency matrix.
In one embodiment, the N candidate graphs include a first candidate graph, and when the observation submodel is invoked to calculate the existence probability of the observation information corresponding to the corresponding candidate graph based on the second parameter of the observation submodel, the observation information corresponding to the initial graph, and the adjacency matrix corresponding to each candidate graph, the processing unit 1002 performs the following steps:
acquiring the total amount of data included in the observation information, and determining an observation probability parameter according to the second parameter; acquiring a first quantity of data indicating that an edge exists between a node i and a node j in the observation information, and determining a second quantity of data indicating that an edge does not exist between the node i and the node j in the observation information according to the first quantity and the total quantity; the node i and the node j are any two nodes included in the first candidate graph, and i is smaller than j; calculating the edge correlation probability between the node i and the node j according to the observation probability parameter, the first quantity and the second quantity; and multiplying the edge correlation probabilities between every two nodes in the first candidate graph to obtain the existence probability of the observation information corresponding to the first candidate graph.
In one embodiment, the observation probability parameters include a first class parameter, a second class parameter, a third class parameter, and a fourth class parameter. The first class parameter refers to the probability that any data in the observation set indicates that an edge exists between node i and node j when an edge does exist between node i and node j; the second class parameter refers to the probability that any data in the observation set indicates that no edge exists between node i and node j when an edge does exist between node i and node j; the third class parameter refers to the probability that any data in the observation set indicates that an edge exists between node i and node j when no edge exists between node i and node j; and the fourth class parameter refers to the probability that any data in the observation set indicates that no edge exists between node i and node j when no edge exists between node i and node j.
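Writing α for the first class parameter (so the second class is 1 − α) and β for the third class parameter (so the fourth class is 1 − β), the existence probability of the observation information for a candidate graph multiplies a per-pair likelihood over all node pairs, as described above. A sketch, where E_ij is the first quantity and N the total amount of data in the observation information:

```python
import numpy as np

def edge_likelihood(E_ij, N, g_ij, alpha, beta):
    """Likelihood of observing E_ij 'edge present' votes out of N
    observations for the pair (i, j), given the candidate adjacency
    entry g_ij (0 or 1). alpha is the true-positive probability and
    beta the false-positive probability."""
    if g_ij == 1:
        return alpha ** E_ij * (1 - alpha) ** (N - E_ij)
    return beta ** E_ij * (1 - beta) ** (N - E_ij)

def observation_likelihood(E, N, G, alpha, beta):
    """Multiply the edge likelihoods over all pairs i < j, giving the
    existence probability of the observation information for candidate G."""
    n = G.shape[0]
    p = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            p *= edge_likelihood(E[i, j], N, G[i, j], alpha, beta)
    return p
```

This mirrors the steps of acquiring the total amount N, the first quantity E_ij, and the second quantity N − E_ij, then combining the per-pair probabilities by multiplication.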
In an embodiment, the N candidate maps include a first candidate map and a second candidate map, and when the processing unit 1002 performs map estimation based on the generation probability corresponding to each candidate map and the observation information existence probability corresponding to each candidate map to obtain the estimated adjacency matrix, the following steps are performed:
acquiring the probability of the first parameter and the probability of the second parameter; inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the first candidate graph and the observation information existence probability corresponding to the first candidate graph into a candidate probability calculation rule for operation to obtain a first candidate probability of the first candidate graph as an estimation graph;
inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the second candidate graph and the observation information existence probability corresponding to the second candidate graph into the candidate probability calculation rule for operation to obtain a second candidate probability of the second candidate graph as an estimation graph; and generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability and the adjacency matrix corresponding to the second candidate graph.
In one embodiment, the adjacency matrix corresponding to any candidate map comprises M rows and M columns, the adjacency matrix corresponding to any candidate map comprises M by M matrix elements, the estimated adjacency matrix comprises M rows and M columns, and the estimated adjacency matrix comprises M by M target matrix elements; the processing unit 1002 generates an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate map, the second candidate probability, and the adjacency matrix corresponding to the second candidate map. The following steps are carried out:
acquiring a first matrix element at the ith row and jth column position in the adjacency matrix corresponding to the first candidate graph, and acquiring a second matrix element at the ith row and jth column position in the adjacency matrix corresponding to the second candidate graph; multiplying the first matrix element by the first candidate probability to obtain a first operation result, and multiplying the second matrix element by the second candidate probability to obtain a second operation result; and adding the first operation result and the second operation result, the sum being used as the target matrix element at the ith row and jth column position in the estimated adjacency matrix.
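The element-wise combination above amounts to a probability-weighted sum of the candidate adjacency matrices. A sketch, generalized from two candidates to N (the normalization step is an assumption to keep the weights a distribution):

```python
import numpy as np

def estimated_adjacency(candidate_adjs, candidate_probs):
    """Q_ij = sum_m P(candidate m) * A^(m)_ij: each target matrix element
    is the candidate-probability-weighted sum of the corresponding
    candidate matrix elements."""
    probs = np.asarray(candidate_probs, dtype=float)
    probs = probs / probs.sum()  # assumed: normalize candidate probabilities
    Q = np.zeros_like(candidate_adjs[0], dtype=float)
    for A_m, p_m in zip(candidate_adjs, probs):
        Q += p_m * A_m
    return Q
```

With two candidates this reduces exactly to the first-operation-result plus second-operation-result construction described above.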
In one embodiment, the processing unit 1002 is further configured to:
optimizing the first parameter and the second parameter according to the first candidate probability and the second candidate probability; and updating the estimated adjacency matrix based on the optimized first parameter and the optimized second parameter.
In one embodiment, when the processing unit 1002 performs optimization processing on the first parameter and the second parameter according to the first candidate probability and the second candidate probability, the following steps are performed:
adding the first candidate probability and the second candidate probability to obtain a combined posterior probability expression of the first parameter and the second parameter; transforming the combined posterior probability expression by using a preset inequality;
performing derivation operation on the transformed joint posterior probability based on the first parameter to obtain an optimization equation corresponding to the first parameter, and performing derivation operation on the transformed joint posterior probability based on the second parameter to obtain an optimization equation corresponding to the second parameter; and solving the optimization equation corresponding to the first parameter to obtain the optimized first parameter, and solving the optimization equation corresponding to the second parameter to obtain the optimized second parameter.
In one embodiment, the processing unit 1002, when generating the estimation map according to the estimation adjacency matrix, performs the following steps:
carrying out sparsification processing on the estimated adjacency matrix, and extracting a target adjacency matrix from the estimated adjacency matrix; and generating the estimation graph according to the target adjacency matrix.
In one embodiment, when the estimated adjacency matrix is sparsified and a target adjacency matrix is extracted from the estimated adjacency matrix, the processing unit 1002 performs the following steps: traversing each target matrix element in the estimated adjacency matrix, and replacing matrix elements smaller than or equal to a threshold value in the estimated adjacency matrix with zero to obtain the target adjacency matrix.
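The sparsification step above can be sketched directly; `epsilon` corresponds to the threshold ε mentioned in the hyperparameter search:

```python
import numpy as np

def sparsify(Q, epsilon):
    """Replace entries of the estimated adjacency matrix Q that are
    smaller than or equal to epsilon with zero, keeping only the
    confident edges as the target adjacency matrix."""
    T = Q.copy()
    T[T <= epsilon] = 0.0
    return T
```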
In one embodiment, the prediction information corresponding to the initial graph includes a prediction value corresponding to each node in a plurality of nodes of the initial graph, the prediction value corresponding to any node is used for indicating a category to which the any node belongs, and the label information includes a target value used for indicating a category to which a target node belongs; the processing unit 1002 is further configured to:
obtaining a predicted value corresponding to the target node from the prediction information corresponding to the initial graph, and determining difference information between the predicted value corresponding to the target node and the target value included in the label information; constructing a loss function corresponding to the graph prediction model according to the difference information; updating the first weight parameter and the second weight parameter in the direction of decreasing the value of the loss function; and updating the prediction information corresponding to the initial graph and the neighbor graph set respectively based on the updated first weight parameter and the updated second weight parameter.
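The loss construction and weight update described above can be sketched as follows. This is a minimal illustration of training only on the labeled target nodes with softmax cross-entropy and plain gradient descent, not the patent's actual graph prediction model; all names, shapes, and values are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(pred, targets):
    # mean negative log-likelihood over the labeled target nodes only
    return -np.log(pred[np.arange(len(targets)), targets]).mean()

rng = np.random.default_rng(0)
reps = rng.normal(size=(4, 3))    # representations of 4 labeled target nodes
targets = np.array([0, 1, 1, 0])  # target values (classes) from the label info
weight = np.zeros((3, 2))         # weight parameter to be updated

initial_loss = cross_entropy(softmax(reps @ weight), targets)
for _ in range(100):
    pred = softmax(reps @ weight)
    grad = reps.T @ (pred - np.eye(2)[targets]) / len(targets)
    weight -= 0.5 * grad          # step in the direction that lowers the loss

final_loss = cross_entropy(softmax(reps @ weight), targets)
```

After the update, the new weight would be used to recompute both the prediction information and the neighbor graph set.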
In one embodiment, the obtaining unit 1001 is further configured to: acquiring a graph to be processed; the processing unit 1002 is further configured to: calling the optimized graph prediction model in the graph structure estimation model to perform prediction processing on the graph to be processed to obtain target observation information corresponding to the graph to be processed, wherein the target observation information comprises target prediction information corresponding to the graph to be processed; calling the graph estimator to perform estimation processing based on the target observation information to obtain a target estimation graph; and calling the graph prediction model to process the target estimation graph to obtain a prediction result corresponding to the target estimation graph, wherein the prediction result is used for indicating the category of each node in the graph to be processed.
According to an embodiment of the present invention, the steps involved in the training method of the graph structure estimation model shown in fig. 3 and 5 may be performed by units in the training apparatus of the graph structure estimation model shown in fig. 10. For example, step S301 shown in fig. 3 may be performed by the acquiring unit 1001 in the training device of the graph structure estimation model shown in fig. 10, and steps S302 to S304 may be performed by the processing unit 1002 in the training device of the graph structure estimation model shown in fig. 10; as another example, step S501 in the training method of the graph structure estimation model shown in fig. 5 may be performed by the obtaining unit 1001 in the training device of the graph structure estimation model shown in fig. 10, and steps S502 to S507 may be performed by the processing unit 1002 in the training device of the graph structure estimation model shown in fig. 10.
According to another embodiment of the present invention, the units in the training apparatus of the graph structure estimation model shown in fig. 10 may be combined, individually or collectively, into one or several other units, or one or more of them may be further split into multiple functionally smaller units; this can achieve the same operation without affecting the technical effect of the embodiment of the present invention. The units are divided based on logic functions; in practical application, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the training apparatus based on the graph structure estimation model may also include other units, and in practical applications these functions may be implemented with the assistance of other units and through the cooperation of multiple units.
According to another embodiment of the present invention, the training apparatus of the graph structure estimation model shown in fig. 10 may be constructed, and the training method of the graph structure estimation model according to an embodiment of the present invention implemented, by running a computer program (including program code) capable of executing the steps involved in the methods shown in fig. 3 and 5 on a general-purpose computing device, such as a computer, that includes a processing element such as a Central Processing Unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable storage medium, loaded into the above-described computing device via the computer-readable storage medium, and executed therein.
The embodiment of the invention provides a graph structure estimation model which comprises a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, the graph prediction model in the graph structure estimation model can be called to perform prediction processing on an initial graph to obtain observation information corresponding to the initial graph, wherein the observation information comprises prediction information corresponding to the initial graph; then, the graph estimator is called to perform estimation processing based on the label information and the observation information to obtain an estimation graph; further, the graph prediction model is called to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Through the above process, the graph prediction model is optimized not only based on the initial graph and the label information corresponding to the initial graph, but also based on the estimation graph. The estimation graph is obtained by the graph estimator through estimation based on the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is observed from the perspective of the graph prediction model, so the estimation graph matches the properties of the graph prediction model better than the initial graph does, and is thus more consistent with the properties of the graph structure estimation model. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thereby the accuracy of the graph structure estimation model.
Based on the above method and apparatus embodiments, an embodiment of the present invention further provides a training device for a graph structure estimation model; fig. 11 is a schematic structural diagram of the training device for the graph structure estimation model provided by an embodiment of the present invention. The training device shown in fig. 11 may comprise at least a processor 1101, an input interface 1102, an output interface 1103, and a computer storage medium 1104, which may be connected by a bus or in other ways.
The computer storage medium 1104 may be stored in a memory of the training device of the graph structure estimation model. The computer storage medium 1104 is adapted to store a computer program (including program code), and the processor 1101 is adapted to execute the computer program stored in the computer storage medium 1104. The processor 1101 (or CPU, Central Processing Unit) is the computing core and control core of the training device of the graph structure estimation model; it is adapted to implement one or more computer programs, and specifically to load and execute:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
The embodiment of the invention provides a graph structure estimation model which comprises a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, the graph prediction model in the graph structure estimation model can be called to perform prediction processing on an initial graph to obtain observation information corresponding to the initial graph, wherein the observation information comprises prediction information corresponding to the initial graph; then, the graph estimator is called to perform estimation processing based on the label information and the observation information to obtain an estimation graph; further, the graph prediction model is called to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Through the above process, the graph prediction model is optimized not only based on the initial graph and the label information corresponding to the initial graph, but also based on the estimation graph. The estimation graph is obtained by the graph estimator through estimation based on the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is observed from the perspective of the graph prediction model, so the estimation graph matches the properties of the graph prediction model better than the initial graph does, and is thus more consistent with the properties of the graph structure estimation model. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thereby the accuracy of the graph structure estimation model.
The embodiment of the invention also provides a computer storage medium (Memory), which is a memory device in the training device of the graph structure estimation model and is used for storing programs and data. It is understood that the computer storage medium herein may include a built-in storage medium of the training device of the graph structure estimation model, and may also include an extended storage medium supported by the training device. The computer storage medium provides a storage space that stores the operating system of the training device of the graph structure estimation model. In addition, one or more computer programs (including program code) suitable for being loaded and executed by the processor 1101 are also stored in the storage space. The computer storage medium may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one computer storage medium located remotely from the aforementioned processor.
In one embodiment, the computer storage medium may be loaded by the processor 1101 and executes a computer program stored in the computer storage medium to implement the corresponding steps of the training method described above with respect to the graph structure estimation model shown in fig. 3. In a specific implementation, the computer program in the computer storage medium is loaded by the processor 1101 and performs the following steps:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
In one embodiment, the processor 1101 is further configured to perform: if the training ending event is not detected, acquiring observation information corresponding to the estimation graph obtained in the process of calling the graph prediction model to perform prediction processing on the estimation graph; calling the graph estimator to perform estimation processing based on the label information and observation information corresponding to the estimation graph to obtain a new estimation graph, calling an optimized graph prediction model to perform prediction processing on the new estimation graph to obtain prediction information corresponding to the new estimation graph, wherein the prediction information corresponding to the new estimation graph is used for indicating the category to which each node in the new estimation graph belongs; and updating the optimized graph prediction model based on the prediction information corresponding to the new estimation graph and the label information.
In one embodiment, the observation information includes prediction information corresponding to the initial graph and an observation graph set corresponding to the initial graph, the prediction information corresponding to the initial graph is used for indicating a category to which each node in the initial graph belongs, and the observation graph set includes the initial graph and a neighbor graph set corresponding to the initial graph; the graph prediction model includes a first convolutional layer and a second convolutional layer; when the processor 1101 calls the graph prediction model included in the graph structure estimation model to perform prediction processing on the initial graph to obtain the observation information corresponding to the initial graph, the following steps are performed:
acquiring a node characteristic matrix of the initial graph, inputting the node characteristic matrix into the first convolutional layer, and performing convolution operation based on a first weight parameter corresponding to the first convolutional layer to obtain a node representation matrix corresponding to the first convolutional layer;
inputting the node representation matrix corresponding to the first convolutional layer into the second convolutional layer to perform convolution operation based on a second weight parameter corresponding to the second convolutional layer to obtain a node representation matrix corresponding to the second convolutional layer; normalizing the node expression matrix corresponding to the second convolution layer to obtain the prediction information corresponding to the initial graph; constructing a first neighbor graph based on the node representation matrix corresponding to the first convolutional layer, constructing a second neighbor graph based on the node representation matrix corresponding to the second convolutional layer, and constructing a target neighbor graph based on the node feature matrix of the initial graph; forming the first neighbor graph, the second neighbor graph, and the target neighbor graph into the set of neighbor graphs.
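The two-layer forward pass and the construction of neighbor graphs from each representation matrix can be sketched as follows. The sketch assumes a standard GCN-style symmetric normalization and a cosine-similarity kNN rule for building the neighbor graphs; the patent does not fix these concrete choices, so treat them, along with all names and sizes, as assumptions:

```python
import numpy as np

def gcn_layer(a_hat, h, w):
    # one graph convolution: propagate over the normalized adjacency,
    # transform with the layer's weight parameter, apply ReLU
    return np.maximum(a_hat @ h @ w, 0.0)

def knn_graph(h, k=1):
    # neighbor graph: link each node to its k most similar nodes in
    # representation space (cosine similarity), then symmetrize
    unit = h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)       # no self-edges
    g = np.zeros_like(sim)
    for i, nbrs in enumerate(np.argsort(-sim, axis=1)[:, :k]):
        g[i, nbrs] = 1.0
    return np.maximum(g, g.T)

rng = np.random.default_rng(1)
n_nodes, n_feats = 5, 4
adj = (rng.random((n_nodes, n_nodes)) < 0.4).astype(float)
adj = np.maximum(adj, adj.T)
np.fill_diagonal(adj, 1.0)                        # self-loops
d_inv_sqrt = np.diag(1.0 / np.sqrt(adj.sum(1)))
a_hat = d_inv_sqrt @ adj @ d_inv_sqrt             # symmetric normalization

x = rng.normal(size=(n_nodes, n_feats))           # node feature matrix
w1 = rng.normal(size=(n_feats, 8))                # first weight parameter
w2 = rng.normal(size=(8, 3))                      # second weight parameter
h1 = gcn_layer(a_hat, x, w1)                      # first convolutional layer
logits = a_hat @ h1 @ w2                          # second convolutional layer
shifted = logits - logits.max(axis=1, keepdims=True)
pred = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# target, first-layer, and second-layer neighbor graphs form the set
neighbor_graphs = [knn_graph(x), knn_graph(h1), knn_graph(logits)]
```

Here `pred` plays the role of the normalized prediction information, and `neighbor_graphs` the neighbor graph set built from the feature matrix and the two node representation matrices.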
In one embodiment, the graph estimator includes a structure submodel and an observation submodel; when the processor 1101 calls a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimated graph, the following steps are performed:
calling the structure submodel to generate N candidate graphs based on the label information and the prediction information corresponding to the initial graph in the observation information corresponding to the initial graph, wherein N is an integer greater than or equal to 1; determining the generation probability corresponding to the corresponding candidate graph based on the first parameter of the structure submodel and the adjacency matrix corresponding to each candidate graph in the N candidate graphs;
calling the observation submodel to calculate the existence probability of the observation information corresponding to the corresponding candidate graph based on the second parameter of the observation submodel, the observation information corresponding to the initial graph and the adjacency matrix corresponding to each candidate graph; wherein the observation information existence probability corresponding to the candidate graph m is used for representing the probability that the observation information exists when the candidate graph m is taken as the estimation graph, and m is greater than or equal to 1 and less than or equal to N; and carrying out graph estimation based on the generation probability corresponding to each candidate graph and the observation information existence probability corresponding to each candidate graph to obtain an estimated adjacency matrix, and generating the estimation graph according to the estimated adjacency matrix.
In one embodiment, the N candidate graphs include a first candidate graph, and the processor 1101, when calling the observation submodel to calculate the existence probability of the observation information corresponding to the corresponding candidate graph based on the second parameter of the observation submodel, the observation information corresponding to the initial graph, and the adjacency matrix corresponding to each candidate graph, performs the following steps:
acquiring the total amount of data included in the observation information, and determining an observation probability parameter according to the second parameter; acquiring a first quantity of data indicating that an edge exists between a node i and a node j in the observation information, and determining a second quantity of data indicating that an edge does not exist between the node i and the node j in the observation information according to the first quantity and the total quantity; the node i and the node j are any two nodes included in the first candidate graph, and i is smaller than j; calculating the edge correlation probability between the node i and the node j according to the observation probability parameter, the first quantity and the second quantity; and multiplying the edge correlation probabilities between every two nodes in the first candidate graph to obtain the existence probability of the observation information corresponding to the first candidate graph.
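The per-pair observation probability and its product over all node pairs might look as follows. Assuming the first/second and third/fourth class parameters are complementary (each conditional pair sums to one), two free parameters, here called `alpha` and `beta`, suffice; their values and all names are illustrative, not from the patent:

```python
def pair_observation_probability(has_edge, n_edge_obs, n_total_obs,
                                 alpha=0.9, beta=0.1):
    """Probability of the observations for one node pair (i, j), given
    whether the candidate graph places an edge there.
    alpha: P(an observation reports an edge | edge exists)     (assumed value)
    beta : P(an observation reports an edge | no edge exists)  (assumed value)
    The complements (1 - alpha, 1 - beta) cover observations reporting
    that no edge exists."""
    n_no_edge_obs = n_total_obs - n_edge_obs  # second quantity, from the total
    p_report_edge = alpha if has_edge else beta
    return (p_report_edge ** n_edge_obs) * \
           ((1.0 - p_report_edge) ** n_no_edge_obs)

def candidate_observation_probability(candidate_adj, edge_counts, n_total_obs):
    # multiply the per-pair probabilities over all node pairs i < j
    n = len(candidate_adj)
    prob = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            prob *= pair_observation_probability(
                candidate_adj[i][j] == 1, edge_counts[i][j], n_total_obs)
    return prob
```

A candidate graph whose edges agree with most of the observations then receives a higher observation information existence probability than one that contradicts them.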
In one embodiment, the observation probability parameters include a first class parameter, a second class parameter, a third class parameter, and a fourth class parameter; the first class parameter refers to the probability that any data in the observation set indicates that an edge exists between a node i and a node j when the edge exists between the node i and the node j; the second class parameter refers to the probability that any data in the observation set indicates that an edge does not exist between the node i and the node j when the edge exists between the node i and the node j; the third class parameter refers to the probability that any data in the observation set indicates that an edge exists between the node i and the node j when no edge exists between the node i and the node j; and the fourth class parameter refers to the probability that any data in the observation set indicates that an edge does not exist between the node i and the node j when no edge exists between the node i and the node j.
In one embodiment, the N candidate graphs include a first candidate graph and a second candidate graph, and the processor 1101 performs the following steps when performing graph estimation based on the generation probability corresponding to each candidate graph and the observation information existence probability corresponding to each candidate graph to obtain the estimated adjacency matrix:
acquiring the probability of the first parameter and the probability of the second parameter; inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the first candidate graph and the observation information existence probability corresponding to the first candidate graph into a candidate probability calculation rule for operation to obtain a first candidate probability of the first candidate graph as an estimation graph; inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the second candidate graph and the observation information existence probability corresponding to the second candidate graph into the candidate probability calculation rule for operation to obtain a second candidate probability of the second candidate graph as an estimation graph; and generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability and the adjacency matrix corresponding to the second candidate graph.
In one embodiment, the adjacency matrix corresponding to any candidate graph comprises M rows and M columns, i.e., M × M matrix elements, and the estimated adjacency matrix likewise comprises M rows and M columns, i.e., M × M target matrix elements; the processor 1101, when generating the estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph, performs the following steps:
acquiring a first matrix element at the ith row and jth column position in the adjacency matrix corresponding to the first candidate graph, and acquiring a second matrix element at the ith row and jth column position in the adjacency matrix corresponding to the second candidate graph; multiplying the first matrix element by the first candidate probability to obtain a first operation result, and multiplying the second matrix element by the second candidate probability to obtain a second operation result; and adding the first operation result and the second operation result, the sum being used as the target matrix element at the ith row and jth column position in the estimated adjacency matrix.
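The element-wise combination just described, weighting each candidate adjacency matrix by its candidate probability and summing position by position, can be sketched as follows (NumPy assumed; the matrices and probabilities are made-up examples):

```python
import numpy as np

def estimate_adjacency(candidates, candidate_probs):
    """Weight each candidate adjacency matrix by its (normalized) candidate
    probability of being the estimation graph, then sum element-wise, so the
    target element (i, j) is sum_k p_k * A_k[i, j]."""
    w = np.asarray(candidate_probs, dtype=float)
    w = w / w.sum()  # normalize so the candidate probabilities sum to 1
    return sum(wi * a for wi, a in zip(w, candidates))

a1 = np.array([[0.0, 1.0], [1.0, 0.0]])  # first candidate graph's adjacency
a2 = np.array([[0.0, 0.0], [0.0, 0.0]])  # second candidate graph's adjacency
est_adj = estimate_adjacency([a1, a2], [0.75, 0.25])
```

With these example values, the off-diagonal target elements are 0.75 * 1 + 0.25 * 0 = 0.75.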
In one embodiment, after generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph, the processor 1101 is further configured to:
optimizing the first parameter and the second parameter according to the first candidate probability and the second candidate probability; and updating the estimated adjacency matrix based on the optimized first parameter and the optimized second parameter.
In one embodiment, the processor 1101 performs the following steps when performing optimization processing on the first parameter and the second parameter according to the first candidate probability and the second candidate probability:
adding the first candidate probability and the second candidate probability to obtain a joint posterior probability expression of the first parameter and the second parameter; transforming the joint posterior probability expression by using a preset inequality; taking the derivative of the transformed joint posterior probability with respect to the first parameter to obtain an optimization equation corresponding to the first parameter, and taking the derivative of the transformed joint posterior probability with respect to the second parameter to obtain an optimization equation corresponding to the second parameter; and solving the optimization equation corresponding to the first parameter to obtain the optimized first parameter, and solving the optimization equation corresponding to the second parameter to obtain the optimized second parameter.
In one embodiment, the processor 1101, when generating the estimation graph according to the estimated adjacency matrix, performs the following steps:
carrying out sparsification processing on the estimated adjacency matrix, and extracting a target adjacency matrix from the estimated adjacency matrix; and generating the estimation graph according to the target adjacency matrix.
In one embodiment, when the processor 1101 performs the sparsification process on the estimated adjacency matrix and extracts the target adjacency matrix from the estimated adjacency matrix, the following steps are performed: traversing each target matrix element in the estimated adjacency matrix, and replacing matrix elements smaller than or equal to a threshold value in the estimated adjacency matrix with zeros to obtain the target adjacency matrix.
In one embodiment, the prediction information corresponding to the initial graph includes a prediction value corresponding to each node in the plurality of nodes of the initial graph, the prediction value corresponding to any node is used to indicate the category to which that node belongs, the label information includes a target value used to indicate the category to which a target node belongs, and after the first neighbor graph, the second neighbor graph, and the target neighbor graph are combined into the neighbor graph set, the processor 1101 is further configured to:
obtaining a predicted value corresponding to the target node from the prediction information corresponding to the initial graph, and determining difference information between the predicted value corresponding to the target node and the target value included in the label information; constructing a loss function corresponding to the graph prediction model according to the difference information; updating the first weight parameter and the second weight parameter in the direction of decreasing the value of the loss function; and updating the prediction information corresponding to the initial graph and the neighbor graph set respectively based on the updated first weight parameter and the updated second weight parameter.
In one embodiment, the processor 1101 is further configured to: obtaining a graph to be processed, calling the graph prediction model optimized in the graph structure estimation model to perform prediction processing on the graph to be processed, and obtaining target observation information corresponding to the graph to be processed, wherein the target observation information comprises target prediction information corresponding to the graph to be processed; calling the graph estimator to perform estimation processing based on the target observation information to obtain a target estimation graph; and calling the graph prediction model to process the target estimation graph to obtain a prediction result corresponding to the target estimation graph, wherein the prediction result is used for indicating the category of each node in the graph to be processed.
The embodiment of the invention provides a graph structure estimation model which comprises a graph prediction model and a graph estimator. In the process of optimizing the graph structure estimation model, the graph prediction model in the graph structure estimation model can be called to perform prediction processing on an initial graph to obtain observation information corresponding to the initial graph, wherein the observation information comprises prediction information corresponding to the initial graph; then, the graph estimator is called to perform estimation processing based on the label information and the observation information to obtain an estimation graph; further, the graph prediction model is called to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph; and the graph prediction model is optimized based on the prediction information corresponding to the estimation graph and the label information corresponding to the initial graph.
Through the above process, the graph prediction model is optimized not only based on the initial graph and the label information corresponding to the initial graph, but also based on the estimation graph. The estimation graph is obtained by the graph estimator through estimation based on the observation information produced when the graph prediction model predicts the initial graph. In other words, the estimation graph is observed from the perspective of the graph prediction model, so the estimation graph matches the properties of the graph prediction model better than the initial graph does, and is thus more consistent with the properties of the graph structure estimation model. Therefore, optimally training the graph prediction model based on the estimation graph can improve the accuracy of the graph prediction model, and thereby the accuracy of the graph structure estimation model.
According to an aspect of the present application, an embodiment of the present invention also provides a computer program product or a computer program, the computer program product comprising a computer program stored in a computer storage medium. The processor 1101 reads the computer program from the computer storage medium and executes it, so that the training device performs the training method of the graph structure estimation model shown in fig. 3 and 5, specifically:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph; calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph; calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph; and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.

Claims (15)

1. A method for training a graph structure estimation model, comprising:
acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph;
calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph;
calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph;
and optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information.
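For illustration only, the training round recited in claim 1 can be sketched as follows. The helper names `predict`, `estimate`, and `optimize` are hypothetical stand-ins for the graph prediction model, the graph estimator, and the optimization step; the claim does not prescribe any concrete API:

```python
def train_round(predict, estimate, optimize, initial_graph, labels):
    """One training round of the graph structure estimation model.

    predict  : graph -> observation/prediction information   (assumed signature)
    estimate : (labels, observations) -> estimation graph
    optimize : (predictions, labels) -> updates the prediction model
    """
    # Step 1: prediction processing on the initial graph yields observation information.
    observations = predict(initial_graph)
    # Step 2: the graph estimator builds an estimation graph from labels + observations.
    estimation_graph = estimate(labels, observations)
    # Step 3: prediction processing on the estimation graph.
    predictions = predict(estimation_graph)
    # Step 4: optimize the graph prediction model against the label information.
    optimize(predictions, labels)
    return estimation_graph, predictions
```

Claim 2 then repeats such a round, feeding the estimation graph's own observation information back into the graph estimator until a training ending event is detected.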
2. The method of claim 1, wherein after optimizing the graph prediction model based on the prediction information corresponding to the estimation graph and the label information, the method further comprises:
if the training ending event is not detected, acquiring observation information corresponding to the estimation graph obtained in the process of calling the graph prediction model to perform prediction processing on the estimation graph;
calling the graph estimator to perform estimation processing based on the label information and the observation information corresponding to the estimation graph to obtain a new estimation graph, and calling the optimized graph prediction model to perform prediction processing on the new estimation graph to obtain prediction information corresponding to the new estimation graph, wherein the prediction information corresponding to the new estimation graph is used for indicating the category to which each node in the new estimation graph belongs;
and updating the optimized graph prediction model based on the prediction information corresponding to the new estimation graph and the label information.
3. The method of claim 1, wherein the observation information comprises prediction information corresponding to the initial graph and an observation graph set corresponding to the initial graph, the prediction information corresponding to the initial graph is used for indicating the category to which each node in the initial graph belongs, the observation graph set comprises the initial graph and a neighbor graph set corresponding to the initial graph, and the graph prediction model comprises a first convolutional layer and a second convolutional layer; and the calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph comprises:
acquiring a node characteristic matrix of the initial graph, inputting the node characteristic matrix into the first convolutional layer, and performing convolution operation based on a first weight parameter corresponding to the first convolutional layer to obtain a node representation matrix corresponding to the first convolutional layer;
inputting the node representation matrix corresponding to the first convolutional layer into the second convolutional layer to perform convolution operation based on a second weight parameter corresponding to the second convolutional layer to obtain a node representation matrix corresponding to the second convolutional layer;
normalizing the node representation matrix corresponding to the second convolutional layer to obtain the prediction information corresponding to the initial graph;
constructing a first neighbor graph based on the node representation matrix corresponding to the first convolutional layer, constructing a second neighbor graph based on the node representation matrix corresponding to the second convolutional layer, and constructing a target neighbor graph based on the node feature matrix of the initial graph;
forming the first neighbor graph, the second neighbor graph, and the target neighbor graph into the set of neighbor graphs.
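A minimal numpy sketch of the prediction step of claim 3, assuming a two-layer graph convolutional network with a ReLU activation and a cosine-similarity k-nearest-neighbor rule for building neighbor graphs (the claim fixes neither the activation nor the similarity measure; both are illustrative choices here):

```python
import numpy as np

def softmax(Z):
    # Row-wise normalization of the second-layer representations.
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W1, W2):
    # A_hat: (normalized) adjacency matrix; X: node feature matrix.
    H1 = np.maximum(A_hat @ X @ W1, 0)   # node representation matrix, first layer
    H2 = A_hat @ H1 @ W2                 # node representation matrix, second layer
    return H1, H2, softmax(H2)           # normalized H2 = prediction information

def knn_graph(H, k):
    # Neighbor graph: link each node to its k most similar nodes.
    Hn = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
    S = Hn @ Hn.T                        # pairwise cosine similarity
    np.fill_diagonal(S, -np.inf)         # exclude self-loops
    A = np.zeros_like(S)
    for i, nbrs in enumerate(np.argsort(-S, axis=1)[:, :k]):
        A[i, nbrs] = 1.0
    return A
```

The observation graph set would then collect the initial graph together with `knn_graph(X, k)` (target neighbor graph), `knn_graph(H1, k)` (first neighbor graph), and `knn_graph(H2, k)` (second neighbor graph).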
4. The method of claim 3, wherein the graph estimator comprises a structure submodel and an observation submodel, and the calling a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph comprises:
calling the structure submodel to generate N candidate graphs based on the label information and the prediction information corresponding to the initial graph in the observation information corresponding to the initial graph, wherein N is an integer greater than or equal to 1; determining the generation probability corresponding to the corresponding candidate graph based on the first parameter of the structure submodel and the adjacency matrix corresponding to each candidate graph in the N candidate graphs;
calling the observation submodel to calculate the observation information existence probability corresponding to the respective candidate graph based on a second parameter of the observation submodel, the observation information corresponding to the initial graph and the adjacency matrix corresponding to each candidate graph; wherein the observation information existence probability corresponding to a candidate graph m is used for representing the probability that the observation information exists when the candidate graph m is taken as the estimation graph, and m is greater than or equal to 1 and less than or equal to N;
and carrying out graph estimation based on the generation probability corresponding to each candidate graph and the observation information existence probability corresponding to each candidate graph to obtain an estimated adjacency matrix, and generating the estimation graph according to the estimated adjacency matrix.
5. The method of claim 4, wherein the N candidate graphs include a first candidate graph, and the calling the observation submodel to calculate the observation information existence probability corresponding to the respective candidate graph based on the second parameter of the observation submodel, the observation information corresponding to the initial graph, and the adjacency matrix corresponding to each candidate graph comprises:
acquiring the total amount of data included in the observation information, and determining an observation probability parameter according to the second parameter;
acquiring a first quantity of data indicating that an edge exists between a node i and a node j in the observation information, and determining a second quantity of data indicating that an edge does not exist between the node i and the node j in the observation information according to the first quantity and the total quantity; the node i and the node j are any two nodes included in the first candidate graph, and i is smaller than j;
calculating the edge correlation probability between the node i and the node j according to the observation probability parameter, the first quantity and the second quantity;
and multiplying the edge correlation probabilities between every two nodes in the first candidate graph to obtain the existence probability of the observation information corresponding to the first candidate graph.
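A sketch of the per-pair computation in claims 5 and 6, done in log-space so the product over all node pairs stays numerically stable (a presentation choice; the claim states a plain product). Here `alpha` and `beta` stand for the observation probability parameters: the chance that an observation indicates an edge given that the edge exists, respectively does not exist:

```python
import numpy as np

def observation_log_prob(A_cand, E, num_obs, alpha, beta):
    """Log-probability that the observations exist given candidate graph A_cand.

    E[i, j] : first quantity -- observations indicating an edge between i and j
    num_obs : total amount of data in the observation information
    alpha   : P(observation shows edge | edge exists)
    beta    : P(observation shows edge | edge absent)
    """
    miss = num_obs - E                      # second quantity: "no edge" observations
    iu = np.triu_indices_from(A_cand, k=1)  # each pair (i, j) with i < j once
    a, e, m = A_cand[iu], E[iu], miss[iu]
    # Edge correlation log-probability for every pair, then sum (= log of product).
    log_p = (a * (e * np.log(alpha) + m * np.log(1 - alpha))
             + (1 - a) * (e * np.log(beta) + m * np.log(1 - beta)))
    return log_p.sum()
```

As expected, a candidate graph that agrees with the observations receives a higher probability than one that contradicts them.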
6. The method of claim 5, wherein the observation probability parameters include a first class of parameters, a second class of parameters, a third class of parameters, and a fourth class of parameters; the first class of parameters refers to the probability that any data in the observation information indicates that an edge exists between a node i and a node j when the edge exists between the node i and the node j; the second class of parameters refers to the probability that any data in the observation information indicates that an edge does not exist between the node i and the node j when the edge exists between the node i and the node j; the third class of parameters refers to the probability that any data in the observation information indicates that an edge exists between the node i and the node j when no edge exists between the node i and the node j; and the fourth class of parameters refers to the probability that any data in the observation information indicates that an edge does not exist between the node i and the node j when no edge exists between the node i and the node j.
7. The method of claim 4, wherein the N candidate graphs include a first candidate graph and a second candidate graph, and the performing the graph estimation based on the generation probability corresponding to each candidate graph and the observation information existence probability corresponding to each candidate graph to obtain the estimated adjacency matrix comprises:
acquiring the probability of the first parameter and the probability of the second parameter;
inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the first candidate graph and the observation information existence probability corresponding to the first candidate graph into a candidate probability calculation rule for operation to obtain a first candidate probability of the first candidate graph as an estimation graph;
inputting the probability of the first parameter, the probability of the second parameter, the generation probability corresponding to the second candidate graph and the observation information existence probability corresponding to the second candidate graph into the candidate probability calculation rule for operation to obtain a second candidate probability of the second candidate graph as an estimation graph;
and generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability and the adjacency matrix corresponding to the second candidate graph.
8. The method of claim 7, wherein the adjacency matrix corresponding to any candidate graph comprises M rows and M columns, the adjacency matrix corresponding to any candidate graph comprises M by M matrix elements, the estimated adjacency matrix comprises M rows and M columns, and the estimated adjacency matrix comprises M by M target matrix elements; and the generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph comprises:
acquiring a first matrix element at the ith row and jth column position in the adjacency matrix corresponding to the first candidate graph, and acquiring a second matrix element at the ith row and jth column position in the adjacency matrix corresponding to the second candidate graph;
multiplying the first matrix element by the first candidate probability to obtain a first operation result, and multiplying the second matrix element by the second candidate probability to obtain a second operation result;
and performing an addition operation on the first operation result and the second operation result, and taking the addition result as the target matrix element at the ith row and jth column position in the estimated adjacency matrix.
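The element-wise computation of claims 7 and 8 amounts to a probability-weighted average of the candidate adjacency matrices; a sketch (the function name and vectorized form are illustrative, not from the claims):

```python
import numpy as np

def estimated_adjacency(candidates, probs):
    """Estimated adjacency matrix: each target matrix element at (i, j) is the
    sum over candidate graphs of (candidate probability x candidate element)."""
    A_est = np.zeros_like(candidates[0], dtype=float)
    for A_m, q_m in zip(candidates, probs):
        A_est += q_m * A_m   # adds q_m * A_m[i, j] at every (i, j) position
    return A_est
```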
9. The method of claim 7, wherein after generating an estimated adjacency matrix based on the first candidate probability, the adjacency matrix corresponding to the first candidate graph, the second candidate probability, and the adjacency matrix corresponding to the second candidate graph, the method further comprises:
optimizing the first parameter and the second parameter according to the first candidate probability and the second candidate probability;
and updating the estimated adjacency matrix based on the optimized first parameter and the optimized second parameter.
10. The method of claim 9, wherein the optimizing the first parameter and the second parameter based on the first candidate probability and the second candidate probability comprises:
adding the first candidate probability and the second candidate probability to obtain a combined posterior probability expression of the first parameter and the second parameter;
transforming the combined posterior probability expression by using a preset inequality;
performing a derivative operation on the transformed joint posterior probability with respect to the first parameter to obtain an optimization equation corresponding to the first parameter, and performing a derivative operation on the transformed joint posterior probability with respect to the second parameter to obtain an optimization equation corresponding to the second parameter;
and solving the optimization equation corresponding to the first parameter to obtain the optimized first parameter, and solving the optimization equation corresponding to the second parameter to obtain the optimized second parameter.
11. The method of claim 4, wherein the generating the estimation graph according to the estimated adjacency matrix comprises:
carrying out sparsification processing on the estimated adjacency matrix, and extracting a target adjacency matrix from the estimated adjacency matrix;
and generating the estimation graph according to the target adjacency matrix.
12. The method of claim 11, wherein the sparsifying of the estimated adjacency matrix and extracting a target adjacency matrix from the estimated adjacency matrix comprises:
traversing each target matrix element in the estimated adjacency matrix, and replacing matrix elements smaller than or equal to a threshold value in the estimated adjacency matrix with zeros to obtain the target adjacency matrix.
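The sparsification of claim 12 is a simple elementwise threshold: every target matrix element at or below the threshold becomes zero, and elements above it are kept. A sketch:

```python
import numpy as np

def sparsify(A_est, threshold):
    """Extract the target adjacency matrix by zeroing out every element of the
    estimated adjacency matrix that is smaller than or equal to the threshold."""
    A_target = A_est.copy()            # leave the estimated matrix untouched
    A_target[A_target <= threshold] = 0.0
    return A_target
```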
13. The method of claim 3, wherein the prediction information corresponding to the initial graph comprises a predicted value corresponding to each node in the plurality of nodes of the initial graph, the predicted value corresponding to any node is used for indicating the category to which that node belongs, and the label information comprises a target value used for indicating the category to which the target node belongs; after the first neighbor graph, the second neighbor graph, and the target neighbor graph are formed into the neighbor graph set, the method further comprises:
obtaining the predicted value corresponding to the target node from the prediction information corresponding to the initial graph, and determining difference information between the predicted value corresponding to the target node and the target value included in the label information;
constructing a loss function corresponding to the graph prediction model according to the difference information;
updating the first weight parameter and the second weight parameter in a direction of decreasing the value of the penalty function;
and updating the prediction information corresponding to the initial graph and the neighbor graph set respectively based on the updated first weight parameter and the updated second weight parameter.
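The loss of claim 13 compares predicted values against target values over the labeled target nodes only. A minimal cross-entropy sketch — one common choice of "difference information", which the claim does not pin down:

```python
import numpy as np

def labeled_loss(pred_probs, target_values, target_mask):
    """Cross-entropy between predicted class probabilities and target values,
    computed over the target (labeled) nodes only.

    pred_probs    : (n, c) predicted probability per node and class
    target_values : (n,)   target class index per node
    target_mask   : (n,)   boolean, True where label information exists
    """
    p = pred_probs[target_mask, target_values[target_mask]]
    return float(-np.log(p + 1e-12).mean())  # small epsilon avoids log(0)
```

The first and second weight parameters would then be updated by gradient descent in the direction that decreases this value.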
14. The method of claim 1, wherein the method further comprises:
obtaining a graph to be processed, calling the graph prediction model optimized in the graph structure estimation model to perform prediction processing on the graph to be processed, and obtaining target observation information corresponding to the graph to be processed, wherein the target observation information comprises target prediction information corresponding to the graph to be processed;
calling the graph estimator to perform estimation processing based on the target observation information to obtain a target estimation graph;
and calling the graph prediction model to process the target estimation graph to obtain a prediction result corresponding to the target estimation graph, wherein the prediction result is used for indicating the category of each node in the graph to be processed.
15. A model processing apparatus, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring an initial graph and label information corresponding to the initial graph; the initial graph comprises a plurality of nodes, the label information is used for indicating the category of a target node in the initial graph, and the target node is any one or more of the plurality of nodes of the initial graph;
the processing unit is used for calling a graph prediction model included in a graph structure estimation model to perform prediction processing on the initial graph to obtain observation information corresponding to the initial graph, wherein the observation information includes prediction information corresponding to the initial graph, and the prediction information corresponding to the initial graph is used for indicating the category of each node in the initial graph;
the processing unit is further configured to invoke a graph estimator included in the graph structure estimation model to perform estimation processing based on the label information and the observation information to obtain an estimation graph; calling the graph prediction model to perform prediction processing on the estimation graph to obtain prediction information corresponding to the estimation graph, wherein the prediction information corresponding to the estimation graph is used for indicating the category of each node in the estimation graph;
the processing unit is further configured to optimize the graph prediction model based on prediction information corresponding to the estimation graph and the label information.
CN202011574363.8A 2020-12-25 2020-12-25 Method, device and equipment for training graph structure estimation model and storage medium Pending CN113515519A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011574363.8A CN113515519A (en) 2020-12-25 2020-12-25 Method, device and equipment for training graph structure estimation model and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011574363.8A CN113515519A (en) 2020-12-25 2020-12-25 Method, device and equipment for training graph structure estimation model and storage medium

Publications (1)

Publication Number Publication Date
CN113515519A true CN113515519A (en) 2021-10-19

Family

ID=78061004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011574363.8A Pending CN113515519A (en) 2020-12-25 2020-12-25 Method, device and equipment for training graph structure estimation model and storage medium

Country Status (1)

Country Link
CN (1) CN113515519A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048816A (en) * 2021-11-16 2022-02-15 中国人民解放军国防科技大学 Method, device and equipment for sampling graph neural network data and storage medium
CN114048816B (en) * 2021-11-16 2024-04-30 中国人民解放军国防科技大学 Method, device, equipment and storage medium for sampling data of graph neural network
CN116319110A (en) * 2023-05-24 2023-06-23 保定思齐智科信息科技有限公司 Data acquisition and management method for industrial multi-source heterogeneous time sequence data
CN116319110B (en) * 2023-05-24 2023-08-11 保定思齐智科信息科技有限公司 Data acquisition and management method for industrial multi-source heterogeneous time sequence data

Similar Documents

Publication Publication Date Title
WO2023000574A1 (en) Model training method, apparatus and device, and readable storage medium
US11860675B2 (en) Latent network summarization
US10482375B2 (en) Deep graph representation learning
Hernández-Orallo ROC curves for regression
Solomon et al. Soft maps between surfaces
US9536201B2 (en) Identifying associations in data and performing data analysis using a normalized highest mutual information score
CN112069398A (en) Information pushing method and device based on graph network
CN110659723B (en) Data processing method and device based on artificial intelligence, medium and electronic equipment
CN111932386B (en) User account determining method and device, information pushing method and device, and electronic equipment
Suganthan Structural pattern recognition using genetic algorithms
CN111310074A (en) Interest point label optimization method and device, electronic equipment and computer readable medium
CN115546525A (en) Multi-view clustering method and device, electronic equipment and storage medium
CN113516019B (en) Hyperspectral image unmixing method and device and electronic equipment
CN110717116B (en) Link prediction method and system of relational network, equipment and storage medium
CN113515519A (en) Method, device and equipment for training graph structure estimation model and storage medium
Weber et al. Characterizing complex networks with Forman-Ricci curvature and associated geometric flows
Huang et al. Predicting the structural evolution of networks by applying multivariate time series
CN111428741B (en) Network community discovery method and device, electronic equipment and readable storage medium
Dhaou et al. An evidential method for correcting noisy information in social network
CN117807237B (en) Paper classification method, device, equipment and medium based on multivariate data fusion
Deo et al. Combining Retrospective Approximation with Importance Sampling for Optimising Conditional Value at Risk
Luo et al. Simple iterative clustering on graphs for robust model fitting
US20240111807A1 (en) Embedding and Analyzing Multivariate Information in Graph Structures
CN117009817A (en) Information matching method and device based on attribute network
Hou et al. SARW: Similarity-Aware Random Walk for GCN

Legal Events

Date Code Title Description
PB01 Publication