CN111462088A - Data processing method, device, equipment and medium based on graph convolution neural network - Google Patents


Info

Publication number
CN111462088A
Authority
CN
China
Prior art keywords
node
graph
neural network
sampling
structure data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010251576.0A
Other languages
Chinese (zh)
Inventor
张�杰
徐倩
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202010251576.0A priority Critical patent/CN111462088A/en
Publication of CN111462088A publication Critical patent/CN111462088A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Abstract

The invention discloses a data processing method, device, equipment and medium based on a graph convolutional neural network, wherein the method comprises the following steps: acquiring first graph structure data to be processed, and inputting the first graph structure data into a graph convolutional neural network which is trained in advance; calling the graph convolutional neural network to perform a neighbor node sampling operation and a neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node; and calling the graph convolutional neural network to obtain a processing result of the first graph structure data based on each updated first node. The invention enables the graph convolutional neural network to mine the differences in the relationships between nodes, so that a more accurate processing result can be produced from the more fully mined information.

Description

Data processing method, device, equipment and medium based on graph convolution neural network
Technical Field
The invention relates to the field of artificial intelligence, in particular to a data processing method, a data processing device, data processing equipment and a data processing medium based on a graph convolution neural network.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are applied to the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech); however, the financial industry's demands for security and real-time performance also place higher requirements on these technologies.
In recent years, deep learning has started a technical revolution in the industrial and academic fields, and has promoted development in various directions such as image detection and recognition, natural language processing, and speech recognition. At present, how to apply deep learning techniques, especially widely-used convolutional neural networks, to complex and high-dimensional graph structure data (such as social networks and protein structure data) processing is a hot spot of current academic research and is also an interest point in the industry.
The neural message passing mechanism (Neural Message Passing), represented by the Graph Convolutional Network (GCN), is a promising breakthrough for extending the convolution operation to graph structure data. However, when the GCN performs neighborhood information aggregation, it aggregates the information of all neighbor nodes of a node in the graph structure data; when the neighbor nodes differ in importance, the information in the graph structure data cannot be accurately mined, which results in inaccurate processing results for the graph structure data.
Disclosure of Invention
The invention mainly aims to provide a data processing method, a data processing device, data processing equipment and a data processing medium based on a graph convolution neural network, and aims to solve the problem that the processing result of graph structure data is inaccurate because the conventional graph convolution neural network GCN cannot accurately mine information in the graph structure data.
In order to achieve the above object, the present invention provides a data processing method based on a graph convolutional neural network, which includes the following steps:
acquiring first graph structure data to be processed, and inputting the first graph structure data into a graph convolution neural network which is trained in advance;
calling the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and calling the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
Optionally, the step of invoking the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data, so as to update each first node includes:
calling the graph convolutional neural network to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the first graph structure data according to a permutation-and-combination mode so as to update each first node, wherein the permutation-and-combination mode specifies the order and number of the neighbor node sampling operations and the neighborhood information aggregation operations.
Optionally, when the permutation and combination mode is a combination mode of one-time neighbor node sampling operation and one-time neighborhood information aggregation, the step of calling the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data according to the permutation and combination mode to update each first node includes:
calling the graph convolution neural network to generate sampling probabilities respectively corresponding to the first nodes in the first graph structure data based on sampling parameters in the graph convolution neural network;
respectively taking the sampling probability corresponding to each first node as a first diagonal element corresponding to each first node in a first diagonal matrix;
multiplying the node feature matrix in the first graph structure data by the first diagonal matrix to perform neighbor node sampling on each first node to obtain a first sampling result;
calling the graph convolutional neural network to perform neighborhood information aggregation operation on each first node based on the first sampling result so as to update each first node.
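The diagonal-matrix sampling and aggregation steps above can be sketched in NumPy; the function name, shapes, and use of an unnormalized adjacency matrix are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def sample_and_aggregate(X, A, sample_prob):
    """Neighbor-node sampling via a first diagonal matrix, then aggregation.

    X: (n, d) node feature matrix from the first graph structure data.
    A: (n, n) adjacency matrix (unnormalized here, for simplicity).
    sample_prob: (n,) sampling probabilities generated from the sampling
    parameters; each becomes a diagonal element of the first diagonal matrix.
    """
    S = np.diag(sample_prob)   # first diagonal matrix of sampling probabilities
    sampled = S @ X            # scale each node's features by its probability
    return A @ sampled         # neighborhood information aggregation

X = np.ones((3, 2))
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = sample_and_aggregate(X, A, np.array([1.0, 0.5, 0.0]))
# node 0 aggregates only node 1, whose features were scaled by 0.5
```

Note how the zero sampling probability of node 2 removes its contribution entirely, while node 1's contribution is attenuated rather than dropped.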
Optionally, the step of invoking the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data, so as to update each first node includes:
calling the graph convolution neural network to perform multiple sets of neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data respectively based on each set of sampling parameters in the graph convolution neural network, and correspondingly obtaining each set of initial update nodes;
and information merging is carried out on the corresponding same nodes in each set of the initial update nodes to obtain each updated first node.
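A toy sketch of the information-merge step across multiple sampling sets; element-wise averaging is an assumption here, since the patent leaves the merge operation unspecified:

```python
import numpy as np

def merge_initial_update_nodes(update_sets):
    """Merge the corresponding nodes across K sets of initial update nodes,
    each set produced by its own group of sampling parameters.
    Element-wise averaging is one plausible merge operator."""
    return np.mean(np.stack(update_sets, axis=0), axis=0)

set_a = np.full((2, 3), 1.0)   # initial update nodes from sampling set A
set_b = np.full((2, 3), 3.0)   # initial update nodes from sampling set B
merged = merge_initial_update_nodes([set_a, set_b])
```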
Optionally, before the step of obtaining the graph structure data to be processed and inputting the graph structure data into the graph convolutional neural network trained in advance, the method further includes:
acquiring second graph structure data, and inputting the second graph structure data into a graph convolutional network to be trained;
calling the graph convolutional network to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data so as to update each second node;
calling the graph convolutional network to be trained to obtain an output result based on each updated second node;
and updating the graph convolutional network to be trained based on the output result and the label data corresponding to the second graph structure data, so as to train the graph convolutional network to be trained and obtain the trained graph convolutional neural network.
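The training update described above can be sketched as a single gradient step. The one-layer model `A @ X @ W`, the squared-error loss, and the omission of the sampling parameters are all simplifying assumptions for illustration:

```python
import numpy as np

def train_step(W, X, A, y, lr=0.1):
    """One update of a to-be-trained graph convolutional network.

    Hypothetical one-layer model out = A @ X @ W trained against label data y
    with a squared-error loss; a real network would also update its sampling
    parameters, omitted here for brevity."""
    out = A @ X @ W
    grad = (A @ X).T @ (out - y) / len(y)   # d(mean squared error)/dW (up to a factor of 2)
    return W - lr * grad

A = np.eye(2)                   # trivial graph: each node is its own neighborhood
X = np.eye(2)
y = np.array([[1.0], [0.0]])    # label data for the second graph structure data
W0 = np.zeros((2, 1))
W1 = train_step(W0, X, A, y)
loss0 = float(np.mean((A @ X @ W0 - y) ** 2))
loss1 = float(np.mean((A @ X @ W1 - y) ** 2))
# the loss decreases after the update step
```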
Optionally, the step of calling the graph convolutional network to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data to update each second node includes:
calling the graph convolutional network to be trained to generate, based on sampling parameters in the graph convolutional network to be trained and a differentiable operation, sampling values respectively corresponding to each second node in the second graph structure data;
respectively taking the sampling values corresponding to the second nodes as second diagonal elements corresponding to the second nodes in a second diagonal matrix;
multiplying the node feature matrix in the second graph structure data by the second diagonal matrix to perform neighbor node sampling on each second node to obtain a second sampling result;
and calling the graph convolutional network to be trained to perform neighborhood information aggregation operation on each second node based on the second sampling result so as to update each second node.
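The patent only requires that the sampling values come from a differentiable operation. One common concrete choice, assumed here purely for illustration, is the Gumbel-sigmoid (binary Concrete) relaxation, which produces near-binary sampling values that gradients can flow through:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_sigmoid(logits, temperature=0.5):
    """Differentiable relaxation of Bernoulli sampling: add Logistic(0, 1)
    noise to the logits, then squash with a temperature-scaled sigmoid.
    Lower temperatures push the values closer to {0, 1}."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=np.shape(logits))
    noise = np.log(u) - np.log(1.0 - u)   # Logistic(0, 1) sample
    return 1.0 / (1.0 + np.exp(-(np.asarray(logits) + noise) / temperature))

vals = gumbel_sigmoid(np.array([4.0, -4.0, 0.0]))
# vals are strictly inside (0, 1) and usable as second diagonal elements
```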
Optionally, after the step of calling the graph convolutional neural network to obtain a processing result of the first graph structure data based on each updated first node, the method further includes:
and acquiring the processing result of the user to be predicted from the processing result as a fraud possibility prediction result of the user to be predicted.
In order to achieve the above object, the present invention further provides a data processing apparatus based on a graph convolutional neural network, including:
the acquisition module is used for acquiring first graph structure data to be processed and inputting the first graph structure data into a graph convolution neural network which is trained in advance;
the operation module is used for calling the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and the processing module is used for calling the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
In order to achieve the above object, the present invention further provides data processing equipment based on a graph convolutional neural network, including: a memory, a processor, and a data processing program based on a graph convolutional neural network that is stored in the memory and operable on the processor, wherein the data processing program, when executed by the processor, implements the steps of the data processing method based on a graph convolutional neural network as described above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium, on which a data processing program based on a graph convolutional neural network is stored, wherein the data processing program, when executed by a processor, implements the steps of the data processing method based on a graph convolutional neural network as described above.
In the invention, when graph structure data to be processed is processed, the graph convolutional neural network is called to perform the neighbor node sampling operation and the neighborhood information aggregation operation on the graph structure data so as to update each node, and the graph convolutional neural network is then called to obtain the processing result of the graph structure data based on the updated first nodes. By adding a sampling mechanism to the graph convolutional neural network and sampling the neighbor nodes of each node, the neighborhood information aggregation operation treats sampled and non-sampled neighbor nodes differently. This realizes differentiated processing of neighbor nodes and mines the differences in the relationships between nodes; the updated nodes are then processed to obtain the final processing result, so that these differences are reflected in the processing result.
Drawings
FIG. 1 is a schematic diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data processing method based on a graph convolutional neural network according to a first embodiment of the present invention;
FIG. 3 is a block diagram of a data processing apparatus based on a graph convolutional neural network according to a preferred embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.
It should be noted that, in the embodiment of the present invention, the data processing device based on the graph convolutional neural network may be a smart phone, a personal computer, a server, and the like, and is not limited specifically herein.
As shown in fig. 1, the data processing equipment based on the graph convolutional neural network may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory), and may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the device architecture shown in fig. 1 does not constitute a limitation of a data processing device based on a graph-convolutional neural network, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a data processing program based on a graph convolution neural network. Among them, the operating system is a program that manages and controls the hardware and software resources of the device, supporting the operation of the data processing program based on the convolutional neural network and other software or programs.
In the device shown in fig. 1, the user interface 1003 is mainly used for data communication with a client; the network interface 1004 is mainly used for establishing a communication connection with a server; and the processor 1001 may be configured to call the data processing program based on the graph convolutional neural network stored in the memory 1005, and perform the following operations:
acquiring first graph structure data to be processed, and inputting the first graph structure data into a graph convolution neural network which is trained in advance;
calling the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and calling the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
Further, the step of invoking the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data to update each first node includes:
calling the graph convolutional neural network to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the first graph structure data according to a permutation-and-combination mode so as to update each first node, wherein the permutation-and-combination mode specifies the order and number of the neighbor node sampling operations and the neighborhood information aggregation operations.
Further, when the permutation and combination mode is a combination mode of one-time neighbor node sampling operation and one-time neighborhood information aggregation, the step of calling the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data according to the permutation and combination mode to update each first node includes:
calling the graph convolution neural network to generate sampling probabilities respectively corresponding to the first nodes in the first graph structure data based on sampling parameters in the graph convolution neural network;
respectively taking the sampling probability corresponding to each first node as a first diagonal element corresponding to each first node in a first diagonal matrix;
multiplying the node feature matrix in the first graph structure data by the first diagonal matrix to perform neighbor node sampling on each first node to obtain a first sampling result;
calling the graph convolutional neural network to perform neighborhood information aggregation operation on each first node based on the first sampling result so as to update each first node.
Further, the step of invoking the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data to update each first node includes:
calling the graph convolution neural network to perform multiple sets of neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data respectively based on each set of sampling parameters in the graph convolution neural network, and correspondingly obtaining each set of initial update nodes;
and information merging is carried out on the corresponding same nodes in each set of the initial update nodes to obtain each updated first node.
Further, before the step of obtaining the graph structure data to be processed and inputting the graph structure data into the graph convolutional neural network trained in advance, the processor 1001 may be further configured to call a data processing program based on the graph convolutional neural network stored in the memory 1005, and perform the following operations:
acquiring second graph structure data, and inputting the second graph structure data into a graph convolutional network to be trained;
calling the graph convolutional network to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data so as to update each second node;
calling the graph convolutional network to be trained to obtain an output result based on each updated second node;
and updating the graph convolutional network to be trained based on the output result and the label data corresponding to the second graph structure data, so as to train the graph convolutional network to be trained and obtain the trained graph convolutional neural network.
Further, the step of calling the graph convolutional network to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data so as to update each second node includes:
calling the graph convolutional network to be trained to generate, based on sampling parameters in the graph convolutional network to be trained and a differentiable operation, sampling values respectively corresponding to each second node in the second graph structure data;
respectively taking the sampling values corresponding to the second nodes as second diagonal elements corresponding to the second nodes in a second diagonal matrix;
multiplying the node feature matrix in the second graph structure data by the second diagonal matrix to perform neighbor node sampling on each second node to obtain a second sampling result;
and calling the graph convolutional network to be trained to perform neighborhood information aggregation operation on each second node based on the second sampling result so as to update each second node.
Further, after the step of calling the graph convolutional neural network to obtain the processing result of the first graph structure data based on the updated first nodes, the processor 1001 may be further configured to call a data processing program based on the graph convolutional neural network stored in the memory 1005, and perform the following operations:
and acquiring the processing result of the user to be predicted from the processing result as a fraud possibility prediction result of the user to be predicted.
Based on the above structure, various embodiments of the data processing method based on the graph convolution neural network are proposed.
Referring to fig. 2, fig. 2 is a schematic flow chart diagram of a data processing method based on a graph convolution neural network according to a first embodiment of the present invention.
While a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different than that shown. The executing subject of each embodiment of the data processing method based on the graph convolution neural network can be a device such as a smart phone, a personal computer, a server and the like, and for convenience of description, the executing subject is omitted in the following embodiments for explanation. In this embodiment, the data processing method based on the graph convolution neural network includes:
step S10, acquiring first graph structure data to be processed and inputting the first graph structure data into a graph convolution neural network trained in advance;
the graph convolution neural network is used for processing the graph structure data. Wherein, the Graph (Graph) is a nonlinear structure more complex than a tree structure, and is a data structure composed of a non-empty node set and a finite set of edges describing the relationship between nodes; the graph structure data includes information of nodes in the graph and information of edges, the information of the nodes may include characteristic information of the nodes, and the information of the edges may include nodes connected by the edges.
In this embodiment, a graph convolutional neural network may be trained in advance; its structure may add a sampling mechanism on the basis of an existing graph convolutional neural network. The sampling mechanism samples the neighbor nodes of each node, and the aggregation operation treats sampled and non-sampled neighbor nodes differently when aggregating neighborhood information, thereby realizing differentiated treatment of neighbor nodes. Specifically, in an existing graph convolutional neural network, when neighborhood information aggregation is performed on a node, the information of all of the node's neighbor nodes is aggregated indiscriminately. In this embodiment, however, neighbor node sampling may be performed on each node based on the sampling mechanism, that is, each neighbor node of the node may be sampled; when neighborhood information aggregation is performed on the node, the information of the sampled neighbor nodes is aggregated and the information of the non-sampled neighbor nodes is not. In addition, in the sampling mechanism the sampling of each neighbor node is independent, that is, each neighbor node has its own sampling probability, so that biased sampling is realized and the neighbor nodes are treated differently. By training the graph convolutional neural network, the sampling mechanism can learn how to distinguish neighbor nodes, so that important, highly relevant neighbor nodes are sampled with high probability and unimportant, weakly relevant neighbor nodes are sampled with low probability, correspondingly amplifying or suppressing the message flow of the node-neighbor relationships.
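The independent, biased sampling described above can be sketched as per-neighbor Bernoulli draws; the function name and the probability values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_neighbor_mask(neighbor_probs):
    """Independent, biased sampling of a node's neighbors: each neighbor is
    kept exactly when its learned sampling probability exceeds a fresh
    uniform random draw, so important neighbors (high probability) are
    sampled often and unimportant ones (low probability) rarely."""
    return neighbor_probs > rng.uniform(size=neighbor_probs.shape)

mask = sample_neighbor_mask(np.array([1.0, 0.9, 0.1, 0.0]))
# a neighbor with probability 1.0 is always kept; with 0.0, never
```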
It should be noted that training sample data may be selected based on the training task, and the training sample data is used to train a graph convolutional neural network for a specific prediction or classification task. For example, the training sample data may be the social network data of users, which may come from banks, e-commerce platforms, social network platforms, and the like; with such social network data, a graph convolutional neural network for predicting whether a user is a fraudulent user can be trained. The training method may adopt that of a common neural network model, and details are not described herein.
After the graph convolutional neural network is trained, it may be used to process graph structure data. Specifically, the first graph structure data to be processed may be obtained; for example, when the graph convolutional neural network is used for predicting whether a user is a fraudulent user, the obtained first graph structure data may be the social network data of a certain user to be predicted. The first graph structure data may include a node feature matrix representing the information of each node (hereinafter referred to as a first node): the rows of the node feature matrix represent the nodes, the columns represent the features, and each row holds the feature information of one node, so the matrix has n rows and d columns, where n is the number of nodes and d is the feature dimension. The first graph structure data may further include a matrix representing the relationships between the nodes, which may be an adjacency matrix or a matrix transformed from the adjacency matrix, such as a Laplacian matrix.
Step S20, calling the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and calling the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data, and updating the information of each first node after the two operations. The sampling operation of the neighbor node may be to generate a sampling probability of each neighbor node of a certain node according to a trained sampling parameter in the graph convolution neural network, where the sampling probability of the neighbor node is greater than a random number, indicating that the neighbor node is sampled, and if the sampling probability of the neighbor node is not greater than a random number, indicating that the neighbor node is not sampled. The neighborhood information aggregation operation may employ a neighborhood information aggregation operation in existing graph convolutional neural networks, e.g., a pre-multiplication of a adjacency matrix over a node feature matrix.
It should be noted that the order of the neighbor node sampling operation and the neighborhood information aggregation operation, and the number of times each operation is performed, are not limited. For example, the neighbor node sampling operation may be performed first and then the neighborhood information aggregation operation, or the neighborhood information aggregation operation may be performed first and then the neighbor node sampling operation, or multiple neighbor node sampling operations and multiple neighborhood information aggregation operations may be performed alternately.
In addition, the graph convolutional neural network may have a plurality of aggregation layers. For example, with two aggregation layers, whose sampling mechanisms may differ, the first aggregation layer may be called to perform the above neighbor node sampling operation and neighborhood information aggregation operation, then the second aggregation layer may be called to perform the neighbor node sampling operation and neighborhood information aggregation operation on the updated first nodes, and the finally updated first node information is input to the layer after the second aggregation layer. That is, when the graph convolutional neural network has a plurality of aggregation layers, the node information input to the first aggregation layer is the initial node information, and the first aggregation layer outputs node information aggregated with neighbor node information (neighborhood information); the second aggregation layer aggregates neighbor information based on the node information output by the first aggregation layer, so that after two aggregations an initial node has aggregated the information of its neighbors and of its neighbors' neighbors; each subsequent aggregation layer processes the node information output by the layer before it, that is, aggregation proceeds recursively.
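The recursive layer-stacking can be sketched as repeated pre-multiplication by the adjacency matrix; the per-layer sampling mechanism is omitted here, and the two-node graph is an illustrative assumption:

```python
import numpy as np

def recursive_aggregation(X, A, num_layers=2):
    """Recursive aggregation across stacked layers: each aggregation layer
    consumes the node information output by the previous layer, so after two
    layers a node has absorbed its neighbors' neighbors. Each layer could
    apply its own sampling mechanism; that is omitted in this sketch."""
    H = X
    for _ in range(num_layers):
        H = A @ H
    return H

A = np.array([[0., 1.],
              [1., 0.]])      # two nodes connected by one edge
X = np.array([[1., 0.],
              [0., 1.]])
H1 = recursive_aggregation(X, A, num_layers=1)   # one hop: features swap
H2 = recursive_aggregation(X, A, num_layers=2)   # two hops: back to start
```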
The graph convolution neural network obtained through training is used to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node. When neighborhood information is aggregated for a first node, the differences among its neighbor nodes are taken into account, so the neighbor nodes' information can be aggregated differentially; this fits the reality that, in practical application scenarios, neighbor nodes differ in importance, and an aggregation result based on these differences therefore yields a more accurate processing result. Specifically, a neighbor node that is important or closely related to the first node is sampled with higher probability, which amplifies the message path between the first node and that important neighbor, so the important neighbor's information enriches the first node's information more and the graph convolution neural network can better exploit network information to optimize the processing result. Conversely, a neighbor node that is unimportant or loosely related is sampled with relatively low probability, which suppresses the message path between the first node and that unimportant neighbor, so its information does not interfere with the first node's information; the network information is thus exploited while the influence of interfering information is avoided, further improving the accuracy of the processing result.
And step S30, calling the graph convolution neural network to obtain the processing result of the first graph structure data based on each updated first node.
After the information of each first node is updated, a processing result of the first graph structure data can be obtained based on each updated first node by using the graph convolutional neural network. Specifically, the aggregation layer of the graph convolution neural network may be connected to the output layer, or connected to another network layer and then connected to the output layer according to different specific training tasks, that is, the graph convolution neural network in this embodiment adopts a sampling mechanism in the aggregation layer, and other structures may adopt the structure of the existing graph convolution neural network. Then, after the aggregation layer updates the information of each node, the updated node information may be input to a network layer after the graph convolution network aggregation layer, and a subsequent network layer is invoked to process the information to obtain an output result, where the output result is a processing result of the first graph structure data.
In this embodiment, when the graph structure data to be processed is processed, the graph convolution neural network is invoked to perform the neighbor node sampling operation and the neighborhood information aggregation operation on the graph structure data so as to update the information of each node, and the graph convolution neural network is then invoked to obtain the processing result of the graph structure data based on the updated first nodes. By adding a sampling mechanism to the graph convolution neural network and performing the sampling operation on the neighbor nodes of each node, sampled and non-sampled neighbor nodes are treated differently during the neighborhood information aggregation operation; this realizes differential processing of neighbor nodes, mines the differences in the relationships between nodes, and processes the updated nodes based on these differences to obtain the final processing result, so that the differences are reflected in the processing result.
Further, the first graph structure data is social network data of the user to be predicted, and after the step S30, the method further includes:
step S40, obtaining the processing result of the user to be predicted from the processing result as the fraud possibility prediction result of the user to be predicted.
Further, in this embodiment, the training task may be to train a graph convolution neural network that predicts the likelihood of fraud for a user. The trained graph convolution neural network can then be used to predict whether a user is a fraudulent user. The first graph structure data to be processed may be the social network data of the user to be predicted; the social network data may be one piece of network structure data centered on the user to be predicted, including the user to be predicted and other users associated with that user. The users serve as nodes in the graph structure data, and the node information is each user's feature information, which may include basic portrait feature information and also information related to determining fraud likelihood, such as transaction feature information; edges in the graph structure data connect users that have social interactions. The social network data of the user to be predicted is input into the graph convolution neural network as graph structure data and processed according to the above steps to obtain a processing result, which may be a fraud likelihood prediction result for each node, i.e., each user, in the graph structure data; the fraud likelihood prediction result of the user to be predicted can then be selected from these. The fraud likelihood prediction result may be 0 or 1, where 0 indicates that the user to be predicted is not a fraudulent user and 1 indicates that the user to be predicted is a fraudulent user.
In this embodiment, the social network data of the user to be predicted is processed by the graph convolution neural network to obtain the fraud likelihood prediction result of the user to be predicted. Compared with prediction using only the personal information of the user to be predicted, more information can be mined from the richer social network data, so the obtained fraud likelihood prediction result is more accurate. Moreover, compared with the existing graph convolution neural network, because a sampling mechanism is introduced into the graph convolution neural network in this embodiment and the neighbor node sampling operation is performed on each node, the graph convolution neural network can mine the different relationships between the user to be predicted and different users and, based on the mined differences, utilize the information of neighbor users differentially, so the obtained fraud likelihood prediction result is more accurate.
Further, based on the first embodiment, the second embodiment of the data processing method based on the graph convolution neural network according to the present invention is proposed, and in this embodiment, the step S20 includes:
step S201, invoking the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data according to a permutation and combination mode to update each first node, where the permutation and combination mode is a permutation and combination mode of the neighbor node sampling operation and the neighborhood information aggregation operation.
In this embodiment, the graph convolution neural network may have a permutation and combination pattern, which is a permutation and combination pattern of the neighbor node sampling operation and the neighborhood information aggregation operation; within the pattern, each of the two operations may be performed once or many times, and their order may be set according to specific needs. For example, if a represents a neighbor node sampling operation and b represents a neighborhood information aggregation operation, the permutation and combination pattern may be ab, ba, aba, bab, aab, etc. The graph convolution neural network can be invoked to operate on each first node in the first graph structure data according to the permutation and combination pattern, that is, the two operations are performed in the order given by the pattern; for example, when the pattern is ab, the neighbor node sampling operation is performed on each first node first, followed by the neighborhood information aggregation operation. It should be noted that different permutation and combination patterns achieve different effects, so a pattern can be selected according to the actual application scenario.
Further, when the permutation and combination mode is a combination mode of one-time neighbor node sampling operation and one-time neighborhood information aggregation, the step S201 includes:
step S202, calling the graph convolution neural network to generate sampling probabilities respectively corresponding to each first node in the first graph structure data based on sampling parameters in the graph convolution neural network;
In this embodiment, Bernoulli sampling is performed on each node in the graph convolution neural network: whether each node is sampled is independent of the others, and each node corresponds to a sampling probability whose magnitude reflects the node's importance; the more a node is sampled into the aggregation process, the more important it is in the graph. The graph convolution neural network may include sampling parameters, each node may correspond to one or a set of sampling parameters, and a node's sampling probability may be generated from its sampling parameters and its own information; for example, the sampling probability Pi of node i may be generated by the following formula.
Pi = sigmoid((HW)i · s)

The value range of Pi is 0 to 1; s is a learnable parameter vector, namely the sampling parameter; W may be a linear transformation parameter in an aggregation layer of the graph convolution neural network, that is, parameter sharing may be adopted; and H is the node feature matrix. In the formula, (HW)i, the i-th row of HW, is dot-multiplied with s.
In the training process of the graph convolution neural network, the sampling parameters are learned to obtain different sampling probabilities based on the information of the nodes through training, so that the difference of the relationship between the nodes can be mined.
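A minimal NumPy sketch of how such per-node sampling probabilities could be produced, assuming the form Pi = sigmoid((HW)i · s) with hypothetical shapes and random stand-in parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sampling_probabilities(H, W, s):
    """P_i = sigmoid((HW)_i . s): one Bernoulli sampling probability per
    node, driven by the learnable sampling-parameter vector s and the
    aggregation layer's shared linear transform W."""
    return sigmoid((H @ W) @ s)

rng = np.random.default_rng(1)
H = rng.normal(size=(5, 8))     # node feature matrix, 5 nodes
W = rng.normal(size=(8, 4))     # shared linear-transform parameter
s = rng.normal(size=4)          # learnable sampling parameter
P = sampling_probabilities(H, W, s)
```

In training, s (and W) would be learned, so the probabilities come to reflect each node's importance.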
Step S203, taking the sampling probability corresponding to each first node as a first diagonal element corresponding to each first node in a first diagonal matrix;
The sampling probability corresponding to each first node is taken as the first diagonal element corresponding to that first node in the first diagonal matrix. The first diagonal matrix is a matrix in which all elements other than the diagonal elements are 0; the diagonal element in each row corresponds to a node, and the value of each diagonal element of the first diagonal matrix is obtained by taking the node's sampling probability as the value of its corresponding diagonal element.
Step S204, a node feature matrix in the first graph structure data is multiplied by the first diagonal matrix to carry out neighbor node sampling on each first node to obtain a first sampling result;
After the first diagonal matrix is obtained, to perform the neighbor node sampling operation on each first node, the node feature matrix of the first graph structure data may be pre-multiplied by the first diagonal matrix, and the resulting matrix is used as the first sampling result. According to the principle of matrix multiplication, in the product most of the information corresponding to nodes with high sampling probability is retained, while only a small part of the information corresponding to nodes with low sampling probability is retained, possibly none at all. The effect is that, during the neighborhood information aggregation operation, a node with high sampling probability passes most of its information to its neighbor nodes, that is, from its neighbors' perspective it is a sampled neighbor node; a node with low sampling probability passes only a small part of its information to its neighbor nodes, that is, from its neighbors' perspective it is a neighbor node that was not sampled. This achieves the effect of amplifying or suppressing the message flow between a node and its neighbor nodes.
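The effect of pre-multiplying the node feature matrix by the diagonal matrix of sampling probabilities can be seen in a tiny NumPy example (all values hypothetical):

```python
import numpy as np

H = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])            # node feature matrix, one row per node
P = np.array([0.9, 0.1, 0.5])       # per-node sampling probabilities
Q = np.diag(P)                      # the first diagonal matrix

# Pre-multiplying by Q scales each node's row by its sampling probability:
# node 0 keeps 90% of its outgoing information, node 1 only 10%.
sampled = Q @ H
```

The scaled rows are what the subsequent neighborhood aggregation sums over, so high-probability nodes dominate the messages their neighbors receive.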
Step S205, invoking the graph convolutional neural network to perform neighborhood information aggregation operation on each first node based on the first sampling result, so as to update each first node.
After the first sampling result is obtained, the graph convolution neural network can be called to perform the neighborhood information aggregation operation on the first nodes based on the first sampling result, so that the information of each first node is updated. Specifically, the neighborhood information aggregation operation may pre-multiply the first sampling result by the adjacency matrix of the graph structure data (or a transformation of the adjacency matrix), then perform a linear transformation using the linear transformation parameters, and then a nonlinear transformation, to obtain each updated first node. For example, when there are multiple aggregation layers, let X be the node feature matrix of the graph structure data, with H(0) = X; let Q be the diagonal matrix, A the adjacency matrix or a transformation of it, σ(·) a nonlinear activation function, and W the linear transformation parameter of an aggregation layer; then the node information can be generated recursively from H(l+1) = σ(AQH(l)W).
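A sketch of one such biased aggregation layer, assuming the recursion takes the form H(l+1) = σ(AQH(l)W) with ReLU standing in for σ (all matrices, shapes, and values hypothetical):

```python
import numpy as np

def biased_layer(A, Q, H, W):
    """One biased aggregation layer, H' = sigma(A Q H W): Q scales each
    node's outgoing features by its sampling probability before the
    neighborhood sum; sigma is taken as ReLU here."""
    return np.maximum(A @ Q @ H @ W, 0.0)

rng = np.random.default_rng(5)
A = np.ones((3, 3))                  # adjacency matrix (dense, for illustration)
Q = np.diag([0.9, 0.1, 0.5])         # diagonal matrix of sampling probabilities
X = rng.normal(size=(3, 4))          # node feature matrix, H(0) = X
W1 = rng.normal(size=(4, 4))
W2 = rng.normal(size=(4, 4))

# H(l+1) = sigma(A Q H(l) W), applied recursively over two layers
H1 = biased_layer(A, Q, X, W1)
H2 = biased_layer(A, Q, H1, W2)
```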
In this embodiment, a biased node sampling aggregation mode is implemented by generating sampling probabilities of each node, taking the sampling probabilities of each node as diagonal elements of a diagonal matrix, and multiplying a node feature matrix of graph structure data by the diagonal matrix, so that a graph convolution neural network can mine the difference of different nodes in graph structure data.
Further, according to the permutation and combination pattern, the A and Q to the left of H on the right-hand side of the formula H = σ(AQHW) are arranged following the pattern. For example, if the pattern is AQA, the formula becomes H = σ(AQAHW): the neighborhood information aggregation operation is first performed on the node feature matrix, the diagonal matrix is then pre-multiplied to perform the neighbor node sampling operation, the neighborhood information aggregation operation is performed again, and finally the linear transformation and the nonlinear transformation are applied.
Further, the step S20 includes:
step S201, calling the graph convolution neural network to perform multiple sets of neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data respectively based on each set of sampling parameters in the graph convolution neural network, and correspondingly obtaining each set of initial update nodes;
Further, in this embodiment, a plurality of sets of sampling parameters may be set in the graph convolution neural network, and a plurality of corresponding sets of linear transformation parameters W may also be set. The graph convolution neural network is called to perform, in parallel, the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the graph structure data based on each set of sampling parameters; after each set's neighbor node sampling operation and neighborhood information aggregation operation, the information of each first node is updated, yielding multiple sets of updated first nodes, which are called multiple sets of initial update nodes. The process of performing the neighbor node sampling operation and the neighborhood information aggregation operation with one set of sampling parameters may refer to the steps in this embodiment or the first embodiment, and is not described in detail here.
Step S202, information combination is carried out on the corresponding same nodes in each set of the initial update nodes, and each updated first node is obtained.
It can be understood that each set of the initial update nodes includes information of each node, and information of the corresponding same node in each set of the initial update nodes is merged, that is, for a certain node, information of the node in each set of the initial update nodes is merged to obtain the updated node. The information merging may be performed by vector splicing, and vector-form information of a certain node in each set of the initial update nodes is vector-spliced to obtain the updated node, for example, there are 10 sets of the initial update nodes, and each node in each set of the initial update nodes has 50-dimensional information, so that after the information merging is performed by the vector splicing, each node has 500-dimensional information.
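The merging by vector splicing can be illustrated with NumPy's concatenate, using the 10-set, 50-dimension example above (the random matrices are hypothetical stand-ins for the per-set update outputs):

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, dim, n_sets = 4, 50, 10

# One initial-update feature matrix per set of sampling parameters,
# stand-ins for the outputs of the per-set sampling + aggregation passes.
initial_updates = [rng.normal(size=(n_nodes, dim)) for _ in range(n_sets)]

# Vector splicing: concatenate each node's vectors across the sets,
# turning 10 blocks of 50-dimensional information into 500 dimensions.
merged = np.concatenate(initial_updates, axis=1)
```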
In this embodiment, a plurality of sets of sampling parameters are set, and the graph convolutional neural network is invoked to perform a plurality of sets of neighbor node sampling operations and neighborhood information aggregation operations based on the plurality of sets of sampling parameters, so that the graph convolutional neural network can mine more possibilities of relationships among nodes in graph structure data based on the plurality of sets of sampling parameters, and further improve the accuracy of data processing performed by the graph convolutional neural network.
Further, based on the first embodiment and the second embodiment, a third embodiment of the data processing method based on a graph convolution neural network according to the present invention is proposed, and in this embodiment, before the step S10, the method further includes:
step S50, acquiring second graph structure data, and inputting the second graph structure data into the convolutional network of the graph to be trained;
further, in this embodiment, the training process of the graph convolution neural network may specifically be to preset a graph convolution neural network to be trained (hereinafter referred to as a graph convolution network to be trained), and the structure of the graph convolution network to be trained may be set as an input layer, a hidden layer, and an output layer, where the hidden layer may include multiple aggregation layers, and the multiple aggregation layers may be connected in series or connected in parallel. The hidden layer may further include, according to a specific training task, other network layers that may be used in the graph convolutional neural network, which is not limited herein.
Second graph structure data for training the graph convolutional neural network is acquired; the second graph structure data differs according to the model training task. For example, when the model training task is to predict whether a user is a fraudulent user, the second graph structure data may be users' social network data. It should be noted that the second graph structure data may be a single piece of graph structure data; since training generally uses a large amount of graph structure data to achieve a good training effect, it may also be a plurality of pieces of graph structure data. For convenience, the following description uses one piece of graph structure data.
And inputting the second graph structure data into the convolutional network of the graph to be trained.
Step S60, calling the convolution network of the graph to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data so as to update each second node;
A sampling mechanism can be set in each aggregation layer of the graph convolution network to be trained, and the neighbor node sampling operation is performed on the nodes according to this mechanism. The graph convolution network to be trained is called to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each second node in the second graph structure data; after the two operations, the information of each second node is updated. Specifically, the neighbor node sampling operation may, for a given node, generate the sampling probability of each of its neighbor nodes according to the sampling parameters in the graph convolution network to be trained; if a neighbor node's sampling probability is greater than a random number, the neighbor node is sampled, otherwise it is not sampled. The neighborhood information aggregation operation may employ the neighborhood information aggregation operation of existing graph convolutional neural networks, e.g., pre-multiplying the node feature matrix by the adjacency matrix.
Step S70, calling the convolution network of the graph to be trained to obtain an output result based on each updated second node;
and after the information of each second node is updated, calling the convolution network to be trained to obtain an output result based on each updated second node. Specifically, the aggregation layer of the convolutional network to be trained may be connected to the output layer, or connected to another network layer and then connected to the output layer according to different training tasks. That is, in the embodiment, the convolutional network of the graph to be trained adopts a sampling mechanism in the aggregation layer, and other structures may adopt the structure of the existing graph convolutional neural network. Then, after the aggregation layer updates the information of each node, the updated node information can be input to a network layer behind the convolutional network aggregation layer of the graph to be trained, and a later network layer is called to process the information to obtain an output result.
Step S80, updating the convolutional network of the graph to be trained based on the output result and the label data corresponding to the second graph structure data, so as to train the convolutional network of the graph to be trained to obtain the graph convolutional neural network after training.
When the graph convolution network to be trained adopts a supervised training mode, the second graph structure data corresponds to one piece of label data, which is used to supervise the output result. For example, when the training task is to predict the likelihood of fraud for a user, the label data may be the true label of whether each user in the second graph structure data (social network data) is a fraudulent user. Specifically, a loss function of the graph convolution network to be trained can be constructed based on the output result and the label data of the second graph structure data, gradient values of all parameters in the network are calculated based on the loss function, and all parameters are then updated by gradient descent based on the gradient values, so that the graph convolution network to be trained is updated. After repeated iterative updating, the trained graph convolution neural network is obtained. It should be noted that the sampling parameters in the graph convolution network to be trained are also trained; through the training process, the sampling parameters learn how to distinguish the relationships between nodes, so that the graph convolution neural network can mine the differences between nodes.
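A highly simplified sketch of the supervised training loop described above, shrunk to a single aggregation pass and a logistic output layer with a hand-derived cross-entropy gradient (no sampling mechanism; data, shapes, and learning rate are all hypothetical):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(3)
n, d = 20, 6
A = np.eye(n) + (rng.random((n, n)) < 0.2)   # adjacency with self-loops
H = rng.normal(size=(n, d))                  # node feature matrix
y = (rng.random(n) < 0.5).astype(float)      # per-node fraud labels (0/1)

X = A @ H                                    # one aggregation pass
w = np.zeros(d)                              # trainable output weights
for _ in range(300):                         # gradient descent
    p = sigmoid(X @ w)                       # predicted fraud probability
    w -= 0.01 * X.T @ (p - y) / n            # gradient of the BCE loss

# binary cross-entropy between predictions and labels after training
loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
```

In the actual method, the gradients would also flow back through the sampling and aggregation parameters rather than just an output weight vector.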
Further, the step S60 includes:
Step S601, calling the graph convolution network to be trained, based on the sampling parameters in the graph convolution network to be trained, to generate sampling values respectively corresponding to each second node in the second graph structure data based on a differentiable operation;
Further, in this embodiment, corresponding to the usage process of the graph convolution neural network in the second embodiment, there may be a permutation and combination pattern in the graph convolution network to be trained, where the pattern is a permutation and combination of the neighbor node sampling operation and the neighborhood information aggregation operation; within the pattern, each of the two operations may be performed once or multiple times, and their order may be set according to specific needs. The graph convolution network to be trained can be called to operate on each second node in the second graph structure data according to the permutation and combination pattern, that is, the two operations are executed in the order given by the pattern.
When the permutation and combination pattern is a combination of one neighbor node sampling operation and one neighborhood information aggregation operation, Bernoulli sampling is performed on each node in the graph convolution neural network: whether each node is sampled is independent of the others, and each node corresponds to a sampling probability whose magnitude reflects the node's importance; the more a node is sampled into the aggregation process, the more important it is in the graph. The graph convolution neural network may include sampling parameters, each node may correspond to one or a set of sampling parameters, and a node's sampling probability may be generated from its sampling parameters and its own information; for example, the sampling probability Pi of node i may be generated by the following formula.
Pi = sigmoid((HW)i · s)
The neighbor node sampling operation could be performed directly according to the sampling probability, but because sampling from a probability is a draw from a distribution rather than a simple weight, the gradient cannot flow through the sampling step in the neural network, so the parameters could not be updated and the training of the neural network could not be completed. In this embodiment, the graph convolution network to be trained may therefore be called to generate, based on its sampling parameters and via a differentiable operation, the sampling values corresponding to the respective second nodes, and these sampling values are then used to perform the neighbor node sampling operation. The differentiable operation can be implemented with the Gumbel-Softmax technique, a reparameterization technique. In particular, since each node's sampling probability follows a Bernoulli distribution, i.e., a two-class distribution, the Gumbel-Sigmoid technique (a reparameterization technique) is used in practice. Following the Gumbel-Max trick (a method of sampling from discrete distributions), drawing a sample from the two-class distribution sigmoid(a) (sampled gives 1, not sampled gives 0) can be expressed as:
gumbel_sigm(a) = exp((a + gumbel1)/t) / [exp((a + gumbel1)/t) + exp(gumbel2/t)]
according to the definition of sigmoid (S-type growth curve), it can be further written as:
gumbel_sigm(a)=sigmoid([a+gumbel1-gumbel2]/t)
where t > 0 controls the fidelity of the above simulated sampling process: the closer t is to 0, the better the simulation (although numerical problems may arise, so in practice a moderate value such as 0.1 is appropriate). gumbel1 and gumbel2 are samples drawn independently from the Gumbel distribution Gumbel(0,1), i.e., gumbel1 (respectively gumbel2) = -log(-log(Uniform(0,1))).
That is, for two-distribution sigmoid (a), if the number _ sigm (a) tends to 1, the sample is decimated, tends to 0, and is not decimated. In conjunction with the above expression of Pi, we can obtain that the sampled value Qii of node i is:
Qii = sigmoid(([(HW)i · s] + gumbel1 - gumbel2)/t)
that is, the formula Qii is used to obtain the sampling value corresponding to each second node.
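A sketch of the Gumbel-Sigmoid sampling-value computation, combining the Pi logits with the gumbel_sigm expression above (the temperature, shapes, random parameters, and logit clipping for numerical stability are all choices of this sketch):

```python
import numpy as np

def sigmoid(x):
    # clipped for numerical stability at low temperatures
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

def gumbel_sigmoid(logits, t=0.1, rng=None):
    """Differentiable relaxation of Bernoulli sampling: perturb the logit
    with the difference of two Gumbel(0,1) samples, then squash with a
    temperature-scaled sigmoid; outputs approach 0 or 1 as t -> 0."""
    rng = rng if rng is not None else np.random.default_rng()
    u1 = rng.uniform(size=logits.shape)
    u2 = rng.uniform(size=logits.shape)
    g1 = -np.log(-np.log(u1 + 1e-20))   # Gumbel(0,1) samples
    g2 = -np.log(-np.log(u2 + 1e-20))
    return sigmoid((logits + g1 - g2) / t)

rng = np.random.default_rng(4)
H = rng.normal(size=(5, 8))          # node feature matrix
W = rng.normal(size=(8, 4))          # shared linear-transform parameter
s = rng.normal(size=4)               # learnable sampling parameter
logits = (H @ W) @ s                 # a_i = (HW)_i . s, the logit of P_i
Q_diag = gumbel_sigmoid(logits, t=0.1, rng=rng)   # sampling values Q_ii
```

Because every operation here is differentiable in the logits, gradients can flow back into s and W during training, which is the point of the reparameterization.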
Step S602, the sampling values corresponding to the second nodes are respectively used as second diagonal elements corresponding to the second nodes in a second diagonal matrix;
The sampling value corresponding to each second node is taken as the second diagonal element corresponding to that second node in the second diagonal matrix. The second diagonal matrix is a matrix in which all elements other than the diagonal elements are 0; the diagonal element in each row corresponds to a node, and the value of each diagonal element of the second diagonal matrix is obtained by taking the node's sampling value as the value of its corresponding diagonal element.
Step S603, pre-multiplying the node feature matrix in the second graph structure data by the second diagonal matrix to perform neighbor node sampling on each second node to obtain a second sampling result;
After the second diagonal matrix is obtained, to perform the neighbor node sampling operation on each second node, the node feature matrix of the second graph structure data may be pre-multiplied by the second diagonal matrix, and the resulting matrix is used as the second sampling result.
Step S604, calling the convolutional network of the graph to be trained to perform neighborhood information aggregation on each second node based on the second sampling result, so as to update each second node.
After the second sampling result is obtained, the graph convolution network to be trained can be called to perform the neighborhood information aggregation operation on the second nodes based on the second sampling result, so that the information of each second node is updated. Specifically, the neighborhood information aggregation operation may pre-multiply the second sampling result by the adjacency matrix of the second graph structure data (or a transformation of the adjacency matrix), then perform a linear transformation using the linear transformation parameters, and then a nonlinear transformation, to obtain each updated second node.
In this embodiment, a biased node sampling aggregation mode is implemented by generating sampling values of each node, taking the sampling values as diagonal elements of a diagonal matrix, and pre-multiplying the node feature matrix of the graph structure data by the diagonal matrix, so that the graph convolution network to be trained can learn, during training, the ability to mine the differences between different nodes in graph structure data. Moreover, the sampling values are generated by a differentiable operation, which avoids the problem that gradient information cannot be propagated through the sampling step in the neural network.
In this embodiment, when the trained graph convolution neural network is used, the sampling probability is used in place of the sampling value. The processing result therefore no longer varies from run to run because of information lost to sampling; that is, replacing the sampling value with the sampling probability makes the model-using process a deterministic process. On the other hand, the principle behind this is similar to that of neuron random inactivation (dropout) in deep learning: using the probability in place of a sampled value at inference time is equivalent to integrating the results of infinitely many sampling runs, which makes the processing result of the graph convolution neural network more robust.
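The train/inference asymmetry described above can be sketched as follows (function and variable names are assumptions; the analogy to dropout's test-time scaling is the point):

```python
import numpy as np

def sampling_diagonal(probs, training, rng=None):
    """Return the diagonal of the node-sampling matrix.

    probs: per-node sampling probabilities.
    training: if True, draw stochastic 0/1 sampling values; if False,
    use the probabilities themselves so the forward pass is deterministic
    (analogous to dropout using the keep probability at test time).
    """
    if training:
        rng = rng or np.random.default_rng()
        return (rng.uniform(size=probs.shape) < probs).astype(float)
    return probs

p = np.array([0.9, 0.1, 0.5])
eval_diag = sampling_diagonal(p, training=False)   # deterministic at inference
```

At inference the diagonal equals the probabilities, so repeated runs on the same input give identical results.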
In addition, the biased sampling and aggregation mode of this embodiment amplifies or suppresses the flow of messages along node-neighbor relationships. Because neighbor nodes are sampled during training, each training pass randomly selects whether a neighbor node is sampled, and hence whether its information is aggregated. This random selection and non-selection is similar in principle to neuron random inactivation in deep learning, and therefore reduces the risk of overfitting when training the graph convolution neural network.
In addition, an embodiment of the present invention further provides a data processing apparatus based on a graph convolution neural network. Referring to fig. 3, the data processing apparatus based on a graph convolution neural network includes:
the acquisition module 10 is configured to acquire first graph structure data to be processed and input the first graph structure data into a graph convolution neural network trained in advance;
an operation module 20, configured to invoke the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data, so as to update each first node;
and the processing module 30 is configured to invoke the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
Further, the operation module 20 is further configured to:
calling the graph convolution neural network to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the first graph structure data according to a permutation and combination mode to update each first node, wherein the permutation and combination mode specifies the order and number of the neighbor node sampling operations and neighborhood information aggregation operations.
Further, when the permutation combination mode is a combination mode of one-time neighbor node sampling operation and one-time neighborhood information aggregation, the operation module 20 includes:
the first generating unit is used for calling the graph convolution neural network to generate sampling probabilities respectively corresponding to each first node in the first graph structure data based on the sampling parameters in the graph convolution neural network;
a first determining unit, configured to use the sampling probabilities corresponding to the first nodes as first diagonal elements corresponding to the first nodes in a first diagonal matrix, respectively;
the first sampling unit is used for pre-multiplying the node feature matrix in the first graph structure data by the first diagonal matrix to perform neighbor node sampling on each first node to obtain a first sampling result;
and the first aggregation unit is used for calling the graph convolution neural network to perform neighborhood information aggregation operation on each first node based on the first sampling result so as to update each first node.
Further, the operation module 20 includes:
the operation unit is used for calling the graph convolution neural network to perform multiple sets of neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data respectively based on each set of sampling parameters in the graph convolution neural network, and correspondingly obtaining each set of initial update nodes;
and the merging unit is used for merging the information of the corresponding same nodes in each set of the initial update nodes to obtain each updated first node.
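For illustration, the multi-set sampling and merging performed by the operation unit and the merging unit can be sketched as below; averaging is only one possible merging rule (the patent says "information merging" without fixing it), and all names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_set_update(X, num_sets=3):
    """Run several independent biased-sampling passes and merge them.

    Each set draws its own per-node sampling values and scales the node
    feature matrix; the per-set results for the same node (each set of
    "initial update nodes") are then merged, here by averaging.
    """
    outputs = []
    for _ in range(num_sets):
        s = rng.uniform(size=X.shape[0])    # one set of sampling values
        outputs.append(np.diag(s) @ X)      # one set of initial update nodes
    return np.mean(outputs, axis=0)         # merge corresponding nodes

X = np.ones((4, 2))                         # 4 nodes, 2 features each
merged = multi_set_update(X)                # each updated first node
```

Merging several independently sampled passes smooths out the information lost by any single sampling run.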
Further, the obtaining module 10 is further configured to obtain second graph structure data and input the second graph structure data into a graph convolution network to be trained;
the operation module 20 is further configured to invoke the graph convolution network to be trained to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each second node in the second graph structure data, so as to update each second node;
the data processing device based on the graph convolution neural network further comprises:
the output module is used for calling the graph convolution network to be trained to obtain an output result based on each updated second node;
and the training module is used for updating the graph convolution network to be trained based on the output result and the label data corresponding to the second graph structure data, so as to train the graph convolution network to be trained and obtain the trained graph convolution neural network.
Further, the operation module 20 further includes:
the second generation unit, which is used for calling the graph convolution network to be trained to generate, based on the sampling parameters in the graph convolution network to be trained and based on a differentiable processing operation, sampling values respectively corresponding to each second node in the second graph structure data;
a second determining unit, configured to use the sampling values corresponding to the second nodes as second diagonal elements corresponding to the second nodes in a second diagonal matrix, respectively;
the second sampling unit is used for pre-multiplying the node feature matrix in the second graph structure data by the second diagonal matrix to perform neighbor node sampling on each second node to obtain a second sampling result;
and the second aggregation unit is used for calling the graph convolution network to be trained to perform the neighborhood information aggregation operation on each second node based on the second sampling result, so as to update each second node.
Further, the first graph structure data is social network data of a user to be predicted, and the data processing device based on the graph convolution neural network further comprises:
and the prediction module is used for acquiring the processing result of the user to be predicted from the processing result as the fraud possibility prediction result of the user to be predicted.
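A minimal sketch of this readout (purely illustrative; the layout of the processing result, the class-column convention, and all names are assumptions, not taken from the patent):

```python
import numpy as np

def fraud_score(processing_result, user_index):
    """Select the target user's row from the per-node processing result
    and return its fraud-class entry (column convention assumed)."""
    return float(processing_result[user_index, 1])   # column 1 = "fraud" class

# Hypothetical per-node class probabilities output by the network.
scores = np.array([[0.8, 0.2],
                   [0.3, 0.7]])
risk = fraud_score(scores, user_index=1)             # prediction for user 1
```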
The specific implementation of the data processing apparatus based on the graph convolution neural network of the present invention is basically the same as that of the embodiments of the data processing method based on the graph convolution neural network described above, and is not described herein again.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, on which a data processing program based on a graph convolution neural network is stored; when executed by a processor, the data processing program implements the steps of the data processing method based on a graph convolution neural network as described above.
For the embodiments of the data processing device based on the graph convolution neural network and the computer-readable storage medium of the present invention, reference may be made to the embodiments of the data processing method based on the graph convolution neural network of the present invention, and details are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A data processing method based on a graph convolution neural network is characterized by comprising the following steps:
acquiring first graph structure data to be processed, and inputting the first graph structure data into a graph convolution neural network which is trained in advance;
calling the graph convolutional neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and calling the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
2. The data processing method based on the graph convolution neural network as claimed in claim 1, wherein the step of invoking the graph convolution neural network to perform a neighbor node sampling operation and a neighborhood information aggregation operation on each first node in the first graph structure data to update each first node comprises:
calling the graph convolution neural network to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the first graph structure data according to a permutation and combination mode to update each first node, wherein the permutation and combination mode specifies the order and number of the neighbor node sampling operations and neighborhood information aggregation operations.
3. The data processing method based on the graph convolution neural network as claimed in claim 2, wherein when the permutation and combination mode is a combination mode of one-time neighbor node sampling operation and one-time neighborhood information aggregation, the step of invoking the graph convolution neural network to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each first node in the first graph structure data according to the permutation and combination mode to update each first node comprises:
calling the graph convolution neural network to generate sampling probabilities respectively corresponding to the first nodes in the first graph structure data based on sampling parameters in the graph convolution neural network;
respectively taking the sampling probability corresponding to each first node as a first diagonal element corresponding to each first node in a first diagonal matrix;
multiplying the node feature matrix in the first graph structure data by the first diagonal matrix to perform neighbor node sampling on each first node to obtain a first sampling result;
calling the graph convolutional neural network to perform neighborhood information aggregation operation on each first node based on the first sampling result so as to update each first node.
4. The data processing method based on the graph convolution neural network as claimed in claim 1, wherein the step of invoking the graph convolution neural network to perform a neighbor node sampling operation and a neighborhood information aggregation operation on each first node in the first graph structure data to update each first node comprises:
calling the graph convolution neural network to perform multiple sets of neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data respectively based on each set of sampling parameters in the graph convolution neural network, and correspondingly obtaining each set of initial update nodes;
and information merging is carried out on the corresponding same nodes in each set of the initial update nodes to obtain each updated first node.
5. The data processing method based on the graph convolution neural network as claimed in claim 1, wherein before the step of acquiring the first graph structure data to be processed and inputting the first graph structure data into the graph convolution neural network trained in advance, the method further comprises:
acquiring second graph structure data, and inputting the second graph structure data into a graph convolution network to be trained;
calling the graph convolution network to be trained to perform neighbor node sampling operation and neighborhood information aggregation operation on each second node in the second graph structure data so as to update each second node;
calling the graph convolution network to be trained to obtain an output result based on each updated second node;
and updating the graph convolution network to be trained based on the output result and the label data corresponding to the second graph structure data, so as to train the graph convolution network to be trained and obtain the trained graph convolution neural network.
6. The data processing method based on the graph convolution neural network as claimed in claim 5, wherein the step of invoking the graph convolution network to be trained to perform the neighbor node sampling operation and the neighborhood information aggregation operation on each second node in the second graph structure data to update each second node comprises:
calling the graph convolution network to be trained to generate, based on sampling parameters in the graph convolution network to be trained and based on a differentiable operation, sampling values respectively corresponding to each second node in the second graph structure data;
respectively taking the sampling values corresponding to the second nodes as second diagonal elements corresponding to the second nodes in a second diagonal matrix;
multiplying the node feature matrix in the second graph structure data by the second diagonal matrix to perform neighbor node sampling on each second node to obtain a second sampling result;
and calling the graph convolution network to be trained to perform neighborhood information aggregation operation on each second node based on the second sampling result so as to update each second node.
7. The data processing method based on the graph convolution neural network as claimed in any one of claims 1 to 6, wherein the first graph structure data is social network data of a user to be predicted, and after the step of invoking the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node, the method further comprises:
and acquiring the processing result of the user to be predicted from the processing result as a fraud possibility prediction result of the user to be predicted.
8. A data processing apparatus based on a graph convolution neural network, comprising:
the acquisition module is used for acquiring first graph structure data to be processed and inputting the first graph structure data into a graph convolution neural network which is trained in advance;
the operation module is used for calling the graph convolution neural network to perform neighbor node sampling operation and neighborhood information aggregation operation on each first node in the first graph structure data so as to update each first node;
and the processing module is used for calling the graph convolution neural network to obtain a processing result of the first graph structure data based on each updated first node.
9. A data processing device based on a graph convolution neural network, comprising: a memory, a processor, and a data processing program based on a graph convolution neural network that is stored on the memory and operable on the processor, wherein the data processing program, when executed by the processor, implements the steps of the data processing method based on a graph convolution neural network as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, on which a data processing program based on a graph convolution neural network is stored, wherein the data processing program, when executed by a processor, implements the steps of the data processing method based on a graph convolution neural network according to any one of claims 1 to 7.
CN202010251576.0A 2020-04-01 2020-04-01 Data processing method, device, equipment and medium based on graph convolution neural network Pending CN111462088A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010251576.0A CN111462088A (en) 2020-04-01 2020-04-01 Data processing method, device, equipment and medium based on graph convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010251576.0A CN111462088A (en) 2020-04-01 2020-04-01 Data processing method, device, equipment and medium based on graph convolution neural network

Publications (1)

Publication Number Publication Date
CN111462088A true CN111462088A (en) 2020-07-28

Family

ID=71679519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010251576.0A Pending CN111462088A (en) 2020-04-01 2020-04-01 Data processing method, device, equipment and medium based on graph convolution neural network

Country Status (1)

Country Link
CN (1) CN111462088A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396160A (en) * 2020-11-02 2021-02-23 北京大学 Transaction fraud detection method and system based on graph neural network
WO2022088408A1 (en) * 2020-11-02 2022-05-05 南京博雅区块链研究院有限公司 Graph neural network-based transaction fraud detection method and system
WO2022105374A1 (en) * 2020-11-20 2022-05-27 中兴通讯股份有限公司 Information processing method, method for generating and training module, electronic device, and medium
CN112633559A (en) * 2020-12-10 2021-04-09 中科人工智能创新技术研究院(青岛)有限公司 Social relationship prediction method and system based on dynamic graph convolutional neural network
CN112633559B (en) * 2020-12-10 2022-08-09 中科人工智能创新技术研究院(青岛)有限公司 Social relationship prediction method and system based on dynamic graph convolutional neural network
CN113011501A (en) * 2021-03-22 2021-06-22 广东海启星海洋科技有限公司 Method and device for predicting typhoon water level based on graph convolution neural network
CN113674207A (en) * 2021-07-21 2021-11-19 电子科技大学 Automatic PCB component positioning method based on graph convolution neural network
CN113674207B (en) * 2021-07-21 2023-04-07 电子科技大学 Automatic PCB component positioning method based on graph convolution neural network

Similar Documents

Publication Publication Date Title
CN111462088A (en) Data processing method, device, equipment and medium based on graph convolution neural network
Alboukaey et al. Dynamic behavior based churn prediction in mobile telecom
Cai et al. Learning features from enhanced function call graphs for Android malware detection
US10748066B2 (en) Projection neural networks
Ghorpade et al. Enhanced differential crossover and quantum particle swarm optimization for IoT applications
US11657289B2 (en) Computational graph optimization
CN116010684A (en) Article recommendation method, device and storage medium
CN112214775A (en) Injection type attack method and device for graph data, medium and electronic equipment
CN115797781A (en) Crop identification method and device, computer equipment and storage medium
Li et al. Collaborative edge computing for distributed cnn inference acceleration using receptive field-based segmentation
Santos et al. Blackout diffusion: generative diffusion models in discrete-state spaces
CN116383708B (en) Transaction account identification method and device
CN114445692B (en) Image recognition model construction method and device, computer equipment and storage medium
CN113490955A (en) System and method for generating a pyramid level architecture
CN116523001A (en) Method, device and computer equipment for constructing weak line identification model of power grid
CN116049691A (en) Model conversion method, device, electronic equipment and storage medium
CN106600053B (en) User attribute prediction system based on space-time trajectory and social network
US20220414067A1 (en) Autoregressive graph generation machine learning models
CN115712763A (en) Network data processing method and platform based on artificial intelligence and big data
CN114092269A (en) Time sequence data prediction method and device based on improved generalized network vector model
CN115983392A (en) Method, device, medium and electronic device for determining quantum program mapping relation
CN113469415B (en) Network flow prediction method and computer equipment
CN111598093B (en) Method, device, equipment and medium for generating structured information of characters in picture
CN116881122A (en) Test case generation method, device, equipment, storage medium and program product
CN112307227B (en) Data classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination