CN113822315A

CN113822315A - Attribute graph processing method and device, electronic equipment and readable storage medium

Info

Publication number: CN113822315A
Application number: CN202110671978.0A
Authority: CN
Inventors: 张文涛; 蒋杰; 盛则昂; 李晓森; 欧阳文; 陶阳宇; 杨智; 崔斌
Original assignee: Peking University; Shenzhen Tencent Computer Systems Co Ltd
Current assignee: Peking University; Shenzhen Tencent Computer Systems Co Ltd
Priority date: 2021-06-17
Filing date: 2021-06-17
Publication date: 2021-12-21

Abstract

The embodiment of the application provides a processing method and device of an attribute map, electronic equipment and a readable storage medium, and relates to the technical field of artificial intelligence and block chains. The method comprises the following steps: acquiring a training data set, wherein each sample in the training data set comprises an adjacency matrix of a sample attribute graph and initial feature representation of each node, and the initial feature representation of each node represents attribute information of the node; for each sample attribute graph, determining initial graph structure characteristics of the graph based on at least one item in the original information, and determining first similarity among nodes in the graph based on initial characteristic representation of the nodes in the graph; and repeatedly executing training operation on the initial graph convolution neural network model based on the training data set until the training loss value meets the training ending condition to obtain the trained graph convolution neural network model. The target feature representation obtained in the present application retains the original graph structure information and the similarity between the nodes.

Description

Attribute graph processing method and device, electronic equipment and readable storage medium

Technical Field

The application relates to the technical field of artificial intelligence and block chains, in particular to a processing method and device of an attribute map, an electronic device and a readable storage medium.

Background

At present, most attribute graph clustering methods are based on a graph convolution neural network, feature representation (which can be called as high-dimensional feature representation) with higher node dimensionality is compressed to feature representation (which can be called as low-dimensional feature representation) with lower dimensionality by a method for reconstructing an adjacency matrix, and finally, a traditional clustering algorithm (such as K-Means and K-Means clustering algorithm) is applied on the basis of the low-dimensional feature representation to generate a final clustering result.

However, since the graph structure of the attribute graph can be divided into an information part at a low level (lower level) and an information part at a high level (deeper level), the similarity information between nodes in the high-dimensional feature representation is not retained in the low-dimensional feature representation of the nodes in the prior art, and the finally obtained clustering result is inaccurate.

Disclosure of Invention

The application relates to a processing method and device of an attribute graph, an electronic device and a readable storage medium, which can enable the original graph structure information of the attribute graph and the similarity of the attribute information among nodes in the graph to be still kept in the obtained target feature representation.

In one aspect, an embodiment of the present application provides a method for processing an attribute map, where the method includes:

acquiring a training data set, wherein each sample in the training data set comprises original information of a sample attribute graph, the original information comprises an adjacency matrix of the sample attribute graph and initial feature representation of each node in the sample attribute graph, and the initial feature representation of one node represents the attribute information of the node;

for each sample attribute graph, determining initial graph structure characteristics of the graph based on at least one item in the original information, and determining first similarity among nodes in the graph based on initial characteristic representation of the nodes in the graph;

repeatedly executing the following training operations on the initial graph convolution neural network model based on the training data set until the training loss value meets the training ending condition to obtain a trained graph convolution neural network model:

for each sample attribute graph, inputting original information into a graph convolution neural network model to obtain target feature representation of each node in the graph, and determining the target graph structural feature of the graph and a second similarity between the nodes in the graph based on the target feature representation of each node in the graph;

for each sample attribute graph, determining a first difference between an initial graph structural feature and a target graph structural feature of the graph and a second difference between a first similarity and a second similarity between nodes in the graph;

and determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph, and if the training loss value does not meet a training end condition, adjusting model parameters of the graph convolution neural network.

On the other hand, an embodiment of the present application provides a method for processing an attribute map, including:

acquiring an attribute graph to be processed and attribute information of each node in the attribute graph to be processed;

determining an adjacency matrix of the attribute graph to be processed;

for each node, determining an initial feature representation of the node based on the attribute information of the node;

and outputting the initial characteristic representation of the adjacency matrix and each node to a graph convolution neural network model to obtain the target characteristic representation of each node, wherein the graph convolution neural network model is obtained by training based on the mode of training the graph convolution neural network model described in the foregoing.

Optionally, the method further includes:

and classifying the nodes in the attribute graph to be processed based on the target characteristic representation of each node to obtain the classification result of each node.

In another aspect, an embodiment of the present application provides an apparatus for processing an attribute map, where the apparatus includes:

the data acquisition module is used for acquiring a training data set, each sample in the training data set comprises original information of a sample attribute graph, the original information comprises an adjacency matrix of the sample attribute graph and initial feature representation of each node in the sample attribute graph, and the initial feature representation of one node represents the attribute information of the node;

the initial information determining module is used for determining the initial graph structure characteristics of the graph based on at least one item of the original information and determining the first similarity between the nodes in the graph based on the initial characteristic representation of the nodes in the graph for each sample attribute graph;

the model training module is used for repeatedly executing the following training operations on the initial graph convolution neural network model based on the training data set until the training loss value meets the training ending condition to obtain the trained graph convolution neural network model:

Optionally, for each sample attribute graph, the model training module is configured to, when inputting the original information into the graph convolution neural network model to obtain the target feature representation of each node in the graph, and based on the target feature representation of each node in the graph, specifically:

based on the original information, the following operations are executed through an initial graph neural network model to obtain target feature representation of each node in the graph:

obtaining a first graph structural feature of the graph corresponding to at least two levels based on the adjacency matrix of the graph;

for the first graph structural feature of each hierarchy, obtaining a first feature matrix of the graph corresponding to the hierarchy based on an initial feature matrix of the graph and the first graph structural feature of the hierarchy, wherein the initial feature matrix comprises initial feature representations of nodes of the graph, and the first feature matrix comprises first feature representations of the nodes of the graph;

for each node in the graph, fusing first feature representations of the nodes of all levels in at least two levels to obtain second feature representations of the nodes;

and extracting the target feature representation of each node in the graph based on the second feature representation of each node in the sample attribute graph.

Optionally, when the graph-based adjacency matrix obtains the first graph structural features of the graph corresponding to the at least two levels, the model training module is specifically configured to:

obtaining a degree matrix of the graph;

adding the adjacency matrix and the unit matrix based on the graph to obtain a processed adjacency matrix;

normalizing the processed adjacency matrix based on the degree matrix of the graph to obtain a normalized adjacency matrix;

and taking the normalized adjacency matrix as a first graph structural feature of one level of the graph, and performing at least one self-multiplication operation on the normalized adjacency matrix to obtain the first graph structural feature of the graph corresponding to at least one level, wherein the self-multiplication powers of each multiplication process are different.

Optionally, for each sample attribute graph, when determining the target graph structural feature of the graph based on the target feature representation of each node in the graph, the model training module is specifically configured to:

determining a third similarity between the nodes in the graph based on the target feature representation of the nodes in the graph;

and determining the target graph structural characteristics of the graph based on the third similarity between the nodes in the graph.

Optionally, for each sample attribute graph, when determining the initial graph structural feature of the graph based on at least one item of the original information, the model training module is specifically configured to:

determining an initial graph structure characteristic of the graph based on the adjacency matrix of the graph;

and extracting to obtain a third feature representation of each node in the graph through a pre-trained feature extraction model based on the original information, and determining the initial graph structure feature of the graph based on the third feature representation of each node.

Optionally, when determining the initial graph structure feature of the graph based on the third feature representation of each node, the model training module is specifically configured to:

determining a fourth similarity between the nodes in the graph based on the third feature representation between the nodes;

based on a fourth similarity between nodes in the graph, determining an initial graph structure characteristic of the graph.

Optionally, for each sample attribute graph, when determining the first similarity between the nodes based on the initial feature representation of each node in the graph, the initial information determining module is specifically configured to:

for every two nodes in the graph, determining a first joint distribution probability based on standard Gaussian distribution between the nodes based on the initial characteristic representation of the two nodes in the graph, wherein the first joint distribution probability represents a first similarity between the two nodes;

for each sample attribute graph, when determining the second similarity between the nodes in the graph based on the target feature representation of the nodes in the graph, the model training module is specifically configured to:

for each two nodes in the graph, a second joint distribution probability based on t-distribution random neighbor embedding between the two nodes is determined based on the target feature representation of the two nodes in the graph, and the second joint distribution probability characterizes a second similarity between the two nodes.

Optionally, after each preset number of training operations is performed, for each sample attribute graph, inputting original information into the graph convolution neural network model to obtain a target feature representation of each node in the graph, where the initial model training module is further configured to:

dividing each node in the graph into at least two categories based on the target feature representation of each node in the graph;

for each of at least two classes, determining intra-class differences between nodes belonging to the class based on the target feature representation of the nodes belonging to the class, and determining inter-class differences between the different classes based on the target feature representation of the nodes belonging to the class and the target feature representation of the nodes belonging to each of the other classes, the other classes being classes of the at least two classes other than the class;

the model training module is specifically configured to, when determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph:

and determining a training loss value of the graph convolution neural network model based on the first difference, the second difference, the intra-class difference and the inter-class difference corresponding to each sample attribute graph.

Optionally, for each of the at least two categories, the model training module is specifically configured to, when determining intra-class differences between nodes belonging to the categories based on the target feature representations of the nodes belonging to the categories, and determining inter-class differences between different categories based on the target feature representations of the nodes belonging to the categories and the target feature representations of the nodes belonging to each of the other categories, specifically:

determining a category characteristic representation of a category;

for each node belonging to the category, determining a third difference between the target feature representation of the node and the category feature representation of the category, and obtaining an intra-category difference between the nodes belonging to the category based on the third difference corresponding to each node belonging to the category;

and for each node belonging to the category, determining a fourth difference between the target characteristic representation of the node and the category characteristic identification of each category in other categories, and obtaining the inter-category difference between the category and each category in other categories based on the fourth difference corresponding to each node belonging to the category.

Optionally, the model training module is further configured to:

acquiring weights corresponding to the first difference, the second difference and the category difference respectively, wherein the category difference comprises an intra-category difference and an inter-category difference;

weighting the first difference, the second difference and the category difference based on the weights corresponding to the first difference, the second difference and the category difference respectively;

and summing the weighted first difference, the weighted second difference and the weighted category difference corresponding to each sample attribute map, and taking the sum as a training loss value.

the to-be-processed data acquisition module is used for acquiring the to-be-processed attribute graph and the attribute information of each node in the to-be-processed attribute graph;

the adjacency matrix determining module is used for determining an adjacency matrix of the attribute graph to be processed;

an initial feature representation determining module, configured to determine, for each node, an initial feature representation of the node based on the attribute information of the node;

and the target feature representation determining module is used for outputting the adjacency matrix and the initial feature representation of each node to a graph convolution neural network model to obtain the target feature representation of each node, wherein the graph convolution neural network model is obtained by training the graph convolution neural network model based on the mode of training the graph convolution neural network model described in the foregoing.

The apparatus also includes a classification module to:

In another aspect, an embodiment of the present application provides an electronic device, including a processor and a memory: the memory is configured to store a computer program which, when executed by the processor, causes the processor to perform the method of processing the property map as described above.

In still another aspect, an embodiment of the present application further provides a computer-readable storage medium for storing a computer program, which, when the computer program runs on a computer, causes the computer to execute the processing method of the property map.

A computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the processing method of the attribute map provided in any one of the alternative embodiments of the present application.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

in the embodiment of the present application, for a sample attribute graph, because the initial graph structure feature of the graph represents the original graph structure information of the graph, and the initial feature of each node in the graph represents the attribute information of the node, a first similarity between nodes in the graph represents the similarity of the attribute information between nodes in the graph, at this time, training of the model is constrained by a difference between the initial graph structure feature and the target graph structure feature of the sample attribute graph and a difference between the first similarity and the second similarity between nodes in the graph, so that when the target feature representation of each node in the attribute graph is predicted by using the finally trained graph convolution neural network model, the obtained target feature representation can still retain the similarity of the original graph structure information of the attribute graph and the attribute information between nodes in the graph, therefore, when classification processing is carried out based on the target feature representation of the nodes, the accuracy of the classification result can be effectively improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.

Fig. 1 is a schematic flowchart of a processing method of an attribute map according to an embodiment of the present application;

FIG. 2 is a schematic flow chart illustrating a training operation performed on an initial convolutional neural network model according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram illustrating a computing principle of a convolutional neural network according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an architecture of a training graph convolutional neural network provided in an embodiment of the present application;

FIG. 5 is a schematic diagram of a P profile and a Q profile provided by an embodiment of the present application;

fig. 6 is a schematic flowchart of another attribute map processing method according to an embodiment of the present application;

fig. 7 is a schematic structural diagram of a processing apparatus of an attribute map according to an embodiment of the present application;

fig. 8 is a schematic structural diagram of a processing apparatus of an attribute map according to an embodiment of the present application;

fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.

As used herein, the singular forms "a", "an", "the" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.

Several terms that may be referred to in this application are first introduced and explained:

1. attribute graph: each node has its own graph structure of attributes (also referred to as features).

2. Clustering: the nodes in the graph are divided into different clusters, so that the nodes in the clusters are similar as much as possible, and the nodes among the clusters are relatively different as much as possible.

3. Clustering the attribute graph: and combining the graph structure information of the attribute graph and the attribute information of the nodes in the attribute graph to perform clustering.

Ps (parameter server): and the parameter server is used for storing (updating) the super-large scale parameters in a distributed manner in the field of machine learning.

Angel: a high performance distributed machine learning platform is developed based on the Parameter Server (Parameter Server) concept.

Spark: a fast general purpose computing engine designed specifically for large scale data processing.

Spark on Angel (SONA): a high-performance distributed computing platform combines the powerful Angel parameter server function with the large-scale data processing capability of Spark, and supports traditional machine learning, deep learning and various graph algorithms.

At present, when clustering is performed on an attribute graph, most of attribute graphs are based on a graph convolution neural network, high-dimensional feature representation of original nodes is compressed to a low-dimensional space through a method of reconstructing an adjacency matrix, and finally a final clustering result is generated by applying a traditional clustering algorithm on the basis of low-dimensional vector representation. However, according to the present invention, at present, when clustering attribute maps, at least the following problems exist:

(1) since the graph structure of the attribute graph can be divided into a low-level information part and a high-level information part, the prior art can only utilize the graph structure information of 1 st order, namely which nodes are directly connected, and does not utilize the high-level graph structure information. For example, two nodes having similar neighbors should obviously have similar low-dimensional feature representations, but the existing method does not have a constraint that the low-dimensional feature representations of the two nodes are as similar as possible in the training process, so that the part of information is lost.

(2) In addition to graph structure information, the input of the attribute graph also includes feature representations of nodes, and high-dimensional feature representations of the nodes have higher probability that similar nodes belong to the same cluster, so that keeping the similarity information in the high-dimensional feature representations is also beneficial to clustering results. However, in the prior art, the processing of the similarity information in the high-dimensional feature representation is equivalent to selecting only a part of positive samples with higher confidence (namely, connected nodes, which can be called as "false" positive samples), but the method has two problems, on one hand, node pairs belonging to different clusters exist in the positive samples, and the training of the "false" positive samples can cause the final clustering effect to slide down; on the other hand, there should be differences between different positive samples, and it is obvious that there is a defect in this method, considering that the probability that all connected nodes belong to the same cluster is 1 in the prior art;

(3) in practical application, edges in the attribute graph can be divided into two types, one type is an edge in a cluster, the other type is an edge between clusters, and the feature representation of a node for clustering in the prior art is relatively generalized low-dimensional node representation, corresponding optimization is not performed on a clustering task, and due to the existence of a false positive sample, the boundary between clusters tends to be fuzzy, and the clustering result is inaccurate.

Based on this, embodiments of the present application provide an attribute map processing method, an attribute map processing apparatus, an electronic device, and a readable storage medium, which are intended to solve some or all technical problems in the prior art.

Optionally, in this embodiment of the application, the obtained initial graph neural network model may be trained to obtain a trained graph neural network, and a target feature representation of each node in the attribute graph to be processed is obtained based on the trained graph neural network. The method specifically relates to a machine learning technology in an artificial intelligence technology when an initial graph neural network model is trained.

Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.

Machine Learning (ML) is a multi-domain cross subject, and relates to multiple subjects such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. The special research on how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.

Optionally, the data processing/computing (e.g., computing for determining similarity between nodes) involved in the embodiment of the present application may be obtained based on a cloud computing manner. Cloud computing (cloud computing) is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, so that various application systems can acquire computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". Resources in the "cloud" appear to the user as being infinitely expandable and available at any time, available on demand, expandable at any time, and paid for on-demand.

As a basic capability provider of cloud computing, a cloud computing resource pool (called as a cloud Platform in general, an Infrastructure as a Service) Platform is established, and multiple types of virtual resources are deployed in the resource pool for selective use by external clients, the cloud computing resource pool mainly includes a computing device (including an operating system, for a virtualized machine), a storage device, and a network device, and is divided according to logical functions, a PaaS (Platform as a Service) layer may be deployed on an IaaS (Infrastructure as a Service) layer, a SaaS (Software as a Service) layer may be deployed on the PaaS layer, or the SaaS may be directly deployed on the IaaS layer, the PaaS may be a Platform running on Software, such as a web database, a container, and the like, as business Software of various websites, a web portal, and the like, SaaS and PaaS are upper layers relative to IaaS.

Optionally, in this embodiment of the present application, the training samples for the neural network model of the training graph may be stored based on a block chain. The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism and an encryption algorithm. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.

The block chain underlying platform can comprise processing modules such as user management, basic service, intelligent contract and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, and comprises public and private key generation maintenance (account management), key management, user real identity and blockchain address corresponding relation maintenance (authority management) and the like, and under the authorization condition, the user management module supervises and audits the transaction condition of certain real identities and provides rule configuration (wind control audit) of risk control; the basic service module is deployed on all block chain node equipment and used for verifying the validity of the service request, recording the service request to storage after consensus on the valid request is completed, for a new service request, the basic service firstly performs interface adaptation analysis and authentication processing (interface adaptation), then encrypts service information (consensus management) through a consensus algorithm, transmits the service information to a shared account (network communication) completely and consistently after encryption, and performs recording and storage; the intelligent contract module is responsible for registering and issuing contracts, triggering the contracts and executing the contracts, developers can define contract logics through a certain programming language, issue the contract logics to a block chain (contract registration), call keys or other event triggering and executing according to the logics of contract clauses, complete the contract logics and simultaneously provide the function of upgrading and canceling the contracts; the operation monitoring module is mainly responsible for deployment, configuration modification, contract setting, cloud adaptation in the product release process and visual output of real-time states in product operation, such as: alarm, monitoring network conditions, monitoring node equipment health status, and the like.

The platform product service layer provides basic capability and an implementation framework of typical application, and developers can complete block chain implementation of business logic based on the basic capability and the characteristics of the superposed business. The application service layer provides the application service based on the block chain scheme for the business participants to use.

The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Optionally, the method provided by the present application may be executed by any electronic device, for example, the electronic device may be a server or a terminal device. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.

Fig. 1 is a flowchart illustrating a method for processing an attribute map provided in an embodiment of the present application. As shown in fig. 1, the method includes:

step S101, a training data set is obtained, each sample in the training data set comprises original information of a sample attribute graph, the original information comprises an adjacency matrix of the sample attribute graph and initial feature representations of all nodes in the sample attribute graph, and the initial feature representation of one node represents attribute information of the node.

The training data set refers to a data set used for training an initial atlas neural network model, and the training data set includes a large number of training samples. In the embodiment of the present application, the original information of a sample attribute map is used as a sample in a training data set. The original information of a sample attribute graph may include an adjacency matrix of the sample attribute graph and initial feature representations of nodes in the sample attribute graph, where an initial feature representation of a node characterizes the attribute information of the node, and the initial feature representations of nodes in the sample attribute graph may form an initial feature matrix, where each row in the initial feature matrix may represent an initial feature representation (i.e., an initial feature vector) of a node, the number of rows of the matrix is equal to the total number of nodes in the sample attribute graph, and the number of columns is a feature dimension represented by the initial feature, or the number of columns of the matrix is equal to the total number of nodes, each column represents an initial feature representation of a node, and the number of rows is a feature dimension represented by the initial feature.

In general, the initial feature representation of each node in the attribute graph is usually a high-dimensional feature vector, and for subsequent graph processing tasks, such as classifying the nodes included in the graph, the initial feature vector of each node in the graph needs to be processed to generate a low-dimensional node representation (i.e., a target feature representation). It should be noted that the above-mentioned high dimension and low dimension are relative concepts, for example, the dimension of the initial feature representation and the target feature representation of a node may be feature vectors of different orders of magnitude, for example, the feature dimension of the initial feature representation is tens of dimensions (i.e., the feature vector is a vector containing tens of feature values), and the feature dimension of the target feature representation is several dimensions (i.e., the feature vector is a vector containing several feature values).

For an attribute graph, a one-dimensional array may be used to store all vertex data in the graph, and a two-dimensional array may be used to store data of relationships (edges or arcs) between vertices, where the two-dimensional array obtained at this time is called as an adjacency matrix of the attribute graph, that is, a many-to-many nonlinear structure of the graph is expressed in a matrix manner, and specifically represents a connection situation of nodes in the graph. It will be appreciated that the meaning of the nodes and edges between the nodes in the sample property graph may well be different for different application scenarios. For example, for a social application, the sample attribute graph may be a graph structure corresponding to a user in the application, each node in the graph corresponds to a user of the application, if there is an association relationship between two users (e.g., information interaction is performed, such as information is sent to each other), there is a connecting edge between nodes corresponding to two users, and the attribute information corresponding to each node may be attribute information of the user corresponding to the node, such as age, gender, and preference of the user.

Step S102, for each sample attribute graph, determining initial graph structure characteristics of the graph based on at least one item in the original information, and determining first similarity between nodes in the graph based on initial characteristic representation of the nodes in the graph.

A graph structure is a more complex nonlinear structure (i.e., a discrete structure) than a tree structure, and the basic composition of the graph includes nodes in the graph and connected edges between the nodes. For a graph, a graph structural feature (i.e., a structural feature of a graph) refers to a feature vector that can characterize the overall structure of the graph. Since the initial feature representation of each node in the graph represents the attribute information of each node in the graph, and the adjacency matrix of the graph is obtained based on the attribute information of each node in the graph and the connection relationship between the nodes, for a sample attribute graph, the initial graph structure feature of the graph can be determined based on at least one of the adjacency matrix of the sample attribute graph or the initial feature representation of each node in the graph.

For a sample attribute graph, a specific manner of determining a first similarity between nodes based on an initial feature representation of each node in the graph is not limited in the embodiments of the present application. For example, for any two nodes in the graph, the similarity between the two nodes can be determined by calculating the distance (e.g., euclidean distance) between the initial feature representations of the two nodes.

And step S103, repeatedly executing the following training operation on the initial graph convolution neural network model based on the training data set until the training loss value meets the training ending condition to obtain the trained graph convolution neural network model.

The above-mentioned training operation repeatedly performed on the initial convolutional neural network model is shown in fig. 2, and may include steps S201 to S203:

step S201, inputting original information into the graph convolution neural network model for each sample attribute graph to obtain target feature representation of each node in the graph, and determining the target graph structure feature of the graph and the second similarity between the nodes in the graph based on the target feature representation of each node in the graph.

In step S202, for each sample attribute graph, a first difference between the initial graph structural feature and the target graph structural feature of the graph and a second difference between a first similarity and a second similarity between nodes in the graph are determined.

Step S203, determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph, and if the training loss value does not meet the training end condition, adjusting model parameters of the graph convolution neural network.

After the training data set is obtained, each sample in the training data set (i.e., the original information of each sample attribute graph) may be input into the graph convolution neural network model, and the graph convolution neural network model may output a target feature representation of each node in each sample attribute graph. The target feature representation is a feature vector of each node obtained through model prediction, wherein the dimension of the target feature representation of one node is smaller than that of the initial feature representation of the node.

For each sample attribute graph, after obtaining the target feature representation of each node in the graph, the target graph structure feature of the graph may be determined based on the target feature representation of each node in the graph, and the similarity (i.e., the second similarity) between each node in the graph may be determined based on the target feature representation of each node in the graph.

For a sample attribute graph, the initial graph structure features of the graph characterize the original graph structure information of the graph, and the initial features of each node in the graph represent the attribute information of the node, so the first similarity between the nodes in the graph characterizes the similarity of the attribute information between the nodes in the graph. In order to enable the finally trained graph convolution neural network model to still retain the original graph structure information of the attribute graph and the similarity of the attribute information between the nodes in the graph when predicting the target feature representation of the nodes in the attribute graph, the training of the model can be constrained by the difference between the initial graph structure feature and the target graph structure feature of the sample attribute graph and the difference between the first similarity and the second similarity between the nodes in the graph.

The training end condition may be preset according to an actual requirement, for example, the training loss value may be set to be smaller than a preset threshold, and the like. After the training loss value of the graph convolution neural network is obtained through calculation based on the first difference and the second difference corresponding to all the sample attribute graphs, if the training loss value does not meet the training ending condition, the model parameter of the current graph convolution neural network does not meet the requirement, the graph convolution neural network still needs to be trained continuously, the model parameter of the graph convolution neural network can be adjusted, and the training operation is continuously repeated on the graph convolution neural network after the model parameter is adjusted based on the training data set until the training loss value meets the training ending condition.

In practical applications, different application scenarios and application requirements may also have different requirements on model performance of the graph convolution neural network model, for example, for the above two layers of similarity between graph structure information and attribute information between nodes in the graph, some scenarios may focus on the graph structure information more, and some scenarios may focus on the similarity between the attribute information between the nodes in the graph more. To better meet these application requirements, different levels of training loss may be given different weights. Optionally, the determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph may include:

acquiring a first weight corresponding to the first difference and a second weight corresponding to the second difference;

weighting the first difference of each sample attribute graph according to the first weight, and weighting the second difference of each sample attribute graph according to the second weight;

and summing the weighted first difference and the weighted second difference corresponding to each sample attribute graph, and obtaining a training loss value of the graph convolution neural network model based on a summation result.

Optionally, when different weights may be given to training losses of different layers, a first weight corresponding to the first difference and a second weight corresponding to the second difference may be obtained, then the first difference of each sample attribute diagram is weighted according to the first weight and the second difference of each sample attribute diagram is weighted according to the second weight, then the first difference and the second difference obtained after weighting are summed, and at this time, the sum obtained is a training loss value of the graph convolution neural network model.

In an alternative embodiment of the present application, for each sample attribute graph, inputting original information into a graph convolution neural network model to obtain a target feature representation of each node in the graph, and based on the target feature representation of each node in the graph, the method includes:

for the first graph structural feature of each hierarchy, obtaining a first feature matrix of the graph corresponding to the hierarchy based on the initial feature matrix of the graph and the first graph structural feature of the hierarchy, wherein the initial feature matrix comprises initial feature representation of each node of the graph, and the first feature matrix comprises first feature representation of each node of the graph;

The adjacency matrix of the graph is obtained according to the association relationship between the attribute information of each node in the graph and the attribute information of each node, at this time, the adjacency matrix of the graph can be used as a graph structure feature of a basis of the graph, that is, can be used as graph structure information of a low order (that is, a low hierarchy), and the initial feature representation of the node in the graph is an initial feature vector of the node, so that the target feature representation fusing the attribute information of the node and the association relationship between the nodes can be obtained through the graph convolution neural network model by using the adjacency matrix containing the graph structure information and the initial feature representation of each node in the graph.

As can be seen from the foregoing description, since the adjacency matrix of the graph utilizes the graph structure information at the lower level in the graph, that is, which nodes in the graph are directly connected to each other, and does not utilize the graph structure information at the higher level in the graph, in order to obtain the target feature of the node with better feature expression capability, the higher-level graph structure information of the graph can be obtained based on the adjacency matrix of the graph, and the feature representations (i.e., the first feature representations) of the nodes corresponding to the respective hierarchies can be obtained based on the initial feature representations of the nodes in the graph and the graph structure feature of each hierarchy, and then, for each node in the graph, the second feature representations of the nodes and the graph structure information of the multiple hierarchies can be simultaneously obtained by fusing the feature representations of the nodes of each hierarchy in at least two hierarchies, and further based on the second feature representations of the nodes in the sample attribute graph, target feature representations of the nodes in the graph can be extracted.

It can be seen that, in the embodiment of the present application, the second feature representation of each node is the first graph structure feature in which at least two hierarchies are fused, and the first graph structure features of at least two hierarchies characterize the graph structure features of different hierarchies, so that when the target feature representation of each node in the graph is extracted based on the second feature representation of each node in the sample attribute graph, information characterized by the obtained target feature representation of each node will be richer and have better feature expression capability, and when further processing is performed based on the target feature representation, the processing effect can be further improved.

In an alternative embodiment of the present application, obtaining a first graph structural feature of a graph corresponding to at least two levels based on a graph adjacency matrix includes:

obtaining a degree matrix of the graph;

For a sample attribute graph, the degree matrix of the graph refers to a matrix for describing the degree of each node in the graph, the degree of the node represents the number of edges connected with the node, and the degree matrix is a diagonal matrix.

Optionally, for a sample attribute graph, because the opposite angles of the adjacency matrix of the graph are all 0, the inner product of the sum and the feature matrix is equivalent to weighted sum of the adjacency matrix, at this time, the value of the node feature becomes the weight of the adjacency matrix, and the feature of the node feature is ignored. To avoid this, the processed adjacency matrix may be obtained by adding the unit matrix and the adjacency matrix and the unit matrix of the figure, where the diagonal element of the processed adjacency matrix becomes 1. However, since the adjacency matrix a is not subjected to normalization processing, it is difficult to limit the data to a certain range, and in this case, the data can be made comparable by the normalization processing, but the relationship between the data is relatively maintained. Therefore, in order to avoid the original distribution of the feature from being changed by multiplying the adjacency matrix by the feature matrix inner product, it is necessary to normalize the processed adjacency matrix based on the degree matrix pair of the graph to obtain a normalized adjacency matrix, and the normalized adjacency matrix obtained at this time can be regarded as the first graph structural feature of one level of the graph. Furthermore, in order to obtain more levels of first graph structural features, at least one time of self-multiplication operations with different powers can be performed on the normalized adjacent matrix to obtain at least one operation result, and each operation result obtained at this time is regarded as the graph corresponding to one level of first graph structural features. For example, for the normalized adjacent matrix, 3 times at least one time of the self-multiplication operations with different powers may be performed on the pair of normalized adjacent matrices, in this case, the self-multiplication operation with the power of one time performed on the pair of normalized adjacent matrices may be regarded as the first graph structural feature of the first hierarchy, the self-multiplication operation with the power of two performed on the pair of normalized adjacent matrices may be regarded as the first graph structural feature of the second hierarchy, and the self-multiplication operation with the power of three performed on the pair of normalized adjacent matrices may be regarded as the first graph structural feature of the third hierarchy.

In an alternative embodiment of the present application, for each sample attribute graph, determining a target graph structural feature of the graph based on the target feature representation of each node in the graph includes:

For a sample attribute graph, the higher the similarity of any two nodes in the graph, the more similar the graph structures corresponding to the two nodes in the graph are, so that the target graph structure features of the graph can be determined based on the third similarity between the nodes in the attribute graph. Optionally, the similarity matrix corresponding to the sample attribute graph may be used as the target graph structural feature of the graph, or the element values (i.e., the third similarities) in the similarity matrix are normalized, and the similarity matrix after the normalization is used as the target graph structural feature of the graph, where the similarity matrix includes the third similarities between the nodes in the sample attribute graph.

The embodiment of the present application is not limited to the specific manner of determining the third similarity between the nodes based on the target feature representation of each node, for example, the cosine similarity between the target feature representations of two nodes may be used to measure the third similarity between the two nodes. Optionally, for any sample attribute graph, assuming that a target feature matrix of the graph is represented by Z, and the matrix includes a target feature representation of each node in the graph (e.g., a target feature representation of one node corresponding to each row), the structural feature of the target graph of the sample attribute graph may be represented as follows:

wherein S represents the structural feature of the target graph, Z represents a target feature matrix containing the target feature representation of each node in the sample attribute graph, and Z^TPerforming dimension transformation on the target feature matrix,

which means that the 2 norm of Z is squared.

In an optional embodiment of the present application, for each sample property graph, determining an initial graph structure feature of the graph based on at least one item of the original information includes any one of:

Optionally, the initial graph structure characteristic of a sample attribute map may be determined based on different manners, and an optional manner is to determine the initial graph structure characteristic of the sample attribute map directly based on the adjacency matrix of the map, for example, the adjacency matrix of the sample attribute map may be used as the initial graph structure characteristic of the sample attribute map, or the adjacency matrix of the sample attribute map may be obtained by further processing, for example, the initial graph structure characteristic of the sample attribute map may be obtained by normalizing the adjacency matrix of the sample attribute map and using the normalized adjacency matrix as the initial graph structure characteristic of the sample attribute map, or by self-multiplying the normalized adjacency matrix, where the normalized adjacency matrix is subjected to self-multiplication processing, which is not limited in the embodiments of the present application, the method of normalizing the adjacency matrix of the sample attribute map may be to add the adjacency matrix and the identity matrix of the sample attribute map to obtain a processed adjacency matrix, and normalize the processed adjacency matrix based on the degree matrix of the sample attribute map to obtain a normalized adjacency matrix.

Another optional implementation manner is that a pre-trained feature extraction model may be obtained, then the original information of the sample attribute graph is input to the feature extraction model, feature extraction is performed on the original information of the sample attribute graph through the feature extraction model to obtain a first feature representation of each node in the sample attribute graph, then an obtained third feature representation based on each node is extracted, and the initial graph structure feature of the sample attribute graph is determined. The feature extraction model may refer to a model capable of extracting the high-order graph structure information again, where a specific model structure of the feature extraction model is not limited in this embodiment, and may be, for example, a deep walk model.

In an alternative embodiment of the present application, determining an initial graph structure feature of the graph based on the third feature representation of each node includes:

For a sample attribute graph, the higher the similarity of any two nodes in the graph, the more similar the corresponding graph structures of the two nodes in the graph, so that the fourth similarity between the nodes in the graph can be determined based on the third feature representation between the nodes, and the initial graph structure feature of the graph can be determined based on the fourth similarity between the nodes in the attribute graph. Optionally, the similarity matrix corresponding to the sample attribute graph may be used as the initial graph structure feature of the graph, or the element values (i.e., the fourth similarity) in the similarity matrix are normalized, and the similarity matrix after the normalization processing is used as the initial graph structure feature of the graph, where the similarity matrix includes the fourth similarity between nodes in the sample attribute graph.

In an alternative embodiment of the present application, for each sample attribute graph, determining a first similarity between nodes based on an initial feature representation of the nodes in the graph includes:

for each sample attribute graph, determining a second similarity between nodes in the graph based on the target feature representation of the nodes in the graph, including:

Optionally, when determining the similarity between the nodes, the euclidean distance between the nodes may be converted into a joint probability distribution, and the similarity between the nodes is represented by each probability value in the joint probability distribution. At this time, for each sample attribute graph, for each two nodes in the graph, a first joint distribution probability based on a standard gaussian distribution (i.e., P distribution) between the nodes may be determined based on the initial feature representation of the two nodes, where the first joint distribution probability characterizes a first similarity between the two nodes, and after obtaining the target feature representation of each node in the graph, for each two nodes in the graph, a second joint distribution probability based on t-distribution random neighbor embedding (i.e., Q distribution in the following text) between the two nodes may be determined based on the target feature representation of each node in the graph, where the second joint distribution probability may be regarded as a second similarity between the two nodes.

In an optional embodiment of the present application, after each preset number of training operations is performed, for each sample attribute graph, inputting original information into a graph convolution neural network model to obtain a target feature representation of each node in the graph, the method further includes:

determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph, wherein the training loss value comprises the following steps:

Optionally, for each sample attribute graph, after the target feature representation of each node in the graph is obtained, the nodes in the graph may be classified based on the target feature representation of each node, and each node is classified into at least two categories. When classifying the types of the nodes in the graph, a preset clustering algorithm can be selected according to requirements for clustering, for example, a K-Means algorithm can be adopted, and the embodiment of the application is not limited.

Further, for each class obtained by the division, the difference between the nodes belonging to the class may be determined, that is, the intra-class difference of the class is determined, and the intra-class difference may be determined based on the target feature representation of the nodes belonging to the class. Similarly, differences between different classes (i.e., inter-class differences) may also be determined for different classes, and in this case, when determining inter-class differences between a certain class and other classes, the differences may be determined based on the target feature representation of each node belonging to the class and the target feature representations of each node belonging to each of the other classes.

Accordingly, after the intra-class difference of each category corresponding to each sample attribute map and the inter-class difference between different categories are determined, the training loss value of the graph convolution neural network model may be determined based on the first difference, the second difference, the intra-class difference, and the inter-class difference corresponding to each sample attribute map.

In an alternative embodiment of the present application, for each of at least two classes, determining intra-class differences between nodes belonging to the class based on the target feature representation of the nodes belonging to the class, and determining inter-class differences between the different classes based on the target feature representation of the nodes belonging to the class and the target feature representation of the nodes belonging to each of the other classes, comprises:

determining a category characteristic representation of a category;

The class feature representation of a class is a class feature vector for representing a class, and characterizes the commonality between nodes belonging to the class, and the class feature representation of a class can be understood as a center feature vector of the class, i.e., a class center (i.e., a cluster center or a cluster centroid, each class can be referred to as a cluster).

Optionally, when determining the intra-class difference of a class, the class feature representation of the class may be determined, and the class feature of the class may be determined according to the target feature representation of the node belonging to the class, or may be determined in the process of performing class division on each node. A third difference between the target characteristic representation of each node belonging to the category and the category characteristic representation of the category may then be determined, and further based on the third difference between the target characteristic representation of each node and the category characteristic representation of the category, an intra-class difference between the nodes belonging to the category may be obtained. When the inter-class difference between one class and each of the other classes is determined, the difference (i.e., the fourth difference) between the target feature representation of each node belonging to the class and the class feature identifier of each of the other classes may be determined, and then the inter-class difference between the class and each of the other classes is obtained based on the fourth difference corresponding to each node belonging to the class.

In an alternative embodiment of the present application, the method further comprises:

In practical applications, different application scenarios and application requirements may also have different requirements on model performance of the graph convolution neural network model, for example, for the three layers of the graph structure information, similarity of attribute information between nodes in the graph, and category results, some scenarios may focus more on the graph structure information, and some scenarios may focus more on differences within each category result and differences between each category result. To better meet these application requirements, different levels of training loss may be given different weights.

Optionally, the intra-class difference and the inter-class difference may be divided into class differences, at this time, respective weight values of the first difference, the second difference, and the class differences may be obtained, then the first difference, the second difference, and the class differences are weighted based on the respective weights corresponding to the first difference, the second difference, and the class differences, then the weighted first difference, the weighted second difference, and the weighted class differences corresponding to each sample attribute map are summed, and an obtained sum is used as a training loss value.

In order to better understand the process of training the graph convolution neural network model in the embodiment of the present application, the following describes the training method in detail. In the prior art, the input of the graph convolution neural network is an adjacency matrix of an attribute graph and an initial feature representation of each node in the attribute graph, wherein a graph convolution neural network formula is as follows:

wherein, X^(l)And X^(l+1)Respectively, an object feature matrix including object feature representations of respective nodes obtained based on the l-th layer and the l + 1-th layer in the graph convolution neural network, and when l is 1, X is^(l)For an initial feature matrix comprising an initial feature representation of each node in the attribute map, I_NIs a matrix of the units,

is a matrix of the degrees, and the degree matrix,

to represent

The value of the element in the ith row and ith column, j represents

The number of columns in (1) is,

presentation pair

The elements in the ith row are summed by column, A is the adjacency matrix of the attribute map, W^(l)σ represents the nonlinear activation function for the weight matrix (i.e., model parameters) of the l-th layer. At this time, as can be seen from the formula, if the graph convolution neural network includes K layers, the target feature representation of each node obtained by the graph convolution neural network can capture graph information within the K layers.

However, since each node needs many other node feature representations in the training process of each layer of the graph convolution neural network, there is a problem that the expansibility is weak, and the efficiency is low when facing an attribute graph with a larger scale. Therefore, a decoupled atlas neural network is provided, and based on the atlas neural network formula, the un-decoupled atlas neural network can be seenEach layer can be divided into two processes, the first being a left multiplication

I.e. to realize the feature representation (P) of the aggregation neighbor nodes, and the second step is to right-multiply the weight matrix W^(l)That is, a dimension Transformation (abbreviated as T) is implemented, and the decoupled graph convolution neural network can decompose the two steps, and all the left-multiplication operations are completed first

Then, the operation is followed by a general logistic regression (i.e. dimension transformation), and the difference between the decoupled and un-decoupled convolutional neural networks can be seen from fig. 3, as shown in fig. 3, it can be known from fig. 3 that P and T in the un-decoupled convolutional neural network are sequentially adjacent, i.e. each time the left multiplication is performed first

Right multiplying by weight matrix W^(l)And all P in the decoupled graph convolution neural network are adjacent to all T in sequence, namely all left multiplication is realized

Then followed by a normal logistic regression. Correspondingly, the training cost of the decoupled graph convolution neural network is much lower than that of an un-decoupled graph convolution neural network, the graph convolution neural network has stronger expansibility, and good efficiency can be achieved when an attribute graph with larger scale is faced. If we assume that the un-decoupled convolutional neural network has k layers, then the decoupled convolutional neural network formula is as follows:

wherein Z represents an object feature matrix containing object feature representations of nodes in the attribute graph,

presentation pair

To the k power of an operation, the

May be based on the foregoing in relation to

The formula is calculated, and is not described herein again, X represents an initial feature matrix including initial feature representation of each node in the attribute graph, W represents a weight matrix (i.e., a model parameter), and σ represents a nonlinear activation function.

Optionally, the convolution neural network model in this example may be a decoupled convolution neural network, or may be in an un-decoupled convolution neural network, and the following takes the convolution neural network model as a decoupled convolution neural network as an example, and a process of training the decoupled convolution neural network is described in detail.

As shown in fig. 4, the embodiment of the present application provides an architecture diagram for training the decoupled atlas neural network. Optionally, when the decoupled graph convolution neural network is trained, the decoupled graph convolution neural network may be trained by using a Spark training framework through a PS based on angels to obtain the trained graph convolution neural network, or the decoupled graph convolution neural network may be trained by using a Spark on angels (SONA) distributed computing platform with high performance to obtain the trained graph convolution neural network, and then the target feature representation of each node in the attribute graph is obtained based on the trained graph convolution neural network.

In this example, the obtained original information of the sample attribute graph includes an adjacency matrix a of the sample attribute graph and an initial feature matrix X including an initial feature representation of each node in the sample attribute graph, where the initial feature representation of each node in the sample attribute graph and the adjacency matrix of the sample attribute graph may be obtained by performing information extraction on the sample attribute graph based on the information extractor. Further, each sample may be analyzedAnd inputting the original information of the attribute graph into a decoupled graph convolution neural network to obtain the target characteristic representation of each node in each sample attribute graph. Further, a target feature matrix Z corresponding to the sample attribute graph can be obtained based on the target feature representation of each node in the sample attribute graph, and then based on the target feature matrix Z corresponding to each sample attribute graph and the initial feature matrix X corresponding to each sample attribute graph, the adjacency matrix a corresponding to each sample attribute graph respectively determines the loss of the node feature representation part (through L)_sPresentation), loss of part of the auto-supervised training (by L)_cPresentation) and graph structure partial loss (L)_pRepresentation) to achieve unwrapping training of the decoupled graph-rolled neural network until convergence of the corresponding training loss value. When partial loss of the node feature representation is performed, the joint distribution probability of P distribution and the joint distribution probability of Q distribution can be calculated based on a target feature matrix Z corresponding to each sample attribute graph and an initial feature matrix X corresponding to each sample attribute graph, and then the decoupled graph convolution neural network is subjected to unfolding training based on the obtained result; when the self-supervision training part is lost, clustering is carried out on the basis of the target characteristic matrix Z corresponding to each sample attribute graph to obtain a clustering result, and then the decoupled graph convolution neural network is unfolded and trained on the basis of the obtained clustering result and the target characteristic matrix Z corresponding to each sample attribute graph; when partial loss of the graph structure is carried out, the similarity between nodes represented by target features can be determined based on the target feature matrix Z corresponding to each sample attribute graph, the similarity between nodes represented by initial features can be determined based on the adjacent matrix A of each sample attribute graph, and then the decoupled graph convolution neural network is unfolded and trained based on the obtained similarity difference.

Optionally, for each sample attribute map, a degree matrix of the map may be obtained

Then, the adjacency matrix A and the unit matrix I of the graph are combined_NAdding to obtain processed adjacency matrix

And standardizing the processed adjacent matrix based on the degree matrix of the sample attribute graph to obtain a standardized adjacent matrix

Further to normalized adjacency matrix

Performing self-multiplication operation for k times to obtain the first graph structure characteristics of the graph corresponding to k levels respectively

For each node, an initial feature matrix X and a first graph structure feature of each level can be set

Respectively fusing to obtain the first feature representation of the node

The first feature representations of the node at each level can then be stitched to obtain a second feature representation of the node

The formula of the decoupled graph convolution neural network is incorporated into the formula of the graph convolution neural network in the foregoing to obtain the target characteristic representation Z of the node, and the formula of the decoupled graph convolution neural network can be represented as

In particular, the following represents partial loss (by L) for a node by determining node characteristics_sPresentation), loss of part of the auto-supervised training (by L)_cPresentation) and graph structure partial loss (L)_pRepresentation) to implement a process for developing the decoupled atlas neural network unwrapping trainingAnd (5) clearing.

(1) Determining graph structure part loss:

specifically, for each sample property graph, the initial graph structure feature may be determined based on the adjacency matrix a of the graph (specifically, the initial graph structure feature may be determined by using

Characterization) of

The calculation method can be referred to the above description, and is not repeated herein. Further, a target graph structure feature of the graph may be determined based on the target feature representation Z of each node in the graph. When determining the structural features of the target graph, the structural features of the target graph can be specifically determined by the following formula:

wherein S represents the structural feature of the target graph, Z represents a target feature matrix containing the target feature representation of each node in the sample attribute graph, and Z^TRepresenting a dimensional transformation of the target feature matrix,

means squaring the 2 norm of Z

Further, a first difference between the initial graph structural feature and the target graph structural feature of the graph may be determined as a graph structural part loss, and specifically may be determined based on the following formula:

wherein L is_pValues representing loss of structural parts of the graph, N representing the number of nodes, S representing structural characteristics of the target graph,

Shows an initial diagramThe characteristics of the structure are that,

represent the pair

Each element in the obtained matrix is subjected to square operation and then subjected to summation processing, and then the obtained sum is subjected to evolution processing.

Optionally, when determining the initial graph structural feature of the graph, the adjacency matrix a of the graph and the initial feature representation of each node in the graph may be input to another model that can be pre-trained, so as to obtain a matrix Z including the third feature representation of each node in the graph_DWThen Z is_DWSubstitution formula

In the method, an initial graph structural feature S is obtained_DWFurther, an initial graph structure characteristic S of the graph may be determined_DWAnd the first difference between the target graph structure characteristic S is used as the graph structure part loss, and can be specifically determined based on the following formula:

wherein L is_pValues representing losses of structural parts of the graph, N representing the number of nodes, S representing structural characteristics of the target graph, S_DWShowing the structural features of the initial graph,

represents the pair S-S_DWEach element in the obtained matrix is subjected to square operation and then subjected to summation processing, and then the obtained sum is subjected to evolution processing.

In the embodiment of the present application,

to the k power of

Contains the graph structure information within the k layer of the decoupled graph convolution neural network

Row i of (1), when

The larger the value of (a) is, the more similar the graph structures of the node i and the node j in the k layer are. In this example, the model is trained by determining the difference between the initial graph structural feature and the target graph structural feature by using the power of the adjacency matrix or other pre-trained models, so that the target feature representation of each node acquired by the decoupled graph convolution neural network retains the graph structural information in the initial feature representation.

(2) Determining that the node signature represents a partial loss:

in practical applications, since the initial feature representation of a node has a relatively large difference (typically one to two orders of magnitude) from the final target feature representation in dimension, and the capacity of the target feature representation is much larger than that of the initial feature representation, for a node using the target feature representation, all other nodes can be abstractly classified into 3 classes according to the distance from the node: the near nodes, the medium-distance nodes and the far nodes are represented by adopting the target characteristics and the initial characteristics, and at the moment, because of the huge difference of the capacities when the target characteristics are adopted for representation and when the initial characteristics are adopted for representation, a part of the medium-distance nodes originally adopting the initial characteristics are possibly changed into the far nodes when the target characteristics are adopted for representation, so that the clustering structure when the initial characteristics are adopted for representation is damaged, and the clustering difficulty is increased. Based on this, in the embodiment of the application, the method of a classical visualization method t-SNE can be simulated, and euclidean distances between nodes in a sample attribute graph are converted into joint probability distribution, so as to represent the similarity between the nodes. Specifically, in the initial feature representation, the similarity (i.e., the first similarity) between the nodes may be represented based on a gaussian distribution joint distribution probability, and in the target feature representation, the similarity (i.e., the second similarity) between the nodes may be represented based on a t distribution (i.e., a Q distribution in fig. 4) joint distribution probability of one of the heavy tail distributions ", which may be specifically shown in fig. 5. As can be seen from fig. 5, when the degree of freedom of the t distribution is 1 (that is, df is 1) as compared with the degree of freedom of 10 (that is, df is 10), the distribution is similar to the standard gaussian distribution, and the probability distribution function decreases more slowly in the second half, and in this case, the Q distribution can be as close as possible to the P distribution (that is, gaussian distribution), and the similarity information when the initial features of each node are retained in the target feature representation of each node is ensured.

Wherein, the gaussian distribution joint distribution probability can be obtained by the following formula:

wherein p is_ijRepresenting the similarity between the initial feature representation of node i and the initial feature representation of node j, X_iRepresenting an initial feature representation, X, of node i_jRepresenting an initial characteristic representation, X, of node j_kRepresenting an initial characteristic representation, X, of node k_lRepresents the initial characteristic representation of node i, and σ represents the nonlinear activation function.

The Q-distribution joint distribution probability can be obtained by the following formula:

wherein q is_ijSimilarity between the target feature representation representing node i and the target feature representation representing node j, Z_iRepresenting the target feature representation, Z, of node i_jRepresenting the target feature representation of node j, Z_kRepresenting the target feature representation of node k, Z_lRepresenting the target feature representation of node i.

Further, the difference (i.e., the second difference) between the gaussian distribution joint distribution probability and the Q distribution joint distribution probability between the nodes in the sample attribute graph may be determined and used as the node feature to represent the partial loss, and the decoupled graph convolution neural network may be trained.

In this example, the KL divergence may be used as a node characteristic to represent the partial loss, which may be specifically expressed by the following formula:

L_s＝KL(P||Q)。

wherein L is_sThe expression node feature represents a value of partial loss, P represents a gaussian joint distribution probability of each node, and Q represents a Q distribution joint distribution probability of each node.

(3) Determining a loss of a portion of the unsupervised training:

specifically, after the target feature representation of each node in the sample attribute graph is obtained, the K-Means clustering algorithm may be run once after a fixed iteration interval (5 iterations or 10 iterations) based on the obtained target feature representation of each node in the sample attribute graph, and each node in the sample attribute graph is divided into at least two categories. Further, the classification result obtained by the classification and the target feature representation of the node belonging to each classification can be used for obtaining the classification difference corresponding to the classification result and losing the classification difference as the self-supervision training part. The category differences corresponding to the category results include intra-category differences between nodes in each category and inter-category differences between each category and each category in other categories, and specifically, a value of a loss of the self-supervision training part may be determined based on the following formula:

wherein the content of the first and second substances,

indicating inter-class differences between each class and ones of the other classes,

representing intra-class differences between nodes within each class, gamma representing a hyper-parameter, N representing the number of nodes, C_jRepresenting the node in class j as the centroid,

representing the target feature representation of a node in class j as centroid, Z_iRepresenting the target feature representation, Y, of node i_iRepresents a node that is a centroid in the class to which node i belongs,

and k represents the number of layers included in the decoupled graph convolution neural network.

In the embodiment of the present application, when the model is trained based on the partial loss of the self-supervised training, for each node, the distance from the node to the centroid of the class can be minimized based on the obtained classification result, and the distance from the node to all other classes can be maximized, that is, two forces are generated, one force makes each point closer to the class to which the node belongs, and the other force makes each point farther from other classes, and the two forces make the boundary between the classes clear. In addition, the decoupled atlas neural network can be optimized to a certain extent aiming at the clustering task through an automatic supervision training method, and the potential of the decoupled atlas neural network aiming at the clustering task is further developed.

Further, after the node characteristic representation part loss, the self-supervision training part loss and the graph structure part loss are obtained, a value of a loss function of the decoupled graph convolution neural network can be obtained based on the node characteristic representation part loss, the self-supervision training part loss and the graph structure part loss, and then the decoupled graph convolution neural network is trained based on the value of the loss function until a training end condition is met, so that the trained graph convolution neural network is obtained. Wherein the values of the loss function of the decoupled graph convolution neural network are as follows:

L＝L_s+αL_p+βL_c

wherein L represents the value of the loss function of the decoupled convolutional neural network, and α and β representWeight values, which can be adjusted according to different data sets, L_sValue, L, representing a node characteristic representing a partial loss_pValue L representing loss of structural part of the graph_cRepresenting the value lost by the self-supervised training portion.

In the application, a loss function is redesigned in the processing process of the graph structure of each node in the attribute graph and the initial characteristic representation of each node in the attribute graph to train the graph convolution neural network model, at the moment, similarity information between the nodes in the initial characteristic representation of each node is reserved in the target characteristic representation of each node obtained based on the trained graph neural network model, and due to the fact that a self-supervision training method is introduced, side effects of side bands among clusters are reduced, the potential of the graph convolution neural network model on a clustering task is further developed, and the graph convolution neural network model can be trained based on an end-to-end training mode.

Fig. 6 is a flowchart illustrating an attribute map processing method provided in an embodiment of the present application. As shown in fig. 6, the method includes:

step S601, acquiring the attribute graph to be processed and the attribute information of each node in the attribute graph to be processed.

Step S602, determining an adjacency matrix of the attribute map to be processed.

Step S603, for each node, determining an initial feature representation of the node based on the attribute information of the node.

The attribute graph to be processed may include attribute information of each node, and at this time, for each included node, the initial feature representation of the node may be determined according to the attribute information of the node, that is, the attribute information of one node may be characterized based on the initial feature representation. Further, an adjacency matrix of the to-be-processed attribute map may be determined, and the adjacency matrix may be determined by a map structure of the to-be-processed attribute map. For example, assuming that the adjacency matrix is an n × n matrix, if there is an edge between the node i and the node j, 1 is filled in the ith row and jth column and 0 is filled in the rest of the rows and jth columns, and each eigenvalue in the n × n matrix is determined based on the same principle, thereby obtaining the adjacency matrix.

Step S604, outputting the initial characteristic representation of the adjacent matrix and each node to a graph convolution neural network model to obtain the target characteristic representation of each node, wherein the graph convolution neural network model is obtained by training the initial graph convolution neural network model.

Specifically, after obtaining the adjacency matrix of the attribute graph to be processed and the initial feature representation of each node included in the adjacency matrix, the adjacency matrix and the initial feature representation of each node may be output to the graph convolution neural network model to obtain the target feature representation of each node, at this time, for a node, the dimension of the target feature representation of the node is smaller than the dimension of the initial feature representation of the node, and the target feature representation of each node in the attribute graph to be processed retains the similarity information in the initial feature of each node. The training method of the graph convolution neural network model is described in detail in the foregoing, and for details, reference may be made to the foregoing description, and details are not repeated again.

Optionally, after the target feature representation of each node is obtained, classification processing may be performed on each node in the attribute graph to be processed based on the target feature representation of each node, so as to obtain a classification result of each node. For example, when images are classified, all the images to be classified may be regarded as one attribute graph to be processed, and each image to be classified may be regarded as one node in the attribute graph to be processed, at this time, attribute information of each image to be classified may be acquired, an initial feature representation of each image to be classified is determined based on the attribute information of the image to be classified, an adjacency matrix of the attribute graph to be processed is determined, then the adjacency matrix and the initial feature representation of each image to be classified are input into a graph convolution neural network model, a target feature representation of each image to be classified is obtained, and finally, classification is performed by using a clustering algorithm based on the target feature representation of each image to be classified, and each classification result is obtained.

It can be understood that the application scenarios to which the attribute map processing method provided in the embodiment of the present application can be applied include, but are not limited to, image classification, and any efficient processing scenario pertaining to large-scale attribute problems can be applied, such as node classification, link prediction, business recommendation, and knowledge graph.

When the method is applied to link prediction, each node in the attribute graph is each node in a link which needs to be connected with edges, at this time, a target feature representation of each node which needs to be connected with edges can be obtained based on the attribute graph processing method provided in the embodiment of the application, then classification processing is performed based on the target feature representation of each node which needs to be connected with edges, each classification result is obtained, and further, each node belonging to the same classification result can be connected with edges, and each link is obtained; when the method is applied to business recommendation, each node in the attribute graph is business information to be recommended, at this time, a target feature representation of each business information to be recommended can be obtained based on the attribute graph processing method provided in the embodiment of the application, then classification processing is performed based on the target feature representation of each business information to be recommended, each classification result is obtained, and further, the business information to be recommended belonging to the same classification result can be recommended to a user as information of the same type; when the method is applied to the knowledge graph, each node in the attribute graph is each knowledge information, at this time, the target feature representation of each knowledge information can be obtained based on the attribute graph processing method provided in the embodiment of the application, then, classification processing is performed based on the target feature representation of each knowledge information to obtain each classification result, at this time, an association relation can be established for the knowledge information belonging to the same classification result, and thus, the knowledge graph formed by each knowledge information can be obtained.

An embodiment of the present application provides a processing apparatus for an attribute map, and as shown in fig. 7, the processing apparatus 70 for an attribute map may include: a data acquisition module 701, an initial information determination module 702, and a model training module 703, wherein,

a data obtaining module 701, configured to obtain a training data set, where each sample in the training data set includes original information of a sample attribute graph, the original information includes an adjacency matrix of the sample attribute graph and an initial feature representation of each node in the sample attribute graph, and the initial feature representation of one node represents attribute information of the node;

an initial information determining module 702, configured to determine, for each sample attribute graph, an initial graph structure feature of the graph based on at least one item of the original information, and determine a first similarity between nodes in the graph based on an initial feature representation of the nodes in the graph;

a model training module 703, configured to repeatedly perform the following training operations on the initial graph convolution neural network model based on the training data set until the training loss value satisfies the training end condition, to obtain a trained graph convolution neural network model:

Optionally, when obtaining the first graph structural feature of the graph corresponding to at least two levels based on the adjacency matrix of the graph, the model training module is specifically configured to:

obtaining a degree matrix of the graph;

determining a category characteristic representation of a category;

Optionally, the model training module is further configured to:

An embodiment of the present application provides a processing apparatus of an attribute map, and as shown in fig. 8, the processing apparatus 80 of an attribute map may include: a pending data acquisition module 801, an adjacency matrix determination module 802, an initial feature representation determination module 803, and a target feature representation determination module 804, wherein,

a to-be-processed data obtaining module 801, configured to obtain a to-be-processed attribute map and attribute information of each node in the to-be-processed attribute map;

an adjacency matrix determination module 802, configured to determine an adjacency matrix of the attribute map to be processed;

an initial feature representation determining module 803, configured to determine, for each node, an initial feature representation of the node based on the attribute information of the node;

and an object feature representation determining module 804, configured to output the adjacency matrix and the initial feature representation of each node to a graph convolution neural network model to obtain an object feature representation of each node, where the graph convolution neural network model is obtained by training based on the above-described manner of training the graph convolution neural network model.

The apparatus also includes a classification module to:

The processing apparatus for the attribute map in the embodiment of the present application can execute the processing method for the attribute map provided in the embodiment of the present application, and the implementation principles thereof are similar and will not be described herein again.

The processing means of the property graph may be a computer program (comprising program code) running in a computer device, e.g. the processing means of the property graph is an application software; the apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application.

In some embodiments, the processing Device of the property graph provided in the embodiments of the present invention may be implemented by a combination of hardware and software, and by way of example, the processing Device of the property graph provided in the embodiments of the present invention may be a processor in the form of a hardware decoding processor, which is programmed to execute the processing method of the property graph provided in the embodiments of the present invention, for example, the processor in the form of the hardware decoding processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.

An embodiment of the present application provides an electronic device, as shown in fig. 9, an electronic device 2000 shown in fig. 9 includes: a processor 2001 and a memory 2003. Wherein the processor 2001 is coupled to a memory 2003, such as via a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that the transceiver 2004 is not limited to one in practical applications, and the structure of the electronic device 2000 is not limited to the embodiment of the present application.

The processor 2001 is applied in the embodiment of the present application to implement the functions of the modules shown in fig. 7 and 8.

The processor 2001 may be a CPU, general purpose processor, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 2001 may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs and microprocessors, and the like.

Bus 2002 may include a path that conveys information between the aforementioned components. The bus 2002 may be a PCI bus or an EISA bus, etc. The bus 2002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 9, but this does not indicate only one bus or one type of bus.

The memory 2003 may be, but is not limited to, ROM or other types of static storage devices that can store static information and computer programs, RAM or other types of dynamic storage devices that can store information and computer programs, EEPROM, CD-ROM or other optical disk storage, optical disk storage (including compact disk, laser disk, optical disk, digital versatile disk, blu-ray disk, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store a desired computer program or in the form of a data structure and that can be accessed by a computer.

The memory 2003 is used for storing computer programs for executing the application programs of the present scheme and is controlled in execution by the processor 2001. The processor 2001 is used to execute a computer program of an application program stored in the memory 2003 to realize the actions of the processing apparatus of the property diagrams provided by the embodiments shown in fig. 7 and 8.

An embodiment of the present application provides an electronic device, including a processor and a memory: the memory is configured to store a computer program which, when executed by the processor, causes the processor to perform any of the methods of the above embodiments.

The present application provides a computer-readable storage medium for storing a computer program, which, when run on a computer, enables the computer to execute any one of the above-mentioned methods.

Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above.

The terms and implementation principles related to a computer-readable storage medium in the present application may specifically refer to a method for processing an attribute map in the embodiment of the present application, and are not described herein again.

It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not performed in the exact order shown and may be performed in other orders unless explicitly stated herein. Moreover, at least a portion of the steps in the flow chart of the figure may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time, but may be performed at different times, which are not necessarily performed in sequence, but may be performed alternately or alternately with other steps or at least a portion of the sub-steps or stages of other steps.

The foregoing is only a partial embodiment of the present application, and it should be noted that, for those skilled in the art, several modifications and decorations can be made without departing from the principle of the present application, and these modifications and decorations should also be regarded as the protection scope of the present application.

Claims

1. A method for processing an attribute map is characterized by comprising the following steps:

acquiring a training data set, wherein each sample in the training data set comprises original information of a sample attribute graph, the original information comprises an adjacency matrix of the sample attribute graph and initial feature representations of nodes in the sample attribute graph, and the initial feature representation of one node represents the attribute information of the node;

for each sample attribute graph, inputting the original information into a graph convolution neural network model to obtain target feature representation of each node in the graph, and determining target graph structure features of the graph and second similarity between the nodes in the graph based on the target feature representation of each node in the graph;

for each of the sample attribute graphs, determining a first difference between an initial graph structural feature and a target graph structural feature of the graph and a second difference between a first similarity and a second similarity between nodes in the graph;

and determining a training loss value of the graph convolution neural network model based on the first difference and the second difference corresponding to each sample attribute graph, and if the training loss value does not meet the training end condition, adjusting model parameters of the graph convolution neural network.

2. The method of claim 1, wherein for each of the sample property graphs, the inputting the raw information into a graph convolution neural network model to obtain a target feature representation of each node in the graph, and based on the target feature representation of each node in the graph, comprises:

based on the original information, the following operations are executed through the initial graph neural network model to obtain the target feature representation of each node in the graph:

for each node in the graph, obtaining a second feature representation of the node by fusing the first feature representations of the nodes of the respective levels of the at least two levels;

3. The method of claim 2, wherein obtaining the graph corresponding to at least two levels of first graph structural features based on the adjacency matrix of the graph comprises:

acquiring a degree matrix of the graph;

adding the adjacency matrix and the identity matrix based on the graph to obtain a processed adjacency matrix;

and taking the normalized adjacency matrix as a first graph structural feature of one hierarchy of the graph, and performing at least one self-multiplication operation on the normalized adjacency matrix to obtain the first graph structural feature of the graph corresponding to at least one hierarchy, wherein the self-multiplication power of each multiplication process is different.

4. The method of claim 1, wherein for each of the sample property graphs, determining a target graph structural feature of the graph based on a target feature representation of nodes in the graph comprises:

determining a third similarity between nodes in the graph based on the target feature representation of the nodes in the graph;

5. The method according to claim 1, wherein for each of the sample property graphs, the determining initial graph structure features of the graph based on at least one of the original information comprises any one of:

determining an initial graph structure feature of the graph based on the adjacency matrix of the graph;

6. The method of claim 5, wherein determining the initial graph structure feature of the graph based on the third feature representation of each node comprises:

determining an initial graph structure feature of the graph based on a fourth similarity between nodes in the graph.

7. The method of claim 1, wherein for each of the sample property graphs, determining a first similarity between nodes based on an initial feature representation of the nodes in the graph comprises:

for each two nodes in the graph, determining a first joint distribution probability based on a standard Gaussian distribution between the nodes based on the initial feature representations of the two nodes in the graph, the first joint distribution probability characterizing a first similarity between the two nodes;

for each of the sample attribute graphs, the determining a second similarity between nodes in the graph based on the target feature representation of the nodes in the graph includes:

for each two nodes in the graph, determining a second joint distribution probability based on t-distribution random neighbor embedding between the two nodes based on the target feature representation of the two nodes in the graph, wherein the second joint distribution probability characterizes a second similarity between the two nodes.

8. The method of claim 1, wherein after each predetermined number of training operations is performed, for each sample property graph, after the inputting the raw information into a graph convolution neural network model to obtain a target feature representation of each node in the graph, the method further comprises:

for each of said at least two classes, determining intra-class differences between nodes belonging to said class based on the target feature representation of nodes belonging to said class, and determining inter-class differences between different classes based on the target feature representation of nodes belonging to said class and the target feature representation of nodes belonging to each of the other classes, said other classes being classes of said at least two classes other than said class;

determining a training loss value of a graph convolution neural network model based on a first difference and a second difference corresponding to each sample attribute graph, including:

9. The method according to claim 8, wherein for each of the at least two classes, determining intra-class differences between nodes belonging to the class based on the target feature representation of the nodes belonging to the class, and determining inter-class differences between different classes based on the target feature representation of the nodes belonging to the class and the target feature representation of the nodes belonging to each of the other classes comprises:

determining a class feature representation for the class;

and for each node belonging to the category, determining a fourth difference between the target feature representation of the node and the category feature identification of each category in other categories, and obtaining the inter-category difference between the category and each category in other categories based on the fourth difference corresponding to each node belonging to the category.

10. The method of claim 9, further comprising:

obtaining weights corresponding to the first difference, the second difference and the category difference respectively, wherein the category difference comprises the intra-class difference and the inter-class difference;

weighting the first difference, the second difference, and the category difference based on their respective weights;

and summing the weighted first difference, the weighted second difference and the weighted category difference corresponding to each sample attribute map, and taking the sum as the training loss value.

11. A method for processing an attribute map is characterized by comprising the following steps:

acquiring a to-be-processed attribute graph and attribute information of each node in the to-be-processed attribute graph;

determining an adjacency matrix of the attribute graph to be processed;

for each node, determining an initial feature representation of the node based on attribute information of the node;

outputting the adjacency matrix and the initial feature representation of each node to a graph convolution neural network model to obtain a target feature representation of each node, wherein the graph convolution neural network model is trained on the basis of the method of any one of claims 1 to 10.

12. An apparatus for processing an attribute map, comprising:

the data acquisition module is used for acquiring a training data set, each sample in the training data set comprises original information of a sample attribute graph, the original information comprises an adjacency matrix of the sample attribute graph and initial feature representations of nodes in the sample attribute graph, and the initial feature representation of one node represents the attribute information of the node;

an initial information determining module, configured to determine, for each sample attribute graph, an initial graph structure feature of the graph based on at least one of the original information, and determine a first similarity between nodes in the graph based on an initial feature representation of the nodes in the graph;

a model training module, configured to repeatedly perform the following training operations on the initial graph convolution neural network model based on the training data set until the training loss value satisfies a training end condition, to obtain a trained graph convolution neural network model:

13. An apparatus for processing an attribute map, comprising:

the system comprises a to-be-processed data acquisition module, a to-be-processed data acquisition module and a processing module, wherein the to-be-processed data acquisition module is used for acquiring a to-be-processed attribute graph and attribute information of each node in the to-be-processed attribute graph;

and the target feature representation determining module is used for outputting the adjacency matrix and the initial feature representation of each node to a graph convolution neural network model to obtain a target feature representation of each node, wherein the graph convolution neural network model is trained on the basis of the method of any one of claims 1 to 10.

14. An electronic device, comprising a processor and a memory:

the memory is configured to store a computer program which, when executed by the processor, causes the processor to perform the method of any of claims 1-10 or to perform the method of any of claim 11.

15. A computer-readable storage medium, for storing a computer program which, when run on a computer, causes the computer to perform the method of any one of claims 1-10 or to perform the method of any one of claim 11.