CN114676821A - Model determination method, device, equipment and computer readable storage medium

Info

Publication number
CN114676821A
Authority
CN
China
Prior art keywords
vertex
hypergraph
determining
state transition
transition matrix
Prior art date
Legal status
Pending
Application number
CN202210297979.8A
Other languages
Chinese (zh)
Inventor
李扶阳
张吉应
罗迪君
卞亚涛
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202210297979.8A
Publication of CN114676821A

Classifications

    • G06N3/047 Neural networks: probabilistic or stochastic networks
    • G06F18/2411 Pattern recognition, classification techniques: based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/24155 Pattern recognition, classification techniques: Bayesian classification
    • G06F18/24323 Pattern recognition, classification techniques: tree-organised classifiers
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/048 Neural networks: activation functions
    • G06N3/08 Neural networks: learning methods


Abstract

The application provides a model determination method, apparatus, device, and computer-readable storage medium. The method includes the following steps: acquiring a pre-constructed hypergraph, and determining the random-walk state transition matrix corresponding to the hypergraph; determining the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix, and determining the symmetric state transition matrix corresponding to the random-walk state transition matrix; determining the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel; and acquiring a preset activation function, and generating the hypergraph processing network model using the activation function and the hypergraph signal convolution operator. With the method and apparatus of the application, the degree to which the output information obtained by processing different input information with the hypergraph processing network model differs can be increased.

Description

Model determination method, device, equipment and computer readable storage medium
Technical Field
The present application relates to artificial intelligence technologies, and in particular to a model determination method, apparatus, device, and computer-readable storage medium.
Background
A graph is a topological structure used to describe discrete data in the real world; it has two elements, vertices and edges. In an ordinary graph, an edge connects two vertices and represents some association between them. A hypergraph generalizes the concept of an edge to a hyperedge: a single hyperedge may contain multiple vertices. Many real-world problems require graph or hypergraph structures to organize data, such as citation networks, social networks, and protein molecular structure networks. With the development of graph neural networks, the representation and information-aggregation capabilities for graph-structured data have advanced rapidly. The hypergraph is a topological structure with stronger representational power: an ordinary graph can be regarded as a special hypergraph in which every hyperedge connects exactly two vertices, so the hypergraph has more general representational capability than the graph.
Disclosure of Invention
The embodiments of the present application provide a model determination method, apparatus, device, and computer-readable storage medium, which can increase the degree to which the output information obtained by processing different input information with the hypergraph processing network model differs.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a model determination method, which comprises the following steps:
acquiring a hypergraph constructed in advance based on data to be processed, and determining the random-walk state transition matrix corresponding to the hypergraph;
determining the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix;
determining the symmetric state transition matrix corresponding to the random-walk state transition matrix;
determining the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel;
and acquiring a preset activation function, and generating the hypergraph processing network model using the activation function and the hypergraph signal convolution operator.
An embodiment of the present application provides a model determining apparatus, including:
a first determining module, configured to acquire a hypergraph constructed in advance based on data to be processed and determine the random-walk state transition matrix corresponding to the hypergraph;
a second determining module, configured to determine the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix;
a third determining module, configured to determine the symmetric state transition matrix corresponding to the random-walk state transition matrix;
a fourth determining module, configured to determine the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel;
and a model determining module, configured to acquire a preset activation function and generate the hypergraph processing network model using the activation function and the hypergraph signal convolution operator.
In some embodiments, the first determining module is further configured to:
acquire, based on the hypergraph, the prior weight, out-degree, and in-degree of each hyperedge, the first edge-dependent vertex weight used when each vertex enters a hyperedge, and the second edge-dependent vertex weight used when each vertex flows out of a hyperedge;
acquire a preset influence function, and determine the degree of each vertex according to the influence function and the out-degree and in-degree of each hyperedge, where the influence function characterizes the type of influence that the out-degree or in-degree of a hyperedge exerts on the degree of a vertex;
determine, based on the prior weight, out-degree, and in-degree of each hyperedge and on the degree, first edge-dependent vertex weight, and second edge-dependent vertex weight of each vertex, the probability that each vertex walks randomly to every other vertex;
and determine the random-walk state transition matrix based on the probability of each vertex walking randomly to every other vertex.
In some embodiments, the second determining module is further configured to:
acquire the discounted visit-rate vector corresponding to each vertex based on the random-walk state transition matrix;
determine the discounted Markov diffusion distance between vertex u and vertex v based on the discounted visit-rate vector corresponding to vertex u and the discounted visit-rate vector corresponding to vertex v, where u = 0, 1, 2, …, (N-1); v = 0, 1, 2, …, (N-1); and N is the total number of vertices in the hypergraph;
and determine the discounted Markov kernel corresponding to the hypergraph based on the discounted Markov diffusion distance.
In some embodiments, the second determining module is further configured to:
acquire a preset discount factor;
determine, based on the random-walk state transition matrix and the discount factor, the discounted average probability of visiting vertex v from vertex u within t steps of random walk, where t = 0, 1, 2, …, (N-1);
and determine the vector of discounted average visit probabilities from vertex u to every vertex within t steps of random walk as the discounted visit-rate vector corresponding to vertex u.
In some embodiments, the third determining module is further configured to:
determine the vertex degree matrix based on the hypergraph;
determine the weighted adjacency matrix corresponding to the hypergraph based on the random-walk state transition matrix and the vertex degree matrix;
and apply symmetric normalization to the weighted adjacency matrix to obtain the symmetric state transition matrix.
In some embodiments, the fourth determining module is further configured to:
acquire the convolution kernel formed by the filters between the input channels and the output channels;
determine the feature mapping between the input space and the output space based on the discounted Markov kernel;
and determine the hypergraph signal convolution operator between the output information and the input information based on the symmetric state transition matrix, the feature mapping, and the convolution kernel.
In some embodiments, the model determination module is further configured to:
acquire the residual-connection hyperparameter;
determine the convolution result of the l-th layer using the hypergraph signal convolution operator and the input information of the l-th layer, where l = 0, 1, …, L-1 and L is the preset number of network layers;
fuse the convolution result of the l-th layer with the input information of the l-th layer using the residual-connection hyperparameter to obtain the fusion result;
and apply the activation function to the fusion result to obtain the hypergraph processing network model.
In some embodiments, the apparatus further comprises:
a first acquisition module, configured to acquire training data from the hypergraph, where the training data includes vertex information of training vertices and label information of the training vertices;
a first processing module, configured to process the vertex information of each training vertex using the hypergraph processing network model to obtain the representation vector of each training vertex;
a first prediction module, configured to acquire a trained classification model and perform prediction on the representation vector of each training vertex using the classification model to obtain the prediction information corresponding to each training vertex;
and a model training module, configured to train the hypergraph processing network model using the prediction information of each training vertex and the label information of each training vertex to obtain the trained hypergraph processing network model.
In some embodiments, the apparatus further comprises:
a second acquisition module, configured to acquire the vertex information of the test vertices in the hypergraph;
a second processing module, configured to process the vertex information of each test vertex using the trained hypergraph processing network model to obtain the representation vector of the test vertex;
a second prediction module, configured to perform prediction on the representation vectors of the test vertices using the trained classification model to obtain the prediction information corresponding to the test vertices;
and a fifth determining module, configured to determine the prediction information corresponding to each test vertex as the classification information of that test vertex.
An embodiment of the present application provides a computer device, including:
a memory for storing executable instructions;
and a processor, for implementing the method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
The embodiment of the present application provides a computer-readable storage medium, which stores executable instructions for causing a processor to implement the method provided by the embodiment of the present application when the processor executes the executable instructions.
Embodiments of the present application provide a computer program product, which includes a computer program or instructions, and the computer program or instructions, when executed by a processor, implement the method provided by embodiments of the present application.
The embodiment of the application has the following beneficial effects:
After the hypergraph constructed in advance based on the data to be processed is acquired, the random-walk state transition matrix corresponding to the hypergraph is determined; the discounted Markov kernel corresponding to the hypergraph is then determined based on the random-walk state transition matrix, and the symmetric state transition matrix corresponding to the random-walk state transition matrix is determined; the hypergraph signal convolution operator is then determined based on the symmetric state transition matrix and the discounted Markov kernel; finally, a preset activation function is acquired, and the hypergraph processing network model is generated using the activation function and the hypergraph signal convolution operator. A discount factor is introduced when determining the discounted Markov kernel, so the hypergraph signal convolution operator determined on the basis of the symmetric state transition matrix and the discounted Markov kernel can weaken the influence of far-away orders. The over-smoothing phenomenon is thereby avoided, the difference between the representation vectors output by the generated hypergraph processing network model for different input signals is increased, and the accuracy of the results obtained by using those representation vectors in subsequent downstream tasks can be guaranteed.
Drawings
FIG. 1 is a schematic diagram of a network architecture of a hypergraph data processing system according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a server 400 according to an embodiment of the present application;
fig. 3 is a schematic flow chart of an implementation of the model determining method according to the embodiment of the present application;
fig. 4A is a schematic flowchart of an implementation of determining the discounted Markov kernel according to an embodiment of the present application;
fig. 4B is a schematic diagram of an implementation flow for determining a symmetric state transition matrix according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating an implementation flow of training and predicting a determined hypergraph processing network model according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a hyperedge provided in accordance with an embodiment of the present application;
fig. 7 is a schematic diagram of a generation process of a hypergraph spectrum convolution neural network according to an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first\second\third" merely distinguish similar objects and do not denote a particular ordering; where permissible, the specific order or sequence may be interchanged, so that the embodiments of the application described herein can be practiced in orders other than those shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments of the present application are first explained; the following explanations apply to these terms and expressions throughout.
1) Graph: a data structure composed of vertices and edges, in which an edge can connect only two vertices;
2) Hypergraph: a data structure composed of vertices and hyperedges, in which a hyperedge can connect one or more vertices;
3) Edge-Dependent Vertex Weights (EDVW): in a hypergraph, the weight of the same vertex depends on the hyperedge; that is, the same vertex carries different weights in different hyperedges;
4) Markov process: a stochastic process whose prototype is the Markov chain; it is an important tool for studying the state space of discrete-event dynamic systems, and its mathematical foundation is the theory of stochastic processes.
In order to better understand the model determination method provided by the embodiments of the present application, hypergraph processing models in the related art and their existing shortcomings are described first.
The information aggregation function of the hypergraph convolution network in HGNN can be understood, in mathematical terms, as the simplest hypergraph random-walk model: its transition probabilities depend only on the number of adjacent hyperedges and vertices. It therefore loses, to a great extent, the fine-grained information that the hypergraph topological structure can represent, and the processing performance of HGNN is poor.
On the other hand, to aggregate information from "long-range" nodes in a hypergraph, the number of layers of the convolutional neural network must be increased. In existing hypergraph convolutional neural networks [1,8], however, as the number of convolution layers increases, the embedded features of the nodes tend to homogenize, i.e., the features of different nodes gradually converge to a fixed vector, which causes the performance of downstream classification tasks to drop sharply.
The embodiments of the present application provide a model determination method, apparatus, device, and computer-readable storage medium, which can increase the degree to which the output information obtained by processing different input information with the hypergraph processing network model differs, thereby avoiding the "over-smoothing" problem. An exemplary application of the computer device provided in the embodiments of the present application is described below. The computer device may be implemented as various types of user terminals, such as a notebook computer, a tablet computer, a desktop computer, a set-top box, or a mobile device (e.g., a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, or a portable game device), and may also be implemented as a server. In the following, an exemplary application is described for the case where the device is implemented as a server.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a hypergraph processing system 100 provided in an embodiment of the present application. As shown in fig. 1, the hypergraph processing system 100 includes a terminal 200, a network 300, and a server 400; the terminal 200 is connected to the server 400 through the network 300, which may be a wide area network, a local area network, or a combination of the two.
In the embodiments of the present application, a hypergraph formed from citation data processed by the hypergraph processing system is described as an example. A user sends paper data to be classified from the terminal 200 to the server 400 through the network 300. The server 400 acquires multiple pieces of paper data, constructs a hypergraph based on them, determines the random-walk state transition matrix based on the topological structure information of the hypergraph, determines the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix, and determines the symmetric state transition matrix corresponding to the random-walk state transition matrix; it then determines the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel; finally, it acquires a preset activation function and generates the hypergraph processing network model using the activation function and the hypergraph signal convolution operator. A discount factor is introduced when determining the discounted Markov kernel, so the hypergraph signal convolution operator determined from the symmetric state transition matrix and the discounted Markov kernel weakens the influence of far-away orders, avoiding the over-smoothing phenomenon and increasing the difference between the representation vectors output by the generated hypergraph processing network model for different input signals. After the hypergraph processing network model is generated, the vertices in the hypergraph that carry label information are used as training data to train the model, yielding a trained hypergraph processing model; the trained hypergraph processing model and a trained classification model are then used to classify the vertices without label information, yielding a classification result for each vertex. Because the trained hypergraph processing model increases the difference between the representation vectors corresponding to different input signals, the accuracy of the classification results can be improved.
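As a concrete illustration of this scenario, the following minimal sketch, which is not part of the original filing, shows how such a hypergraph incidence structure could be assembled from paper data; the toy papers, the choice of authors as hyperedges, and the uniform prior weights are assumptions made purely for illustration (Python with numpy):
    import numpy as np
    # Hypothetical toy data: vertices are papers; each author defines one
    # hyperedge containing every paper that author (co-)wrote.
    papers = ["paper0", "paper1", "paper2", "paper3"]
    authors = {
        "author_a": [0, 1],     # wrote papers 0 and 1
        "author_b": [1, 2, 3],  # wrote papers 1, 2 and 3
        "author_c": [0, 3],     # wrote papers 0 and 3
    }
    N, M = len(papers), len(authors)  # number of vertices and hyperedges
    H = np.zeros((N, M))              # incidence matrix: H[v, e] = 1 iff vertex v is in hyperedge e
    for e, members in enumerate(authors.values()):
        H[members, e] = 1.0
    omega = np.ones(M)                # prior hyperedge weights (uniform placeholder)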
In some embodiments, the server 400 may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal 200 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, an intelligent voice interaction device, an intelligent watch, a vehicle-mounted intelligent terminal, an aircraft, and the like, but is not limited thereto. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a server 400 according to an embodiment of the present application. The server 400 shown in fig. 2 includes: at least one processor 410, at least one network interface 420, a bus system 430, and a memory 440. The various components in the server 400 are coupled together by the bus system 430, which enables communication among these components. In addition to a data bus, the bus system 430 includes a power bus, a control bus, and a status signal bus; for clarity of illustration, however, the various buses are all designated as the bus system 430 in fig. 2.
The processor 410 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components; the general-purpose processor may be a microprocessor or any conventional processor.
The memory 440 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 440 optionally includes one or more storage devices physically located remote from processor 410.
The memory 440 may be volatile or nonvolatile, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), and the volatile memory may be a random access memory (RAM). The memory 440 described in the embodiments herein is intended to include any suitable type of memory.
In some embodiments, memory 440 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 441 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
a network communication module 442, for reaching other computing devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), and the like;
In some embodiments, the apparatus provided by the embodiments of the present application may be implemented in software; fig. 2 shows a model determining apparatus 443 stored in the memory 440, which may be software in the form of programs, plug-ins, and the like, and includes the following software modules: a first determining module 4431, a second determining module 4432, a third determining module 4433, a fourth determining module 4434, and a model determining module 4435. These modules are logical, so they may be combined arbitrarily or further divided according to the functions implemented. The functions of the respective modules are explained below.
In other embodiments, the apparatus provided in the embodiments of the present application may be implemented in hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor programmed to execute the model determination method provided in the embodiments of the present application; for example, such a processor may be implemented by one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
The model determination method provided by the embodiment of the present application will be described in conjunction with exemplary applications and implementations of the server provided by the embodiment of the present application.
The embodiments of the present application provide a model determination method applied to a computer device, which may be a server or a terminal. In the embodiments of the present application, the description takes the case where the computer device is a server as an example. Fig. 3 is a schematic flowchart of an implementation of the model determination method according to an embodiment of the present application; each step is described below with reference to fig. 3.
Step S101, a hypergraph which is constructed in advance based on data to be processed is obtained, and a random walk state transition matrix corresponding to the hypergraph is determined.
In the embodiments of the present application, the data to be processed may be data to be classified, for example paper data, image data, or video data to be classified. The data to be processed may be uploaded by the terminal or acquired from another server. After the data to be processed is obtained, the hypergraph is constructed based on the association relationships among the data. Taking paper data as an example, the hypergraph can be constructed using the citation relationships or co-authorship relationships between papers, and the prior weights can be determined according to the importance of the papers' authors when the hypergraph is constructed. After the hypergraph is constructed, the out-degree and in-degree of each hyperedge and the degree of each vertex can be obtained, and the state transition probabilities of random walks between vertices are then determined based on the hyperedge out-degrees, hyperedge in-degrees, and vertex degrees, yielding the random-walk state transition matrix corresponding to the hypergraph.
Step S102, determining the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix.
When this is implemented, a discount factor, a real number between 0 and 1, is first introduced. Based on the random-walk state transition matrix, the discounted average visit-probability vector from a vertex u to every vertex within t steps of random walk is determined; the discounted Markov diffusion distance between vertex u and vertex v is determined based on the discounted average visit-probability vectors corresponding to vertex u and vertex v; and the discounted Markov kernel is determined based on the discounted Markov diffusion distance. The feature mapping between the input space and the output space can be obtained from the Markov kernel, and an input signal can be mapped to the output space through this feature mapping in subsequent steps.
Step S103, determining the symmetric state transition matrix corresponding to the random-walk state transition matrix.
When the symmetric state transition matrix corresponding to the random-walk state transition matrix is determined, the weighted adjacency matrix corresponding to the hypergraph is first determined based on the random-walk state transition matrix and the vertex degree matrix of the hypergraph, and symmetric normalization is then applied to the weighted adjacency matrix to obtain the symmetric state transition matrix.
Step S104, determining the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel.
When this step is implemented, for a single input channel and the corresponding single output channel: first, perform eigendecomposition on the symmetric state transition matrix to obtain the eigenvector matrix; then acquire the vertex feature vector of the input vertices and the filter corresponding to the input channel and output channel, and apply the Fourier transform to the vertex feature vector using the eigenvector matrix to obtain the first transform result. In the embodiments of the present application, because the filter parameter is an unknown learnable parameter, the result of applying the Fourier transform to the filter cannot be determined directly; a second transform result, i.e., the result of applying the Fourier transform to the filter, is therefore determined based on the discount factor and the eigenvalues of the eigendecomposition. The convolution result of the vertex feature vector passing through the filter is determined using the eigenvector matrix, the first transform result, and the second transform result; at this point, the hypergraph signal convolution operator between a single input channel and a single output channel is obtained.
To determine the hypergraph signal convolution operator between multiple input channels and multiple output channels in the embodiments of the present application, the convolution kernel formed by the filters between the multiple input channels and the multiple output channels may be acquired, the feature mapping between the input space and the output space determined based on the discounted Markov kernel, and finally the hypergraph signal convolution operator between the output information and the input information determined based on the symmetric state transition matrix, the feature mapping, and the convolution kernel.
And step S105, acquiring a preset activation function, and determining a hypergraph processing network model by using the activation function and a hypergraph signal convolution operator.
The activation function may be a softmax function, a ReLU function, or the like. When this step is implemented, in order to retain the original information of the vertex input features, the idea of residual connection can be adopted: the vertex input features of the l-th layer are fused with the output features obtained by convolution at the l-th layer, and the activation function is then applied, yielding the hypergraph processing network model.
In the model determination method provided in the embodiments of the present application, after the hypergraph constructed in advance based on the data to be processed is acquired, the random-walk state transition matrix corresponding to the hypergraph is determined; the discounted Markov kernel corresponding to the hypergraph is determined based on the random-walk state transition matrix, and the symmetric state transition matrix corresponding to the random-walk state transition matrix is determined; the hypergraph signal convolution operator is then determined based on the symmetric state transition matrix and the discounted Markov kernel; finally, a preset activation function is acquired, and the hypergraph processing network model is generated using the activation function and the hypergraph signal convolution operator. Because a discount factor is introduced when determining the discounted Markov kernel, the hypergraph signal convolution operator determined on the basis of the symmetric state transition matrix and the discounted Markov kernel can weaken the influence of far-away orders, so the over-smoothing phenomenon is avoided, the difference between the representation vectors output by the generated hypergraph processing network model for different input signals is increased, and the accuracy of the results obtained by executing subsequent downstream tasks with those representation vectors can be guaranteed.
In some embodiments, the step S101 of determining the random walk state transition matrix corresponding to the hypergraph may be implemented by:
step S1011, based on the hypergraph, the prior weight of each hyperedge, the out-degree of each hyperedge, the in-degree of each hyperedge, the first hyperedge dependent vertex weight in the process that each vertex enters the corresponding hyperedge, and the second hyperedge dependent vertex weight in the process that each vertex flows out of the corresponding hyperedge are obtained.
In the embodiment of the application, in the process of constructing the hypergraph, the topological structure of the hypergraph is determined based on the incidence relation between the vertexes, the prior weight of each hyperedge is predetermined, the weight of a first hyperedge dependent vertex in the process that each vertex enters the corresponding hyperedge and the weight of a second hyperedge dependent vertex in the process that each vertex flows out of the corresponding hyperedge are determined in advance. After the topological structure of the hypergraph is determined, the out-degree of each hyperedge can be determined based on the topological structure of the hypergraph and the weight of a first hyperedge dependent vertex in the process that each vertex enters the corresponding hyperedge, the in-degree of each hyperedge is determined based on the topological structure of the hypergraph and the weight of a second hyperedge dependent vertex in the process that each vertex flows out of the corresponding hyperedge, and the degree of the vertex is determined based on the topological structure of the hypergraph.
The out-degree δ_out(e) of a hyperedge e can be calculated by formula (1-1):
δ_out(e) = Σ_{v∈e} Q_out(v, e)   (1-1)
where Q_out(v, e) denotes the edge-dependent vertex weight (EDVW) of vertex v used when flowing out of hyperedge e; that is, by formula (1-1), the out-degree of hyperedge e is the sum of the edge-dependent vertex weights of all vertices flowing out of hyperedge e.
The in-degree δ_in(e) of a hyperedge e can be calculated by formula (1-2):
δ_in(e) = Σ_{v∈e} Q_in(v, e)   (1-2)
where Q_in(v, e) denotes the edge-dependent vertex weight (EDVW) of vertex v used when entering hyperedge e; that is, the in-degree of hyperedge e is the sum of the edge-dependent vertex weights of all vertices flowing into hyperedge e.
Step S1012, acquiring a preset influence function, and determining the degree of each vertex according to the influence function, the out-degree of each hyperedge, and the in-degree of each hyperedge.
Here the influence function characterizes the type of influence that the out-degree or in-degree of a hyperedge exerts on the degree of a vertex. For example, the influence function may be a linear function, a reciprocal function, or the like; it represents the different forms of influence that a hyperedge degree (out-degree or in-degree) has on a vertex degree. For instance, ρ(x) = x means the out-degree of a hyperedge is positively (linearly) correlated with the degree of the vertex. The appropriate influence differs between hypergraphs constructed from different data, so the influence function can be regarded as a "hyperparameter".
This step, when implemented, may determine the degree d(u) of vertex u by formula (1-3):
d(u) = Σ_{e: u∈e} ω(e) Q_in(u, e) ρ(δ_out(e))   (1-3)
where ρ(δ_out(e)) expresses the type of influence that the out-degree of hyperedge e exerts on the degree of the vertex, and ρ(δ_in(e)) expresses the type of influence that the in-degree of hyperedge e exerts on the degree of the vertex.
Step S1013, determining, based on the prior weight of each hyperedge, the out-degree of each hyperedge, the in-degree of each hyperedge, the degree of each vertex, the first edge-dependent vertex weight of each vertex, and the second edge-dependent vertex weight of each vertex, the probability that each vertex walks randomly to every other vertex.
In implementation, the probability of vertex u walking randomly to vertex v may be determined by formula (1-4):
p(u, v) = Σ_{e: u∈e} [ω(e) Q_in(u, e) ρ(δ_out(e)) / d(u)] · [Q_out(v, e) / δ_out(e)]   (1-4)
where ω(e) is the prior weight of hyperedge e.
Step S1014, determining the random-walk state transition matrix based on the probability of each vertex walking randomly to every other vertex.
After the probability of each vertex walking randomly to every other vertex has been determined, the random-walk state transition matrix P can be constructed: the element in row u and column v of P is the probability of walking from vertex u to vertex v by random walk.
Through the above steps S1011 to S1014, the random-walk state transition matrix between vertices can be determined from the constructed hypergraph, providing the necessary data basis for the subsequent calculation of the discounted Markov kernel.
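To make steps S1011 to S1014 concrete, the following numpy sketch computes the random-walk state transition matrix following formulas (1-1) to (1-4) as reconstructed above; because the original formula images are unavailable, the exact forms of (1-3) and (1-4) used here are assumptions, the function and parameter names are illustrative, and every hyperedge and vertex is assumed to carry positive weight so no division by zero occurs:
    import numpy as np
    def random_walk_transition(H, Q_in, Q_out, omega, rho=lambda x: x):
        # H:     (N, M) incidence matrix, H[v, e] = 1 iff vertex v is in hyperedge e
        # Q_in:  (N, M) edge-dependent vertex weights used when entering a hyperedge
        # Q_out: (N, M) edge-dependent vertex weights used when flowing out of a hyperedge
        # omega: (M,) prior hyperedge weights; rho: influence function
        delta_out = (H * Q_out).sum(axis=0)  # (1-1): out-degree of each hyperedge
        delta_in = (H * Q_in).sum(axis=0)    # (1-2): in-degree of each hyperedge (shown for completeness)
        # (1-3), as reconstructed: d(u) = sum_e omega(e) Q_in(u, e) rho(delta_out(e))
        d = (H * Q_in * omega * rho(delta_out)).sum(axis=1)
        # (1-4): p(u, v) = sum_e [omega(e) Q_in(u, e) rho(delta_out(e)) / d(u)] * [Q_out(v, e) / delta_out(e)]
        vertex_to_edge = (H * Q_in * omega * rho(delta_out)) / d[:, None]
        edge_to_vertex = (H * Q_out) / delta_out
        P = vertex_to_edge @ edge_to_vertex.T  # (N, N); each row sums to 1
        return P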
In some embodiments, step S102 of determining the discounted Markov kernel corresponding to the hypergraph based on the random-walk state transition matrix may be implemented through steps S1021 to S1023 shown in fig. 4A; each step is described below with reference to fig. 4A.
Step S1021, acquiring the discounted visit-rate vector corresponding to each vertex based on the random-walk state transition matrix.
In practical applications, when this step is implemented, a preset discount factor is acquired first. The discount factor may be a real number between 0 and 1, for example 0.5 or 0.8; the closer its value is to 0, the stronger the discounting, and the closer its value is to 1, the weaker the discounting. Then, based on the random-walk state transition matrix and the discount factor, the discounted average probability of visiting vertex v from vertex u within t steps of random walk is determined, where u = 0, 1, 2, …, (N-1); v = 0, 1, 2, …, (N-1); t = 0, 1, 2, …, (N-1); and N is the total number of vertices in the hypergraph. Finally, the vector of discounted average visit probabilities from vertex u to every vertex within t steps of random walk is determined as the discounted visit-rate vector corresponding to vertex u.
In the actual implementation process, starting from vertex u, the discounted average visit rate of reaching vertex v within t steps of random walk can be determined by formula (1-5):
p̄_β(v|u, t) = (1/t) Σ_{τ=1}^{t} β^τ Pr(s(τ) = v | s(0) = u)   (1-5)
where β is the discount factor and Pr(s(τ) = v | s(0) = u) denotes the probability that a random walk starting from vertex u reaches vertex v in exactly τ steps; its value equals the element in row u and column v of the τ-th power of the state transition matrix P. The discounted visit-rate vector corresponding to vertex u collects these quantities over all destination vertices:
p̄_β(u, t) = [p̄_β(0|u, t), p̄_β(1|u, t), …, p̄_β(N-1|u, t)]^T
In addition, a walk of 0 steps can only remain at vertex u itself, so the 0-step visit-rate vector is simply the unit column vector e_u, the column vector whose u-th entry is 1 and whose other entries are 0. Since p̄_β(u, t)^T is row u of the matrix (1/t) Σ_{τ=1}^{t} β^τ P^τ, the discounted visit-rate vector can be expressed as formula (1-6):
p̄_β(u, t) = Z(t)^T e_u, with Z(t) = (1/t) Σ_{τ=1}^{t} β^τ P^τ   (1-6)
Step S1022, determining the discounted Markov diffusion distance between vertex u and vertex v based on the discounted visit-rate vector corresponding to vertex u and the discounted visit-rate vector corresponding to vertex v.
In practical applications, the discounted Markov diffusion distance between vertex u and vertex v may be determined using formula (1-7):
d²(u, v; t) = || p̄_β(u, t) - p̄_β(v, t) ||²   (1-7)
The discounted Markov diffusion distance between vertex u and vertex v characterizes how similarly vertex u and vertex v diffuse to the other vertices.
Step S1023, determining the discounted Markov kernel corresponding to the hypergraph based on the discounted Markov diffusion distance.
When this step is implemented, substituting formula (1-6) into formula (1-7) yields formula (1-8):
d²(u, v; t) = [Z(t)Z^T(t)]_{uu} + [Z(t)Z^T(t)]_{vv} - 2[Z(t)Z^T(t)]_{uv}   (1-8)
In the embodiments of the present application, the matrix Z(t)Z^T(t) in formula (1-8) is determined as the discounted Markov kernel, denoted K_MD.
In some embodiments, step S103 of determining the symmetric state transition matrix corresponding to the random-walk state transition matrix may be implemented through steps S1031 to S1033 shown in fig. 4B; the steps are described below with reference to fig. 4B.
Step S1031, determining the vertex degree matrix based on the hypergraph.
Since the degree of each vertex has been determined in step S1012, the vertex degree matrix D can be determined on that basis. The vertex degree matrix is a diagonal matrix whose diagonal elements are the vertex degrees determined according to formula (1-3).
Step S1032, determining the weighted adjacency matrix corresponding to the hypergraph based on the random-walk state transition matrix and the vertex degree matrix.
Because one hyperedge in the hypergraph connects multiple vertices, the adjacency matrix between vertices cannot be obtained directly from the topological structure of the hypergraph; instead, the weighted adjacency matrix corresponding to the hypergraph can be calculated by formula (1-9):
A = D P   (1-9)
where P is the random-walk state transition matrix and D is the vertex degree matrix.
Step S1033, applying symmetric normalization to the weighted adjacency matrix to obtain the symmetric state transition matrix.
When this step is implemented, the weighted adjacency matrix can be symmetrically normalized according to formula (1-10) to obtain the symmetric state transition matrix Psym:
Psym = D^{-1/2} A D^{-1/2}   (1-10)
The symmetric state transition matrix obtained by the above formulas and the symmetric hypergraph Laplacian matrix Lsym satisfy the relationship shown in formula (1-11):
Lsym = I - Psym   (1-11)
The symmetric hypergraph Laplacian matrix Lsym has good spectral properties: its eigenvalues lie in the range [0, 2], so the eigenvalues of the symmetric state transition matrix lie in the range [-1, 1]. This provides a theoretical guarantee of the stability of the hypergraph spectral convolutional neural network designed subsequently.
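A corresponding sketch of steps S1031 to S1033, under the same assumptions (in particular that the weighted adjacency matrix of formula (1-9) is A = DP, so that Psym = D^{1/2} P D^{-1/2}):
    import numpy as np
    def symmetric_transition(P, d):
        # P: (N, N) random-walk state transition matrix; d: (N,) vertex degrees
        d_sqrt = np.sqrt(d)
        P_sym = (d_sqrt[:, None] * P) / d_sqrt[None, :]  # D^{1/2} P D^{-1/2}, formula (1-10)
        L_sym = np.eye(P.shape[0]) - P_sym               # formula (1-11)
        return P_sym, L_sym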
In some embodiments, step S104 of determining the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel can be implemented through the following steps:
Step S1041, acquiring the convolution kernel formed by the filters between the input channels and the output channels.
In the embodiments of the present application, assuming there are C_in input channels and C_out output channels, the filters between the input channels and the output channels form a convolution kernel of dimension C_in × C_out.
Step S1042, determining the feature mapping between the input space and the output space based on the discounted Markov kernel.
According to formula (1-8) above, the discounted Markov kernel is Z(t)Z^T(t); in the embodiments of the present application, Z(t) in the discounted Markov kernel is determined as the feature mapping between the input space and the output space.
Step S1043, determining the hypergraph signal convolution operator between the output information and the input information based on the symmetric state transition matrix, the feature mapping, and the convolution kernel.
For a single input channel and a corresponding single output channel, the filter parameters between them are determined as follows. First, perform eigendecomposition on the symmetric state transition matrix, Psym = U Λ U^T, to obtain the eigenvector matrix U. Then acquire the vertex feature vector x of the input vertices and the filter corresponding to the input/output channel pair, and apply the Fourier transform to the vertex feature vector using the eigenvector matrix to obtain the first transform result U^T x. In the embodiments of the present application, because the filter parameter is an unknown learnable parameter, the result of applying the Fourier transform to the filter cannot be determined directly; the second transform result, i.e., the Fourier transform of the filter, is therefore determined from the discount factor and the eigenvalues of the eigendecomposition. The convolution of the vertex feature vector with the filter is then determined from the eigenvector matrix, the first transform result, and the second transform result, giving formula (1-13):
g ⋆ x = U ((1/t) Σ_{τ=1}^{t} β^τ Λ^τ) U^T x θ_g   (1-13)
In formula (1-13) there is only one unknown parameter, θ_g, the learnable parameter of the filter between one pair of input and output channels.
Assuming there are C_in input channels and C_out output channels, the convolution kernel Θ formed by the filters between the input channels and the output channels has dimension C_in × C_out, and the hypergraph signal convolution operator between the output information Y and the input information X on the hypergraph follows from formula (1-13) as formula (1-14):
Y = (Σ_{τ=1}^{t} α_τ Psym^τ) X Θ, where α_τ = β^τ / t   (1-14)
The above steps S1041 to S1043 thus determine the hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel. From formula (1-14) it can be seen that, under the effect of the discount factor, the larger τ is, the smaller the value of α_τ becomes, and the smaller the influence of the corresponding term of the input information on the output information. The influence of far-away input information is thereby weakened, and through this hypergraph convolution operator the difference between the representation vectors output by the generated hypergraph processing network model for different input signals can be increased.
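The convolution operator of formula (1-14) then admits the following direct sketch; the truncation depth t, the discount factor beta, and the names are illustrative assumptions:
    import numpy as np
    def hypergraph_conv(X, P_sym, Theta, t=4, beta=0.8):
        # X: (N, C_in) input vertex features; Theta: (C_in, C_out) convolution kernel
        S = np.zeros_like(P_sym)
        P_tau = np.eye(P_sym.shape[0])
        for tau in range(1, t + 1):
            P_tau = P_tau @ P_sym
            S += (beta ** tau / t) * P_tau  # alpha_tau = beta^tau / t damps far-away orders
        return S @ X @ Theta                # formula (1-14)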
In some embodiments, the "determining a hypergraph processing network model using the activation function and the hypergraph signal convolution operator" in the above step S105 can be implemented by:
step S1051, obtaining residual connection hyper-parameters.
Wherein, the residual connection hyperparameter is a real number between 0 and 1.
Step S1052, determining the convolution result of the l-th layer using the hypergraph signal convolution operator and the input information of the l-th layer.
Here l = 0, 1, …, L-1, where L is the preset number of network layers. In the embodiments of the present application, denoting the input information of the l-th layer by X^(l) and substituting it into formula (1-14), the convolution result of the l-th layer is obtained as:
Y^(l) = (Σ_{τ=1}^{t} α_τ Psym^τ) X^(l) Θ^(l)
and S1053, carrying out fusion processing on the convolution result of the l layer and the input information of the l layer by using the residual connection hyper-parameter to obtain a fusion result.
When implemented, this step can be as follows
Figure BDA0003562398860000182
And carrying out fusion processing on the convolution result of the l layer and the input information of the l layer to obtain a fusion result. That is, the residual connection hyperparameter is used for adjusting the fusion degree of the convolution result and the input information of the l-th layer, wherein the closer the residual connection hyperparameter is to 0.5, the higher the fusion degree of the residual connection hyperparameter is, the closer to 0 or the closer to 1, the lower the fusion degree is.
Step S1054, applying the activation function to the fusion result to obtain the hypergraph processing network model.
Denoting the activation function by ψ(·), the output information of the l-th layer (i.e., the input information of the (l+1)-th layer) can be obtained by formula (1-15):
X^(l+1) = ψ(γ X^(l) + (1 - γ) Y^(l))   (1-15)
Formula (1-15) gives the correspondence between the input information of the l-th layer and the input information of the (l+1)-th layer (i.e., the output information of the l-th layer), and the hypergraph processing network model is thus obtained. In the process of generating the hypergraph processing network model, the Markov diffusion process is introduced, the discounted Markov kernel is proposed on that basis, and the hypergraph convolution operator is determined from the discounted Markov kernel, so the "over-smoothing" problem caused by aggregating "long-range" information on the hypergraph can be overcome. Moreover, fine-grained hypergraph topology information, namely the edge-dependent vertex weights, is used when calculating the random-walk state transition matrix corresponding to the hypergraph, and the symmetric state transition matrix determined from that matrix and used to compute the hypergraph convolution operator likewise uses this fine-grained information; this guarantees that the hypergraph processing model has good processing performance.
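Putting steps S1051 to S1054 together, one layer of the resulting network can be sketched as follows, reusing hypergraph_conv from the previous sketch; the choice of ReLU for ψ and the square Theta required by the residual addition are assumptions:
    import numpy as np
    def hypergraph_layer(X_l, P_sym, Theta_l, gamma=0.5, t=4, beta=0.8):
        # Theta_l must be square (C x C) so the residual addition is well defined.
        Y_l = hypergraph_conv(X_l, P_sym, Theta_l, t=t, beta=beta)  # convolution result, formula (1-14)
        fused = gamma * X_l + (1.0 - gamma) * Y_l                   # residual fusion
        return np.maximum(fused, 0.0)                               # formula (1-15) with psi = ReLU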
In some embodiments, after the hypergraph processing network model is generated through the above steps S101 to S105, the generated hypergraph processing network model may be trained through steps S106 to S109 shown in fig. 5; steps S106 to S109 are explained below with reference to fig. 5.
Step S106, training data is obtained from the hypergraph.
The training data includes vertex information of training vertices and label information of the training vertices; that is, the training vertices are vertices that have label information. The label information of a training vertex may be its classification information. For example, if the hypergraph is constructed from papers, each vertex in the hypergraph is a paper, and the label information of a training vertex may be the classification information of the paper, such as a classification number.
When this step is implemented, it can be determined whether each vertex in the hypergraph has classification information; a vertex with classification information is determined as a training vertex, with its classification information as the label information, and a vertex without classification information is determined as a test vertex.
Step S107, processing the vertex information of each training vertex by using the hypergraph processing network model to obtain the representation vector of each training vertex.
The vertex information of a training vertex may include multidimensional information, for example the information of the vertex itself and the attribute information of the vertex. In the case where the vertex is a paper, the vertex information may include the title and abstract of the paper, and may also include the text, the authors, and other information of the paper. When this step is implemented, the title, abstract, text and authors of the paper can be used as input information of different input channels of the hypergraph processing network model; the hypergraph processing network model processes the input information of each input channel to obtain the corresponding output information, that is, the representation vector of each training vertex, which can be understood as a feature vector of the training vertex in the output space.
Step S108, acquiring the trained classification model, and performing prediction processing on the representation vector of each training vertex by using the trained classification model to obtain prediction information corresponding to each training vertex.
The trained classification model can be a neural network model or another classification model, for example a decision tree model, a support vector machine model, or a Bayesian classification model. The prediction information obtained by performing prediction processing on the representation vectors of the training vertices with the trained classification model may be the predicted classification of the training vertices.
Step S109, training the hypergraph processing network model by using the prediction information of each training vertex and the label information of each training vertex to obtain the trained hypergraph processing network model.
When this step is implemented, the prediction information and the label information are back-propagated into the hypergraph processing network model, and the parameters of the hypergraph processing network model are adjusted by using a preset loss function until a training end condition is reached, thereby obtaining the trained hypergraph processing network model.
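A generic training-loop sketch for steps S107 to S109 follows. The names model and classifier, the cross-entropy loss, and the Adam optimizer are illustrative assumptions; the embodiment only requires back-propagating the prediction and label information with a preset loss function until a training end condition is reached:

```python
import torch
import torch.nn.functional as F

def train(model, classifier, X_train, y_train, epochs=100, lr=1e-2):
    # "model" stands for the hypergraph processing network model (step S107);
    # "classifier" stands for the trained classification model (step S108)
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # only the hypergraph model is updated
    for _ in range(epochs):
        opt.zero_grad()
        reps = model(X_train)              # representation vectors of the training vertices
        logits = classifier(reps)          # prediction information for each training vertex
        loss = F.cross_entropy(logits, y_train)
        loss.backward()                    # back-propagate prediction/label error
        opt.step()                         # adjust the model parameters
    return model
```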
After the trained hypergraph processing network model is obtained in the above steps S106 to S109, the vertices in the hypergraph that do not have label information can be predicted by the trained hypergraph processing network model through steps S110 to S113 shown in fig. 5; these steps are described below with reference to fig. 5.
Step S110, vertex information of the test vertex is obtained from the hypergraph.
The test vertex does not have classification information, so the test vertex is the vertex to be classified.
Step S111, processing the vertex information of each test vertex by using the trained hypergraph processing network model to obtain the representation vector of each test vertex.
Step S112, performing prediction processing on the representation vector of each test vertex by using the trained classification model to obtain the prediction information corresponding to each test vertex.
Step S113, determining the prediction information corresponding to each test vertex as the classification information of each test vertex.
Through the above steps S110 to S113, the trained hypergraph processing model can process the test vertices in the hypergraph that lack classification information and extract their representation vectors. Because the trained hypergraph processing model produces representation vectors with large differences for different input signals and is not prone to the over-smoothing phenomenon, the accuracy of the classification information obtained when the trained classification model classifies the representation vectors of different input signals can be improved.
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
The model determination method provided by the embodiment of the application can be applied to scenes such as a citation network or 3D object classification, and the two scenes are explained below.
First, citation network:
In academic literature, mutual citations or shared authors between papers constitute certain relationships between papers. In a Citation Network, vertices are defined as papers, and edges are formed by citation relationships or co-authorship between papers. However, it is difficult to express some higher-order implicit interaction relationships directly through the pairwise citation relationship between two papers, so constructing hyperedges from prior knowledge has been shown to yield stronger representation ability. For example, in a paper database: (a) co-authorship (co-authored): all papers (vertices) signed by the same author constitute a hyperedge (as shown in fig. 6, hyperedge 1 of author A contains four vertices, indicating that the paper database includes 4 papers published by author A); (b) co-citation: all papers (vertices) that cite the same paper constitute a hyperedge. However, the same vertex (paper) has different importance to its different authors (hyperedges); for example, a paper may carry relatively large weight for its first author (hyperedge) but relatively small weight for its other authors (hyperedges). This is exactly EDVW information: the weight of a vertex in the hypergraph is represented differently in different hyperedges.
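The co-authorship construction with EDVW described above can be sketched as follows; the concrete weights (1.0 for the first author, 0.5 for the others) are illustrative assumptions standing in for the different importance of a paper to its different authors:

```python
def coauthor_hyperedges(papers):
    """papers: dict mapping paper id -> ordered author list (first author first)."""
    edges = {}                                   # author -> {paper id: EDVW}
    for pid, authors in papers.items():
        for rank, author in enumerate(authors):
            weight = 1.0 if rank == 0 else 0.5   # a paper weighs more for its first author
            edges.setdefault(author, {})[pid] = weight
    return edges

papers = {"p1": ["A", "B"], "p2": ["B", "A"], "p3": ["A"]}
print(coauthor_hyperedges(papers))
# {'A': {'p1': 1.0, 'p2': 0.5, 'p3': 1.0}, 'B': {'p1': 0.5, 'p2': 1.0}}
```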
Second, 3D-visual object classification:
In 3D-Visual Object Classification, each 3D object is a vertex of the hypergraph to be constructed. The 3D objects are photographed from multiple perspectives, and vertex features are extracted using different image processing techniques. To capture higher-order similarity information between objects, a hypergraph is constructed to represent them. A common technique for constructing hyperedges is: define a distance on the vertex feature space, take any object (vertex) j as a center point, select the K nearest vertices using a K-nearest-neighbor algorithm, and construct a hyperedge with hyperedge degree (the number of vertices in the hyperedge) K, taking the feature distances between j and its K neighbors as vertex weights. It can be seen that such a hypergraph construction method naturally introduces EDVW information into the constructed hypergraph: the Euclidean distances between the vertex feature of an object i and the vertex features of other objects generally differ at a fine-grained scale, so the weights of i differ across the hyperedges constructed with other objects as centers.
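A sketch of this K-nearest-neighbor hyperedge construction is given below. The exp(−distance) weighting and the inclusion of the center j itself in the hyperedge are illustrative assumptions; the text only states that the feature distances between j and its K neighbors serve as vertex weights:

```python
import numpy as np

def knn_hyperedges(features, K=2):
    # pairwise Euclidean distances in the vertex feature space
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    edges = []
    for j in range(len(features)):
        members = np.argsort(dists[j])[: K + 1]   # the center j plus its K nearest neighbours
        weights = np.exp(-dists[j, members])      # closer objects receive larger EDVW
        edges.append(dict(zip(members.tolist(), weights.tolist())))
    return edges

feats = np.random.default_rng(0).random((6, 16))  # 6 objects, 16-dim view features
hyperedges = knn_hyperedges(feats, K=2)           # one hyperedge per center object
```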
In the embodiment of the application, based on the view of the generalized hypergraph random walk, a "discounted Markov diffusion kernel" and its feature map are defined, and a new hypergraph spectral convolutional neural network is then derived: the Simple Hypergraph Spectral Convolution network (SHSC). In the embodiment of the present application, a hypergraph is denoted as H, its vertex set as V, and its hyperedge set as ε. Let M be a Markov process and P the state transition probability matrix of M. Fig. 7 is a schematic diagram of the generation process of the hypergraph spectral convolutional neural network according to an embodiment of the present application; as shown in fig. 7, the generation process includes:
step S701, creating an EDVW hypergraph.
Step S702, determining the EDVW hypergraph random walk state transition probability matrix.
In the embodiment of the present application, a new way for calculating the wandering probability (from vertex u to vertex v) defined by "unified hypergraph random wandering" is proposed, which can be represented by formula (2-1):
Figure BDA0003562398860000221
where ω (e) is the prior weight of the super-edge; qin(u, e) represents the "extreme Edge Dependent Vertex Weight (EDVW), Q of vertex u during entry of vertex u into extreme edge eout(v, e) represents the "super-Edge Dependent Vertex Weights (EDVW), Q of vertex v in the process of flowing out of super-edge e from vertex vin(v, e) represents the "super-Edge Dependent Vertex Weight (EDVW), Q for vertex v as it enters super-edge e from vertex vout(u, e) indicates that vertex u "overedge dependent vertex weight (EDVW)" and δ is in the process of flowing out overedge e from vertex vout(e) Represents the out-degree, delta, of the excess edge ein(e) The method is characterized in that the degree of the super edge e is expressed, d (u) expresses the degree of the vertex u, and rho (·) is an arbitrary function and has the function of reflecting the influence of the super edge degree delta (e) on different forms of the vertex out degree d (u), for example, if rho (x) ═ x, the influence shows positive correlation, and the influence is different on a hypergraph constructed by different data and can be regarded as a 'super parameter'.
Here, the out-degree of hyperedge e can be expressed by formula (2-2):

δ_out(e) = Σ_{v∈e} Q_out(v, e)   (2-2)

the in-degree of hyperedge e can be expressed by formula (2-3):

δ_in(e) = Σ_{u∈e} Q_in(u, e)   (2-3)

and the degree of vertex u can be expressed by formula (2-4):

d(u) = Σ_{e∋u} ω(e) · Q_in(u, e) · ρ(δ_out(e))   (2-4)
Based on the random walk probability from vertex u to vertex v shown in formula (2-1), the probability transition matrix corresponding to the hypergraph can be obtained, as shown in formula (2-5):

P = D_v^{−1} · Q_in · W · ρ(D_{e,out}) · D_{e,out}^{−1} · Q_out^T   (2-5)

where Q_in and Q_out are the matrices of hyperedge-dependent vertex weights, W is the diagonal matrix of the prior hyperedge weights ω(e), and D_{e,out}, D_{e,in} and D_v are diagonal matrices whose diagonal elements are, respectively, the hyperedge out-degrees, hyperedge in-degrees and vertex degrees defined above.
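The following sketch assembles formulas (2-1) to (2-5) for a toy EDVW hypergraph. The factorization of p(u, v) into an edge-selection step and a vertex-selection step follows the formulas as reconstructed above and should be read as one consistent instantiation rather than the only possible one:

```python
import numpy as np

def transition_matrix(Q_in, Q_out, omega, rho=lambda x: x):
    # delta_out(e): out-degree of each hyperedge, formula (2-2)
    delta_out = Q_out.sum(axis=0)
    # weight with which vertex u picks hyperedge e: w(e) * Q_in(u,e) * rho(delta_out(e))
    edge_pick = Q_in * omega[None, :] * rho(delta_out)[None, :]
    d = edge_pick.sum(axis=1)                        # vertex degrees, formula (2-4)
    # formula (2-1): pick an incident hyperedge, then a vertex inside it
    P = (edge_pick / d[:, None]) @ (Q_out / delta_out[None, :]).T
    return P, d

rng = np.random.default_rng(0)
Q = rng.random((4, 2))                               # toy EDVW matrix: 4 vertices, 2 hyperedges
P, d = transition_matrix(Q, Q, omega=np.ones(2))
assert np.allclose(P.sum(axis=1), 1.0)               # each row of P is a distribution
```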
Step S703, determining a symmetric state transition matrix based on the hypergraph random walk state transition probability matrix.
The probability transition matrix shown in formula (2-5) corresponds to a weighted adjacency matrix of the vertices on the hypergraph, as shown in formula (2-6):

A = D_v · P   (2-6)

Applying the symmetric normalization technique (a symmetric row-regularization trick) to formula (2-6) yields the "symmetric state transition matrix", as shown in formula (2-7):

P_sym = D_v^{−1/2} · A · D_v^{−1/2}   (2-7)
It is emphasized here that the reason for obtaining P_sym with the above symmetrization technique is that P_sym induces the symmetric hypergraph Laplacian matrix, as shown in formula (2-8):

L_sym = I − P_sym   (2-8)

The symmetric hypergraph Laplacian matrix has good spectral properties: the eigenvalues of L_sym lie in the range [0, 2], which provides a theoretical guarantee for the stability of the hypergraph spectral convolutional neural network designed later.
Step S704, the symmetric state transition matrix is subjected to spectrum decomposition to obtain a feature matrix.
In the embodiment of the present application, let the feature matrix be U = (u_1, u_2, …, u_n), where u_i is an eigenvector of L_sym (and can also be regarded as an eigenvector of P_sym).
Step S705, performing the hypergraph Fourier transform and its inverse transform on the signal x by using the feature matrix.
In the embodiment of the present application, the Fourier transform defined on the hypergraph is as shown in formula (2-9), and the inverse transform is as shown in formula (2-10):

x̂ = U^T · x   (2-9)

x = U · x̂   (2-10)

where x ∈ R^n is a signal on the vertices of the hypergraph, and x̂ is the transformed spectral-domain signal.
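Steps S704 and S705 amount to an eigendecomposition followed by projections, as the following self-contained sketch shows (the Laplacian here is a stand-in symmetric matrix built only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((4, 4))
M = (M + M.T) / 2                             # a stand-in symmetric matrix
L_sym = np.eye(4) - M / np.abs(np.linalg.eigvalsh(M)).max()

eigvals, U = np.linalg.eigh(L_sym)            # feature (eigenvector) matrix U, step S704
x = rng.random(4)                             # a signal on the hypergraph vertices
x_hat = U.T @ x                               # hypergraph Fourier transform, formula (2-9)
assert np.allclose(U @ x_hat, x)              # inverse transform, formula (2-10)
```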
Step S706, determining the discounted average visit rate.
In the embodiment of the present application, the "discounted average visit rate" is defined by the following formula (2-11):

φ_t(u, v) = (1/(t+1)) · Σ_{τ=0}^{t} α^τ · Pr(s(τ) = v | s(0) = u)   (2-11)

where Pr(s(τ) = v | s(0) = u) denotes the probability that a random walk starting from vertex u reaches vertex v in τ steps; its value equals the element in row u and column v of the τ-th power of the state transition matrix P. Overall, formula (2-11) represents the "discounted average visit probability" of a random walk on the hypergraph starting from vertex u and reaching vertex v within t steps. The discount factor α is introduced in formula (2-11) to alleviate the over-smoothing problem caused by aggregating "long-range" information on the hypergraph.
Thus, for an arbitrary vertex u, the discounted average visit rates of reaching every vertex on the hypergraph can be collected into a vector. In particular, in the embodiment of the present application, Pr(s(0) = v | s(0) = u) is defined to be 1 if u = v and 0 otherwise, that is, P^0 = I. The vector for vertex u can then be written as

φ_t(u)^T = e_u^T · (1/(t+1)) · Σ_{τ=0}^{t} α^τ · P^τ

where e_u is a unit column vector. Since φ_t(u)^T is exactly the u-th row of the matrix (1/(t+1)) · Σ_{τ=0}^{t} α^τ · P^τ, the collection of all these vectors can be expressed as formula (2-12):

Φ(t) = (1/(t+1)) · Σ_{τ=0}^{t} α^τ · P^τ   (2-12)
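The discounted average visit matrix of formula (2-12) can be computed by accumulating powers of P, as in the following sketch; the 1/(t+1) averaging follows the form given above:

```python
import numpy as np

def discounted_visit_matrix(P, t, alpha):
    # formula (2-12): Phi(t) = (1/(t+1)) * sum_{tau=0..t} alpha^tau * P^tau
    n = P.shape[0]
    Phi, P_tau = np.zeros((n, n)), np.eye(n)   # P^0 = I covers the u == v case
    for tau in range(t + 1):
        Phi += (alpha ** tau) * P_tau
        P_tau = P_tau @ P                      # advance to P^(tau+1)
    return Phi / (t + 1)

P = np.full((3, 3), 1.0 / 3)                   # toy state transition matrix
Phi = discounted_visit_matrix(P, t=4, alpha=0.5)
# row u of Phi is the discounted visit-rate vector phi_t(u)
```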
Step S707, determining the discounted Markov diffusion distance.
In the embodiment of the present application, the "discounted Markov diffusion distance" is defined as shown in formula (2-13):

D_t(u, v)^2 = Σ_{w∈V} (φ_t(u, w) − φ_t(v, w))^2   (2-13)

Formula (2-13) measures how similarly, within t steps, vertex u and vertex v diffuse to the other vertices: if the distance determined by formula (2-13) is small, vertex u and vertex v have similar influence on the other vertices during the diffusion process. This distance can describe how similarly information is aggregated and flows between vertices.
Step S708, determining the discounted Markov kernel and the feature map.
In this step, substituting formula (2-12) into formula (2-13) gives formula (2-14):

D_t(u, v)^2 = (e_u − e_v)^T · Φ(t) · Φ(t)^T · (e_u − e_v)   (2-14)

From the discounted Markov diffusion distance shown in formula (2-14), the "discounted Markov diffusion kernel" can be obtained, as shown in formula (2-15):

K_MD = Z(t) · Z(t)^T, where Z(t) = Φ(t)   (2-15)

Here, K_MD is the "discounted Markov diffusion kernel", and Z(t) is the feature map of this kernel space: any signal X on the hypergraph is mapped into the kernel space as Z(t) · X.
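The diffusion distance and kernel of formulas (2-13) to (2-15) then reduce to row comparisons and a Gram matrix of Φ(t), as sketched below (taking Z(t) = Φ(t) as above):

```python
import numpy as np

def diffusion_kernel(Phi):
    return Phi @ Phi.T                         # K_MD = Z(t) Z(t)^T, formula (2-15)

def diffusion_distance(Phi, u, v):
    diff = Phi[u] - Phi[v]                     # compare how u and v spread probability mass
    return float(np.sqrt(diff @ diff))         # formula (2-13)

Phi = np.array([[0.6, 0.2, 0.2],
                [0.2, 0.6, 0.2],
                [0.2, 0.2, 0.6]])              # a toy discounted visit matrix
K = diffusion_kernel(Phi)
print(diffusion_distance(Phi, 0, 1))           # small distance => similar diffusion behaviour
```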
Step S709, performing the hypergraph convolution operation by using the discounted Markov kernel feature-map polynomial, the filter g and the signal x.
In the embodiment of the present application, the convolution operation on the hypergraph is defined as shown in formula (2-16):

(g ∗ x)_H = U((U^T g) ⊙ (U^T x))   (2-16)

where g ∈ R^n is a filter from one input channel to one output channel on the hypergraph.
In implementation, combining the form of the feature map in formula (2-15) and proceeding as in GCN, with P_sym = I − L_sym and eigenvalues λ_p (value range [−1, 1]), U^T g is estimated by the polynomial of formula (2-17):

g_θ(λ_p) = θ_g · (1/(t+1)) · Σ_{τ=0}^{t} α^τ · λ_p^τ   (2-17)

where each θ_g is a learnable parameter representing the filter between one pair of input and output channels.
Step S710, determining the hypergraph signal convolution operator.
Assuming that the input signal X_in on the hypergraph has C_in channels and the output signal X_out has C_out channels, and using Θ ∈ R^{C_in × C_out} to denote the convolution kernel formed by all the filters, the convolution operator between the input signal and the output signal on the hypergraph is obtained from formula (2-17), as shown in formula (2-18):

X_out = (1/(t+1)) · Σ_{τ=0}^{t} α^τ · P_sym^τ · X_in · Θ   (2-18)
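Formula (2-18) can be evaluated without materializing powers of P_sym, by repeatedly propagating the signal, as in this sketch:

```python
import numpy as np

def hypergraph_conv(X_in, P_sym, Theta, t=3, alpha=0.5):
    # accumulate the discounted polynomial (1/(t+1)) * sum alpha^tau * P_sym^tau @ X_in
    acc, Z = np.zeros_like(X_in), X_in.copy()
    for tau in range(t + 1):
        acc += (alpha ** tau) * Z
        Z = P_sym @ Z                           # next power of P_sym applied to the signal
    return (acc / (t + 1)) @ Theta              # mix channels: C_in -> C_out

rng = np.random.default_rng(0)
X_in = rng.random((5, 8))                       # 5 vertices, C_in = 8 channels
P_sym = np.full((5, 5), 0.2)                    # toy symmetric state transition matrix
Theta = rng.random((8, 4))                      # convolution kernel, C_in x C_out
X_out = hypergraph_conv(X_in, P_sym, Theta)     # output shape: (5, 4)
```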
Step S711, generating the simple hypergraph spectral convolutional neural network.
In order to retain the original information of the vertex input features, the embodiment of the present application adopts the idea of residual connections: the original information of the vertex input features is fused into the convolved output features, the activation function ψ is then applied, and substituting formula (2-18) yields the final form of the "Simple Hypergraph Spectral Convolution network (SHSC)", formula (2-19):

X^(l+1) = ψ(((1 − β) · (1/(t+1)) · Σ_{τ=0}^{t} α^τ · P_sym^τ · X^(l) + β · X^(l)) · Θ)   (2-19)

where X^(l) is the input to the l-th layer of the network, Θ is the learnable convolution kernel parameter of the network, ψ is the selected activation function, and β is a hyper-parameter that adjusts the degree of fusion between the convolved features and the original features.
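Stacking such layers gives the full forward pass of formula (2-19); in the following sketch the per-layer kernels, β, t and α are illustrative values, and ReLU again stands in for ψ:

```python
import numpy as np

def shsc_forward(X, P_sym, Thetas, beta=0.3, t=3, alpha=0.5):
    for Theta in Thetas:                        # one learnable kernel per layer
        acc, Z = np.zeros_like(X), X.copy()
        for tau in range(t + 1):
            acc += (alpha ** tau) * Z           # discounted polynomial in P_sym
            Z = P_sym @ Z
        fused = (1 - beta) * (acc / (t + 1)) + beta * X   # residual fusion
        X = np.maximum(fused @ Theta, 0.0)      # mix channels, then ReLU as psi
    return X

rng = np.random.default_rng(0)
X0 = rng.random((5, 8))                         # initial vertex features
P_sym = np.full((5, 5), 0.2)
reps = shsc_forward(X0, P_sym, Thetas=[rng.random((8, 8)), rng.random((8, 4))])
```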
Through the above steps S701 to S711, a simple hypergraph spectral convolutional neural network can be generated; this network is used to obtain the representation vectors of the objects to be classified, and the representation vectors are then classified with the trained classification model to obtain the classification results.
Table 1. Citation network recognition experiment results (classification accuracy)
[Table body rendered as an image in the original publication.]
As can be seen from Table 1, the performance (accuracy) of the hypergraph spectral convolutional neural network (SHSC) provided by the embodiment of the present application greatly surpasses the existing hypergraph convolutional neural networks on all 5 data sets.
Table 2. 3D-visual object recognition experiment results (classification accuracy)
[Table body rendered as images in the original publication.]
As can be seen from Table 2, under the 6 hypergraph construction methods on the two data sets, SHSC surpasses the other existing hypergraph neural network methods in 5 of them.
Table 3. Over-smoothing of hypergraph convolution models on the citation networks
[Table body rendered as an image in the original publication.]
As can be seen from Table 3, both of the existing hypergraph convolutional neural networks HGNN and HyperGCN suffer from the over-smoothing phenomenon: as the number of network layers (Layers) increases, their classification accuracy keeps decreasing, which is not the case for the SHSC provided in the embodiment of the present application.
In the embodiment of the application, a Markov diffusion process is introduced on the hypergraph, a discounted Markov kernel is proposed on this basis, and a simple hypergraph spectral convolutional neural network model is then generated by combining hyperedge-dependent vertex weights with the hypergraph random walk. The model achieves the best performance in the current hypergraph neural network community on the downstream vertex classification task. The introduced discounted Markov kernel solves the "over-smoothing" problem of current hypergraph neural network models, and the hypergraph random walk based on "hyperedge-dependent vertex weights" captures fine-grained hypergraph topology information; this fine-grained information underpins the model's state-of-the-art performance.
It should be understood that when the embodiments of the present application involve content related to user information, for example data related to the content of papers, user permission or consent needs to be obtained when the embodiments are applied to specific products or technologies, and the collection, use and processing of the related data need to comply with the relevant laws, regulations and standards of the relevant countries and regions.
Continuing with the exemplary structure of the model determination apparatus 443 provided by the embodiments of the present application as implemented as software modules, in some embodiments, as shown in fig. 2, the software modules of the model determination apparatus 443 stored in the memory 440 may include:
a first determining module 4431, configured to acquire a hypergraph pre-constructed based on data to be processed, and determine a random walk state transition matrix corresponding to the hypergraph;
a second determining module 4432, configured to determine a discounted Markov kernel corresponding to the hypergraph based on the random walk state transition matrix;
a third determining module 4433, configured to determine a symmetric state transition matrix corresponding to the random walk state transition matrix;
a fourth determination module 4434, configured to determine a hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel;
and the model determining module 4435 is configured to obtain a preset activation function, and generate a hypergraph processing network model by using the activation function and the hypergraph signal convolution operator.
In some embodiments, the first determining module is further configured to:
based on the hypergraph, acquiring prior weight, out-degree and in-degree of each hyperedge, first hyperedge dependent vertex weight in the process that each vertex enters the hyperedge and second hyperedge dependent vertex weight in the process that each vertex exits the hyperedge;
acquiring a preset influence function, and determining the degree of each vertex according to the influence function, the out-degree and the in-degree of each super edge, wherein the influence function represents the influence type of the out-degree of the super edge or the in-degree of the super edge on the degree of the vertex;
determining the migration probability of each vertex from random migration to other vertexes based on the prior weight, the out-degree and the in-degree of each hyper-edge, the degree of each vertex, the first hyper-edge dependent vertex weight of each vertex and the second hyper-edge dependent vertex weight of each vertex;
and determining a random walk state transition matrix based on the walk probability of each vertex randomly walking to other vertices.
In some embodiments, the second determining module is further configured to:
acquiring a discounted visit-rate vector corresponding to each vertex based on the random walk state transition matrix;
determining a discounted Markov diffusion distance between vertex u and vertex v based on the discounted visit-rate vector corresponding to vertex u and the discounted visit-rate vector corresponding to vertex v, wherein u = 0, 1, 2, …, (N−1), v = 0, 1, 2, …, (N−1), and N is the total number of vertices in the hypergraph;
and determining the discounted Markov kernel corresponding to the hypergraph based on the discounted Markov diffusion distance.
In some embodiments, the second determining module is further configured to:
acquiring a preset discount factor;
determining a discounted average visit probability of reaching vertex v from vertex u through a t-step random walk, based on the random walk state transition matrix and the discount factor, wherein t = 0, 1, 2, …, (N−1);
and determining the discounted average visit probabilities of reaching each vertex from vertex u through the t-step random walk as the discounted visit-rate vector corresponding to vertex u.
In some embodiments, the third determining module is further configured to:
determining a vertex degree matrix based on the hypergraph;
determining a weighted adjacency relation matrix corresponding to the hypergraph based on the random walk state transition matrix and the vertex degree matrix;
and carrying out symmetrical regularization conversion on the weighted adjacency relation matrix to obtain a symmetrical state transition matrix.
In some embodiments, the fourth determining module is further configured to:
obtaining a convolution kernel formed by a filter between an input channel and an output channel;
determining a feature mapping relationship between an input space and an output space based on the discounted Markov kernel;
determining a hypergraph signal convolution operator between output information and input information based on the symmetric state transition matrix, the feature mapping relationship, and the convolution kernel.
In some embodiments, the model determination module is further configured to:
acquiring residual connection hyper-parameters;
determining a convolution result of the (l+1)-th layer by using the hypergraph signal convolution operator and the input information of the (l+1)-th layer, wherein l = 0, 1, …, L, and L is a preset number of network layers;
performing fusion processing on the convolution result of the (l +1) th layer and the output information of the l-th layer by using the residual connection hyper-parameter to obtain a fusion result;
and acting the activation function on the fusion result to obtain a hypergraph processing network model.
In some embodiments, the apparatus further comprises:
the first acquisition module is used for acquiring training data from the hypergraph, wherein the training data comprises vertex information of a training vertex and label information of the training vertex;
the first processing module is used for processing the vertex information of each training vertex by using the hypergraph processing network model to obtain the representation vector of each training vertex;
the first prediction module is used for acquiring a trained classification model, and performing prediction processing on the representation vector of each training vertex by using the classification model to obtain prediction information corresponding to each training vertex;
and the model training module is used for training the hypergraph processing network model by utilizing the prediction information of each training vertex and the label information of each training vertex to obtain the trained hypergraph processing network model.
In some embodiments, the apparatus further comprises:
the second acquisition module is used for acquiring vertex information of a test vertex from the hypergraph, wherein the test vertex does not have label information;
the second processing module is used for processing the vertex information of each test vertex by using the trained hypergraph processing network model to obtain a representation vector of the test vertex;
the second prediction module is used for performing prediction processing on the representation vector of each test vertex by using the trained classification model to obtain the prediction information corresponding to each test vertex;
and a fifth determining module, configured to determine the prediction information corresponding to each test vertex as the classification information of each test vertex.
It should be noted that the description of the model determination apparatus in the embodiment of the present application is similar to the description of the method embodiment described above, and has similar beneficial effects to the method embodiment. For technical details not disclosed in the embodiments of the apparatus, reference is made to the description of the embodiments of the method of the present application for understanding.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the model determining method described above in the embodiment of the present application.
Embodiments of the present application provide a computer-readable storage medium having stored therein executable instructions that, when executed by a processor, cause the processor to perform a model determination method provided by embodiments of the present application, for example, the method as illustrated in fig. 3, 4A, 4B, and 5.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (13)

1. A method of model determination, the method comprising:
acquiring a hypergraph constructed in advance based on data to be processed, and determining a random walk state transition matrix corresponding to the hypergraph;
determining a discounted Markov kernel corresponding to the hypergraph based on the random walk state transition matrix;
determining a symmetrical state transition matrix corresponding to the random walk state transition matrix;
determining a hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel;
and acquiring a preset activation function, and determining a hypergraph processing network model by using the activation function and the hypergraph signal convolution operator.
2. The method of claim 1, wherein determining the random walk state transition matrix corresponding to the hypergraph comprises:
based on the hypergraph, acquiring prior weight, out-degree and in-degree of each hyperedge, first hyperedge dependent vertex weight in the process that each vertex enters the hyperedge and second hyperedge dependent vertex weight in the process that each vertex exits the hyperedge;
acquiring a preset influence function, and determining the degree of each vertex according to the influence function and the out-degree and in-degree of each super edge, wherein the influence function represents the influence type of the out-degree or in-degree of the super edge on the degree of the vertex;
determining the migration probability of each vertex migrating to other vertexes randomly based on the prior weight, the out-degree and the in-degree of each super-edge, the degree of each vertex, the first super-edge dependent vertex weight and the second super-edge dependent vertex weight of each vertex;
and determining a random walk state transition matrix based on the walk probability of each vertex randomly walking to other vertices.
3. The method of claim 2, wherein determining the discounted Markov kernel corresponding to the hypergraph based on the random walk state transition matrix comprises:
acquiring a discounted visit-rate vector corresponding to each vertex based on the random walk state transition matrix;
determining a discounted Markov diffusion distance between vertex u and vertex v based on the discounted visit-rate vector corresponding to vertex u and the discounted visit-rate vector corresponding to vertex v, wherein u = 0, 1, 2, …, (N−1), v = 0, 1, 2, …, (N−1), and N is the total number of vertices in the hypergraph;
and determining the discounted Markov kernel corresponding to the hypergraph based on the discounted Markov diffusion distance.
4. The method according to claim 3, wherein the acquiring the discounted visit-rate vector corresponding to each vertex based on the random walk state transition matrix comprises:
acquiring a preset discount factor;
determining a discounted average visit probability of reaching vertex v from vertex u through a t-step random walk, based on the random walk state transition matrix and the discount factor, wherein t = 0, 1, 2, …, (N−1);
and determining the discounted average visit probabilities of reaching each vertex from vertex u through the t-step random walk as the discounted visit-rate vector corresponding to vertex u.
5. The method of claim 4, wherein determining the symmetric state transition matrix to which the random walk state transition matrix corresponds comprises:
determining a vertex degree matrix based on the hypergraph;
determining a weighted adjacency relation matrix corresponding to the hypergraph based on the random walk state transition matrix and the vertex degree matrix;
and carrying out symmetrical regularization conversion on the weighted adjacency relation matrix to obtain a symmetrical state transition matrix.
6. The method of claim 1, wherein said determining a hypergraph signal convolution operator based on said symmetric state transition matrix and said discounted Markov kernel comprises:
obtaining a convolution kernel formed by a filter between an input channel and an output channel;
determining a feature mapping relationship between an input space and an output space based on the discounted Markov kernel;
determining a hypergraph signal convolution operator between output information and input information based on the symmetric state transition matrix, the feature mapping relationship, and the convolution kernel.
7. The method of claim 1, wherein said determining a hypergraph processing network model by using said activation function and said hypergraph signal convolution operator comprises:
acquiring residual connection hyper-parameters;
determining a convolution result of the (l+1)-th layer by using the hypergraph signal convolution operator and input information of the (l+1)-th layer, wherein l = 0, 1, …, L, and L is a preset number of network layers;
performing fusion processing on the convolution result of the (l +1) th layer and the output information of the l-th layer by using the residual connection hyper-parameter to obtain a fusion result;
and acting the activation function on the fusion result to obtain a hypergraph processing network model.
8. The method according to any one of claims 1 to 7, further comprising:
acquiring training data from the hypergraph, wherein the training data comprises vertex information and label information of training vertexes;
processing the vertex information of each training vertex by using the hypergraph processing network model to obtain a representation vector of each training vertex;
acquiring a trained classification model, and performing prediction processing on the representation vector of each training vertex by using the classification model to obtain prediction information corresponding to each training vertex;
and training the hypergraph processing network model by using the prediction information and the label information of each training vertex to obtain the trained hypergraph processing network model.
9. The method of claim 8, further comprising:
acquiring vertex information of a test vertex from the hypergraph;
processing the vertex information of each test vertex by using the trained hypergraph processing network model to obtain a representation vector of each test vertex;
performing prediction processing on the representation vector of each test vertex by using the trained classification model to obtain prediction information corresponding to each test vertex;
and determining the prediction information corresponding to each test vertex as the classification information of each test vertex.
10. A model determination apparatus, characterized in that the apparatus comprises:
the first determination module is used for acquiring a pre-constructed hypergraph and determining a random walk state transition matrix corresponding to the hypergraph;
a second determining module, configured to determine a discounted Markov kernel corresponding to the hypergraph based on the random walk state transition matrix;
a third determining module, configured to determine a symmetric state transition matrix corresponding to the random walk state transition matrix;
a fourth determination module, configured to determine a hypergraph signal convolution operator based on the symmetric state transition matrix and the discounted Markov kernel;
and the model determining module is used for acquiring a preset activation function and determining a hypergraph processing network model by using the activation function and the hypergraph signal convolution operator.
11. A computer device, characterized in that the computer device comprises:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 9 when executing executable instructions stored in the memory.
12. A computer-readable storage medium storing executable instructions, wherein the executable instructions, when executed by a processor, implement the method of any one of claims 1 to 9.
13. A computer program product comprising a computer program or instructions, wherein the computer program or instructions, when executed by a processor, implement the method of any one of claims 1 to 9.
CN202210297979.8A 2022-03-24 2022-03-24 Model determination method, device, equipment and computer readable storage medium Pending CN114676821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210297979.8A CN114676821A (en) 2022-03-24 2022-03-24 Model determination method, device, equipment and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN114676821A true CN114676821A (en) 2022-06-28

Family

ID=82074816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210297979.8A Pending CN114676821A (en) 2022-03-24 2022-03-24 Model determination method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114676821A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017152403A1 (en) * 2016-03-10 2017-09-14 北京大学深圳研究生院 Mcmc framework-based sub-hypergraph matching method and device
CN109492691A (en) * 2018-11-07 2019-03-19 南京信息工程大学 A kind of hypergraph convolutional network model and its semisupervised classification method
CN111008447A (en) * 2019-12-21 2020-04-14 杭州师范大学 Link prediction method based on graph embedding method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YIFAN FENG et al.: "Hypergraph Neural Networks", arXiv, 23 February 2019 (2019-02-23)
YIN JIANGBO: "Several Machine Learning Algorithms Based on Statistics and Graph Models and Their Applications", China Master's Theses Full-text Database, Information Science and Technology, no. 7, 15 July 2012 (2012-07-15)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination