CN117953258A - Training method of object classification model, object classification method and device
- Publication number
- CN117953258A CN117953258A CN202311255275.5A CN202311255275A CN117953258A CN 117953258 A CN117953258 A CN 117953258A CN 202311255275 A CN202311255275 A CN 202311255275A CN 117953258 A CN117953258 A CN 117953258A
- Authority
- CN
- China
- Prior art keywords
- target
- node
- training
- feature extraction
- object classification
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Abstract
The embodiments of the present application provide a training method for an object classification model, an object classification method and an apparatus. The training method includes: acquiring a first initial adjacency matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data; classifying a plurality of target objects corresponding to the sample graph data according to the first initial adjacency matrix, the first node characteristics and the structural parameters currently to be trained of each candidate structure in the network to be trained, to obtain a first classification result; and if it is determined that the training ending condition is met, determining the target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, to obtain the object classification model. The embodiments of the present application thereby improve the performance of the object classification model.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a training method for an object classification model, an object classification method and an object classification device.
Background
Graph data is a data structure in which entities and relationships are represented by nodes (Vertex/Node) and edges (Edge); because graph data contains rich information, it can be used to classify entities. In recent years, processing graph data with graph neural networks has become a popular research direction. However, the graph neural networks used for entity classification on graph data often suffer from poor stability, sensitivity to small perturbations, and low accuracy of classification results.
Disclosure of Invention
The application provides a training method of an object classification model, an object classification method and a device, so as to improve the performance of the object classification model and further improve the accuracy of classification results.
In a first aspect, an embodiment of the present application provides a training method for an object classification model, including:
Acquiring a first initial adjacency matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
Performing iterative training on the network to be trained by using the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the plurality of target objects according to the first initial adjacency matrix, the first node characteristics and the structural parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
It can be seen that the embodiment of the present application proposes a network to be trained that includes N feature extraction layers, each feature extraction layer including M candidate structures. In the process of training the network to be trained, the target objects corresponding to the sample graph data are classified according to the first initial adjacency matrix of the sample graph data, the first node characteristics and the structural parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, the target structure of each feature extraction layer is determined according to the structural parameters when the condition is met, to obtain the object classification model. In this training mode, because the structural parameters of the candidate structures are parameters to be trained, they are continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; since the object classification model includes the target structures determined based on the optimal structural parameters, the stability of the target structures is guaranteed, and in turn the stability of the object classification model, so that the model can withstand perturbations caused by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
In a second aspect, an embodiment of the present application provides an object classification method, including:
acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
Inputting the second initial adjacency matrix and the fourth node characteristics into an object classification model for classification processing to obtain a second classification result; the object classification model is trained according to the training method of the object classification model provided in the first aspect.
It can be seen that, in the embodiment of the present application, the object classification model used for classifying the graph data to be processed is obtained by training a network to be trained that includes N feature extraction layers, each including M candidate structures. Because the structural parameters of each candidate structure are parameters to be trained, they are continuously and automatically learned and optimized during iterative training, and reach their optimum when the training ending condition is met; since the object classification model includes the target structures determined based on the optimal structural parameters, the stability of the target structures is ensured, and in turn the stability of the object classification model. Classifying the graph data to be processed with such a highly stable object classification model avoids perturbations caused by various factors, thereby ensuring the accuracy of the classification results.
In a third aspect, an embodiment of the present application provides a training apparatus for an object classification model, including:
the acquisition module is used for acquiring a first initial adjacency matrix of the sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
the training module is used for carrying out iterative training on the network to be trained by utilizing the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the plurality of target objects according to the first initial adjacency matrix, the first node characteristics and the structural parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
In a fourth aspect, an embodiment of the present application provides an object classification apparatus, including:
The acquisition module is used for acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
The classification module is used for inputting the second initial adjacency matrix and the fourth node characteristics into an object classification model for classification processing to obtain a second classification result; the object classification model is trained according to the training method of the object classification model provided in the first aspect.
In a fifth aspect, an embodiment of the present application provides an electronic device, including:
A processor; and a memory arranged to store computer-executable instructions configured to be executed by the processor, the executable instructions comprising steps for performing the training method of the object classification model provided in the first aspect, or steps for performing the object classification method provided in the second aspect.
In a sixth aspect, an embodiment of the present application provides a storage medium storing computer-executable instructions that cause a computer to perform the training method of the object classification model provided in the first aspect, or cause a computer to perform the object classification method provided in the second aspect.
Drawings
In order to more clearly illustrate the technical solutions of one or more embodiments of the present application or of the prior art, the drawings required in the description of the embodiments or of the prior art are briefly introduced below. It is apparent that the drawings in the following description are only some of the embodiments described in the present application, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a first schematic flow chart of a training method of an object classification model according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a first initial adjacency matrix according to an embodiment of the present application;
Fig. 3 is a second schematic flow chart of a training method of an object classification model according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a network to be trained according to an embodiment of the present application;
Fig. 5 is a third schematic flow chart of a training method of an object classification model according to an embodiment of the present application;
Fig. 6 is a fourth schematic flow chart of a training method of an object classification model according to an embodiment of the present application;
Fig. 7 is a schematic flow chart of an object classification method according to an embodiment of the present application;
Fig. 8 is a schematic diagram of the module composition of a training apparatus for an object classification model according to an embodiment of the present application;
Fig. 9 is a schematic diagram of the module composition of an object classification apparatus according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an electronic device according to one or more embodiments of the present application.
Detailed Description
In order that those skilled in the art may better understand the technical solutions of one or more embodiments of the present application, these technical solutions will be described clearly and completely below with reference to the accompanying drawings of one or more embodiments of the present application. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of the present application without inventive effort shall fall within the scope of protection of this document.
Currently, in order to avoid the over-smoothing (Over-Smoothing) problem, a graph convolutional network (Graph Convolutional Network, GCN) used for entity classification on graph data generally includes two graph convolution layers (GCNConv); however, two graph convolution layers generally only consider part of the nodes as the supervised training signal, and for classification tasks the accuracy of the classification results is not high. In order to balance the over-smoothing problem and the accuracy of the classification results, the prior art also provides a multi-layer graph neural network, PPNP, which is based on GCN and PageRank and can alleviate the over-smoothing of GCN in the feature extraction process without introducing additional learnable parameters. Although PPNP can alleviate over-smoothing, it cannot obtain a richer node representation when the graph data is highly sparse; moreover, PPNP has poor stability and is sensitive to small perturbations: even a small change to the input of a certain layer can greatly affect the classification result of the model. Based on the above, the embodiment of the present application proposes a network to be trained that includes N feature extraction layers, each feature extraction layer including M candidate structures. In the process of training the network to be trained, the target objects corresponding to the sample graph data are classified according to the first initial adjacency matrix of the sample graph data, the first node characteristics and the structural parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, the target structure of each feature extraction layer is determined according to the structural parameters when the condition is met, to obtain the object classification model. In this training mode, because the structural parameters of the candidate structures are parameters to be trained, they are continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; since the object classification model includes the target structures determined based on the optimal structural parameters, the stability of the target structures is guaranteed, and in turn the stability of the object classification model, so that the model can withstand perturbations caused by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
Specifically, fig. 1 is a flow chart of a training method of an object classification model according to one or more embodiments of the present application, where the method in fig. 1 can be performed by a training device of the object classification model, and the training device may be provided in an electronic device. The electronic device may be a terminal device or a server. The terminal equipment can be a mobile phone, a tablet computer, a desktop computer, a portable notebook computer and the like; the server may be an independent server, or may be a server cluster formed by a plurality of servers. As shown in fig. 1, the method comprises the steps of:
Step S102, a first initial adjacent matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data are obtained; a plurality of nodes in the sample graph data are in one-to-one correspondence with a plurality of target objects;
The sample graph data comprise a plurality of nodes and at least one edge; each node corresponds to one target object, and each edge represents the association relationship between the target objects corresponding to the connected nodes. The target object may be any entity such as a user, an account, a vehicle, an image, a video or a commodity. In practical applications, the object data of each target object may first be acquired, the sample graph data constructed according to the object data, and the first node characteristics of the corresponding nodes determined according to the object data. As one example, the target object is a user: each node corresponds to one user, and whether an edge exists between the corresponding nodes is determined according to whether an interaction (e.g., a transaction, a conversation, a colleague relationship, etc.) exists between the users; accordingly, the first node characteristics may include age, graduation institution, work address, mobile phone number, and the like. As another example, the target object is an account: each node corresponds to one account, and whether an edge exists between the corresponding nodes is determined according to whether an interaction (e.g., a resource transfer, instant messaging, sending an image, etc.) exists between the accounts; accordingly, the first node characteristics may include login time, login location, number of resource transfers, transaction time, transaction location, and the like. The target objects and the corresponding first node characteristics are not specifically limited here and can be set as required in practical applications.
Acquiring the first initial adjacency matrix of the sample graph data may include: constructing the first initial adjacency matrix according to the connection relationships between the nodes in the sample graph data. Specifically, each row and each column of the first initial adjacency matrix to be constructed corresponds to one node; for any two nodes it is determined whether an edge (i.e. a connection relationship) exists between them, and the element at the corresponding row and column is set to 1 if the edge exists and to 0 otherwise. As an example, the sample graph data include node 1, node 2, node 3 and node 4; edges exist between node 1 and nodes 2 and 4, and between node 2 and nodes 3 and 4. With the first row and first column corresponding to node 1, the second row and second column to node 2, the third row and third column to node 3, and the fourth row and fourth column to node 4, the resulting first initial adjacency matrix is shown in fig. 2.
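For illustration, the construction of the first initial adjacency matrix from an edge list can be sketched as follows. This is a minimal sketch, not part of the embodiment itself; the function name and the 0-based node indices are assumptions for the example of fig. 2.

```python
import numpy as np

def build_initial_adjacency(num_nodes, edges):
    """Build the first initial adjacency matrix A0 from undirected edges:
    the element at (u, v) is 1 if an edge exists and 0 otherwise."""
    A0 = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    for u, v in edges:
        A0[u, v] = 1.0
        A0[v, u] = 1.0  # undirected edge: set both directions
    return A0

# Example matching the node 1..node 4 case above (0-based indices):
A0 = build_initial_adjacency(4, [(0, 1), (0, 3), (1, 2), (1, 3)])
```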
Step S104, performing iterative training on the network to be trained by using the first initial adjacency matrix and the first node characteristics to obtain an object classification model; the network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training comprises: classifying the target object according to the first initial adjacency matrix, the first node characteristics and the current structure parameters to be trained of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met includes determining that the training ending condition is met according to a first classification result; n and M are integers greater than 1.
The N feature extraction layers are connected in sequence, and each candidate structure may include zero graph convolution layers or at least one graph convolution layer. The M candidate structures included in the same feature extraction layer may be arranged in parallel. The M candidate structures in the same feature extraction layer differ from one another in structure; candidate structures in different feature extraction layers may be the same or different.
In one or more embodiments of the present application, a network to be trained that includes N feature extraction layers, each including M candidate structures, is provided. In the process of training the network to be trained, the target objects corresponding to the sample graph data are classified according to the first initial adjacency matrix of the sample graph data, the first node characteristics and the structural parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, the target structure of each feature extraction layer is determined according to the structural parameters when the condition is met, to obtain the object classification model. In this training mode, because the structural parameters of the candidate structures are parameters to be trained, they are continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; since the object classification model includes the target structures determined based on the optimal structural parameters, the stability of the target structures is guaranteed, and in turn the stability of the object classification model, so that the model can withstand perturbations caused by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
Considering that, in a traditional graph neural network with more than 2 graph convolution layers, the node representations of the graph data become over-smoothed after passing through more than 2 graph convolution layers; that a graph neural network including only 2 graph convolution layers yields lower accuracy in object classification; and that existing graph neural networks including more than 2 graph convolution layers are less stable, the embodiment of the present application proposes a new network to be trained including more than 2 graph convolution layers (its structure is described in detail below), from which it is expected to search out the optimal network structure (i.e. each of the aforementioned target structures). That is, in the embodiment of the present application, the search problem for the optimal network structure may be expressed as the following formula one:
Formula one: $\min_{a \in U} \min_{w_a} L(a, w_a)$;

wherein $U$ denotes the search space, $a$ denotes the network structure, and $w_a$ denotes the network parameters of structure $a$. That is, formula one means that it is desired to find an optimal network structure $a \in U$ such that, after training its network parameters $w_a$, the loss $L(a, w_a)$ is minimized. In order to solve the above technical problems in the prior art, the following factors are attended to in the search process for the optimal network structure in the embodiment of the present application: a search space $U$ that is as complete as possible, a loss function that takes hardware time consumption into account, and an efficient search speed. The search process for the optimal network structure, i.e. the training process of the object classification model, provided by the embodiment of the present application is detailed below.
In order to ensure the stability of the object classification model obtained by training, in the embodiment of the present application each candidate structure corresponds to structural parameters to be trained; in the classification process, the first target node characteristics of the sample graph data are determined according to the first initial adjacency matrix, the first node characteristics and the structural parameters currently to be trained of each candidate structure, and the target objects corresponding to the sample graph data are classified according to the first target node characteristics. Specifically, as shown in fig. 3, step S104 may include the following steps S104-2 to S104-8:
step S104-2, inputting the first initial adjacent matrix and the first node characteristics into the network to be trained, and generating second node characteristics of a plurality of nodes of the sample graph data according to the first initial adjacent matrix and the first node characteristics; the network to be trained comprises N feature extraction layers, each feature extraction layer comprises M candidate structures, and N and M are integers greater than 1.
Specifically, as shown in fig. 4, the network to be trained includes a preprocessing module, and correspondingly, step S104-2 may include: inputting the first initial adjacency matrix and the first node characteristics into the preprocessing module of the network to be trained, and generating, by the preprocessing module, second node characteristics of the plurality of nodes of the sample graph data according to the first initial adjacency matrix and the first node characteristics.
Further, in order to avoid the problem of node representations becoming over-smoothed, in the embodiment of the present application the graph edge weights applicable to each graph convolution layer are learned on the premise of keeping the structure of the sample graph data unchanged. Specifically, as shown in fig. 5, step S104-2 may include the following steps S104-22 and S104-24:
Step S104-22, inputting the first initial adjacency matrix and the first node characteristics into the network to be trained, and determining a first target adjacency matrix of the sample graph data according to the first initial adjacency matrix and the current network parameters of the network to be trained;
specifically, normalization processing is performed on the first initial adjacency matrix and current network parameters of the network to be trained, so as to obtain a first target adjacency matrix of the sample graph data. The determination of the first target adjacency matrix can be expressed as the following equation two:
Formula two: $A = \mathrm{Softmax}(A_0 * W_0)$;
wherein $*$ denotes element-wise multiplication; $W_0 \in \mathbb{R}^{V \times V}$ is initialized from a uniform distribution and denotes the current network parameters of the network to be trained, i.e. network parameters to be trained; the Softmax function is used for normalization; $A_0$ denotes the first initial adjacency matrix whose elements, as described above, take the value 0 or 1 and represent the connections between the corresponding nodes; $A$ denotes the first target adjacency matrix, which contains learnable graph edge weights each taking a value between 0 and 1; and $V$ denotes the number of nodes in the sample graph data. Because $A$ participates in every graph convolution layer in the subsequent processing of a training round, and the network parameters are optimized in every training round, the graph edge weights can be adjusted through training without changing the structure of the sample graph data, so as to obtain a first target adjacency matrix suitable for a multi-layer graph neural network, thereby alleviating the over-smoothing of node representations to a certain extent. It should be noted that, for the specific normalization process, reference may be made to the related art, and details are not repeated here. It can be understood that the parameters to be trained in the present application include the structural parameters of the candidate structures and the network parameters of the network to be trained, where the network parameters include the learning rate, regularization parameters, and the like.
Further, as shown in fig. 4, the preprocessing module includes an embedding layer, and accordingly, step S104-22 may include: inputting the first initial adjacency matrix and the first node characteristics into the embedding layer of the network to be trained, and determining, by the embedding layer, the first target adjacency matrix of the sample graph data according to the first initial adjacency matrix and the current network parameters of the network to be trained.
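A minimal PyTorch-style sketch of such an embedding layer is given below; it is an illustrative reading of formula two only, and the row-wise Softmax normalization is an assumption, since the embodiment does not specify the normalization axis.

```python
import torch
import torch.nn as nn

class EmbeddingLayer(nn.Module):
    """Learns graph edge weights per formula two: A = Softmax(A0 * W0)."""

    def __init__(self, num_nodes):
        super().__init__()
        # W0 in R^{V x V}, initialized from a uniform distribution
        self.W0 = nn.Parameter(torch.rand(num_nodes, num_nodes))

    def forward(self, A0):
        scores = A0 * self.W0                  # element-wise multiplication
        return torch.softmax(scores, dim=-1)   # row-wise normalization (assumed)
```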
Step S104-24, generating second node characteristics of a plurality of nodes of the sample graph data according to the first node characteristics and the first target adjacency matrix.
As shown in fig. 4, the preprocessing module further includes a first graph convolution layer, and accordingly, steps S104-24 may include: generating, through the first graph convolution layer, the second node characteristics of the plurality of nodes of the sample graph data according to the first node characteristics and the first target adjacency matrix. Specifically, according to the activation function, the first graph convolution layer generates the second node characteristics of the plurality of nodes of the sample graph data based on the first node characteristics, the first target adjacency matrix and the current network parameters of the first graph convolution layer. The generation of the second node characteristics may be expressed as the following formula three:
And (3) a formula III: h (1)=GCN(A,X(0))=σ(AX(0)W(0));
wherein $H^{(1)}$ denotes the second node characteristics; GCN denotes the first graph convolution layer; $A$ denotes the first target adjacency matrix; $X^{(0)}$ denotes the first node characteristics; $W^{(0)}$ denotes the current network parameters of the first graph convolution layer, used for the affine transformation of $X^{(0)}$; and $\sigma$ denotes the activation function. For the specific procedure of the activation function, reference may be made to the related art, and details are not repeated here.
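Formula three corresponds to a standard graph convolution. A sketch under the assumptions of a dense adjacency matrix and ReLU as the activation function σ (the embodiment does not fix this choice):

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """Computes sigma(A X W), matching formula three."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # affine transform W
        self.act = nn.ReLU()  # sigma; ReLU is an assumption here

    def forward(self, A, X):
        return self.act(A @ self.W(X))  # aggregate over neighbors, then activate
```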
Step S104-4, generating first target node characteristics of a plurality of nodes of the sample graph data according to the second node characteristics and the structure parameters to be trained currently of each candidate structure;
Specifically, as shown in fig. 4, the network to be trained includes a feature extraction module, and accordingly, step S104-4 may include: and generating first target node characteristics of a plurality of nodes of the sample graph data according to the second node characteristics and the structural parameters to be trained currently of each candidate structure through the characteristic extraction module. In one or more embodiments of the present application, in order to ensure accuracy of a final determined target structure, a selection weight of each candidate structure is determined based on a structure parameter to be currently trained of each candidate structure, and a first target node characteristic of a plurality of nodes of the sample graph data is determined based on the selection weight. Specifically, as shown in FIG. 5, step S104-4 may include the following steps S104-42 and S104-44:
Step S104-42, determining the selection weight of each candidate structure based on the structure parameters to be trained currently of each candidate structure;
Specifically, the selection weight of each candidate structure is determined, according to a preset function, based on the structural parameters currently to be trained of each candidate structure. The preset function may be the Gumbel Softmax function.
More specifically, as shown in fig. 4, in one or more embodiments of the present application, each feature extraction layer includes 3 branches, each branch corresponding to one candidate structure, and the candidate structures included in each feature extraction layer are the same; that is, each feature extraction layer includes: a first candidate structure comprising the second and third graph convolution layers (branch 1 from left to right, i.e. candidate 1), a second candidate structure comprising the fourth graph convolution layer (branch 2 from left to right, i.e. candidate 2), and a third candidate structure comprising zero graph convolution layers (branch 3 from left to right, i.e. candidate 3, shown as a straight line). Accordingly, the aforementioned search space U may be expressed as:
$U = \{\mathrm{Block}_{1,1}, \mathrm{Block}_{1,2}, \mathrm{Block}_{1,3}, \mathrm{Block}_{2,1}, \ldots, \mathrm{Block}_{N,1}, \mathrm{Block}_{N,2}, \mathrm{Block}_{N,3}\}$;

wherein $\mathrm{Block}_{1,1}$ denotes the first candidate structure in the first feature extraction layer from top to bottom, $\mathrm{Block}_{1,2}$ the second candidate structure in the first feature extraction layer, $\mathrm{Block}_{1,3}$ the third candidate structure in the first feature extraction layer, $\mathrm{Block}_{2,1}$ the first candidate structure in the second feature extraction layer, and so on, with $\mathrm{Block}_{N,3}$ denoting the third candidate structure in the N-th feature extraction layer. In each feature extraction layer, only one candidate structure is selected as the target structure; in general, the weight for selecting a candidate structure as the target structure may be expressed as the following formula four:
Formula four: $k_i^{(j)} = P\big(b^{(j)} = b_i^{(j)}; \theta^{(j)}\big) = \dfrac{\exp\big(\theta_i^{(j)}\big)}{\sum_{i'} \exp\big(\theta_{i'}^{(j)}\big)}$;

wherein $i \in \{1, 2, 3\}$ and $j \in \{1, 2, \ldots, N\}$; $k$ denotes the selection weight, and $k_i^{(j)}$ denotes the selection weight of the i-th candidate structure in the j-th feature extraction layer; $b^{(j)}$ denotes the target structure of the j-th feature extraction layer, $b_i^{(j)}$ denotes selecting the i-th candidate structure in the j-th feature extraction layer, and $P\big(b^{(j)} = b_i^{(j)}; \theta^{(j)}\big)$ denotes the weight with which the i-th candidate structure in the j-th feature extraction layer is selected as the target structure under the structural-parameter variable $\theta$; $\theta_i^{(j)}$ denotes the structural parameter of the i-th candidate structure in the j-th feature extraction layer, and $\theta^{(j)}$ denotes the structural parameters of the candidate structures in the j-th feature extraction layer; exp denotes the exponential function with the natural constant e as its base.
Since the definition of the selection weights k is discrete, in order to accurately determine the target structure of each feature extraction layer, the selection weights need to be differentiable, i.e. learnable and updatable. Based on this, in one or more embodiments of the present application, formula four is reparameterized based on the Gumbel Softmax function, i.e. the selection weights of the candidate structures are determined in each training round based on the following formula five:
Formula five: $k_i^{(j)} = \dfrac{\exp\big(\big(\theta_i^{(j)} + g_i\big)/\tau\big)}{\sum_{i'} \exp\big(\big(\theta_{i'}^{(j)} + g_{i'}\big)/\tau\big)}$;

wherein $i \in \{1, 2, 3\}$ and $j \in \{1, 2, \ldots, N\}$; $g_i \sim \mathrm{Gumbel}(0, 1)$ is random noise drawn from the Gumbel distribution; and $\tau$ is a temperature parameter: when $\tau$ approaches 0, $k_i^{(j)}$ approaches a one-hot vector, close to discrete categorical sampling; when $\tau$ is larger, $k_i^{(j)}$ behaves like a continuous random variable.
Therefore, based on the Gumbel Softmax function, the selection weight of a candidate structure is directly differentiable with respect to its structural parameters, which ensures that the target structure determined based on the selection weights is the optimal structure for the corresponding feature extraction layer. It should be noted that the preset function is not limited to the Gumbel Softmax function and can be set as required in practical applications.
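Formula five is available in common frameworks; for instance, PyTorch provides torch.nn.functional.gumbel_softmax, with which the computation of the selection weights for one feature extraction layer can be sketched (illustrative only; M = 3 and τ = 1.0 are assumptions):

```python
import torch
import torch.nn.functional as F

# theta: structural parameters of the M = 3 candidate structures in one
# feature extraction layer, treated as trainable logits.
theta = torch.zeros(3, requires_grad=True)

# Differentiable selection weights per formula five; tau is the temperature.
# hard=False keeps the continuous relaxation described above.
k = F.gumbel_softmax(theta, tau=1.0, hard=False)

# As tau approaches 0, k approaches a one-hot vector (discrete sampling);
# a larger tau yields a smoother, more continuous mixture.
```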
Step S104-44, generating first target node characteristics of the plurality of nodes of the sample graph data according to the second node characteristics and the selection weights.
Specifically, feature extraction processing is performed through the N feature extraction layers to obtain the first target node characteristics of the plurality of nodes. The input data of the first feature extraction layer are the second node characteristics and the first target adjacency matrix, and the input data of the n-th feature extraction layer are the third node characteristics output by the (n-1)-th feature extraction layer and the first target adjacency matrix, wherein 2 ≤ n ≤ N. In some embodiments, the data input may differ for different candidate structures of the same feature extraction layer. Taking the network to be trained with the network structure shown in fig. 4 as an example: for the first feature extraction layer, the second node characteristics and the first target adjacency matrix in the input data of the layer may be input into the first candidate structure and the second candidate structure, and the second node characteristics in the input data may be input into the third candidate structure; for the n-th feature extraction layer, the third node characteristics in the input data of the layer and the first target adjacency matrix may be input into the first candidate structure and the second candidate structure, and the third node characteristics in the input data may be input into the third candidate structure.
Further, the feature extraction processing performed by the j-th feature extraction layer may include: inputting the input data into the M candidate structures of the j-th feature extraction layer, and performing feature extraction processing on the input data through the M candidate structures to obtain M child node characteristics, wherein 1 ≤ j ≤ N; and determining the output data of the j-th feature extraction layer according to the selection weights of the M candidate structures of the j-th feature extraction layer and the M child node characteristics.
It can be understood that, when j < N, the output data include the third node characteristics and the first target adjacency matrix; when j = N, the output data include the first target node characteristics of the plurality of nodes of the sample graph data. For each graph convolution layer in a candidate structure, the data processing manner is the same as that of the first graph convolution layer; reference may be made to the foregoing description, and the repetition is omitted here.
Further, determining the output data of the j-th feature extraction layer according to the selection weights of the M candidate structures of the j-th feature extraction layer and the M child node characteristics may include: aggregating the M child node characteristics according to the selection weights of the M candidate structures of the j-th feature extraction layer to obtain the third node characteristics; when j < N, determining the third node characteristics and the first target adjacency matrix as the output data; and when j = N, determining the third node characteristics as the first target node characteristics. Taking the network to be trained with the network structure shown in fig. 4 as an example, the third node characteristics may be expressed as the following formula six:
Formula six: $H^{(j+1)} = \sum_{i=1}^{3} k_i^{(j)} \cdot b_i^{(j)}\big(X^{(j)}\big) = k_1^{(j)} \cdot \mathrm{GCN}\big(A, \mathrm{GCN}\big(A, H^{(j)}\big)\big) + k_2^{(j)} \cdot \mathrm{GCN}\big(A, H^{(j)}\big) + k_3^{(j)} \cdot H^{(j)}$;

wherein $i \in \{1, 2, 3\}$ and $j \in \{1, 2, \ldots, N\}$; $X^{(j)}$ denotes the input data of the j-th feature extraction layer; $b_i^{(j)}\big(X^{(j)}\big)$ denotes the child node characteristics obtained by the i-th candidate structure of the j-th feature extraction layer: $\mathrm{GCN}(A, \mathrm{GCN}(A, H^{(j)}))$ denotes the child node characteristics obtained by the 1st candidate structure of the j-th feature extraction layer; $\mathrm{GCN}(A, H^{(j)})$ denotes the child node characteristics obtained by the 2nd candidate structure; and $H^{(j)}$ denotes the child node characteristics obtained by the 3rd candidate structure (it can be understood that, since the third candidate structure includes zero graph convolution layers, it does not process the input data, i.e. its output is the same as its input); $k_i^{(j)}$ denotes the selection weight of the i-th candidate structure of the j-th feature extraction layer; and $H^{(j+1)}$ denotes the third node characteristics obtained by the j-th feature extraction layer.
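A sketch of one feature extraction layer implementing formula six, reusing the GraphConvLayer sketched above; the class name, the shared feature dimension, and the placement of the structural parameters inside the layer are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureExtractionLayer(nn.Module):
    """Mixes the three candidate structures with the selection weights."""

    def __init__(self, dim, tau=1.0):
        super().__init__()
        self.gcn_1a = GraphConvLayer(dim, dim)  # candidate 1: two stacked GCNs
        self.gcn_1b = GraphConvLayer(dim, dim)
        self.gcn_2 = GraphConvLayer(dim, dim)   # candidate 2: one GCN
        self.theta = nn.Parameter(torch.zeros(3))  # structural parameters
        self.tau = tau

    def forward(self, A, H):
        k = F.gumbel_softmax(self.theta, tau=self.tau)  # formula five
        branch1 = self.gcn_1b(A, self.gcn_1a(A, H))  # GCN(A, GCN(A, H))
        branch2 = self.gcn_2(A, H)                   # GCN(A, H)
        branch3 = H                                  # zero graph convolution layers
        return k[0] * branch1 + k[1] * branch2 + k[2] * branch3  # formula six
```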
In some embodiments, taking the network to be trained with the network structure shown in fig. 4 as an example, for convenience of description the feature extraction layers from top to bottom are denoted in turn as the first feature extraction layer, the second feature extraction layer, …, the N-th feature extraction layer; the three candidate structures in each feature extraction layer are denoted in turn as the first candidate structure, the second candidate structure and the third candidate structure; and the child node characteristics obtained by the first, second and third candidate structures are denoted as the first, second and third child node characteristics respectively. After the first initial adjacency matrix and the first node characteristics are input into the embedding layer, the embedding layer determines the first target adjacency matrix of the sample graph data (i.e. the aforementioned $A$) according to the first initial adjacency matrix and the current network parameters of the network to be trained, and inputs the first node characteristics (i.e. the aforementioned $X^{(0)}$) and the first target adjacency matrix into the first graph convolution layer. The first graph convolution layer generates the second node characteristics of the sample graph data (i.e. $H^{(1)}$) from the first node characteristics and the first target adjacency matrix, and inputs the second node characteristics and the first target adjacency matrix into the first feature extraction layer (i.e. n = 1). The first and second candidate structures in the first feature extraction layer respectively perform feature extraction on the input second node characteristics and first target adjacency matrix to obtain the corresponding first child node characteristics (i.e. $\mathrm{GCN}(A, \mathrm{GCN}(A, H^{(1)}))$) and second child node characteristics (i.e. $\mathrm{GCN}(A, H^{(1)})$), and the third candidate structure in the first feature extraction layer performs feature extraction on the input second node characteristics to obtain the third child node characteristics (since the third candidate structure includes zero graph convolution layers, its input and output are the same, i.e. $H^{(1)}$). According to the determined first selection weight of the first candidate structure, second selection weight of the second candidate structure and third selection weight of the third candidate structure, the first, second and third child node characteristics are aggregated to obtain the third node characteristics of the first feature extraction layer ($H^{(2)}$); the third node characteristics and the first target adjacency matrix are input, as the output data of the first feature extraction layer, into the second feature extraction layer (i.e. n = 2).

The first and second candidate structures in the second feature extraction layer respectively perform feature extraction on the input third node characteristics and first target adjacency matrix to obtain the corresponding first child node characteristics (i.e. $\mathrm{GCN}(A, \mathrm{GCN}(A, H^{(2)}))$) and second child node characteristics (i.e. $\mathrm{GCN}(A, H^{(2)})$), and the third candidate structure in the second feature extraction layer performs feature extraction on the input third node characteristics to obtain the third child node characteristics (i.e. $H^{(2)}$). According to the determined first, second and third selection weights, the first, second and third child node characteristics are aggregated to obtain the third node characteristics of the second feature extraction layer ($H^{(3)}$); the third node characteristics and the first target adjacency matrix are input, as the output data of the second feature extraction layer, into the third feature extraction layer (i.e. n = 3), and so on, until the third node characteristics and the first target adjacency matrix, as the output data of the (N-1)-th feature extraction layer, are input into the N-th feature extraction layer (i.e. n = N). The first and second candidate structures in the N-th feature extraction layer respectively perform feature extraction on the input third node characteristics and first target adjacency matrix to obtain the corresponding first child node characteristics (i.e. $\mathrm{GCN}(A, \mathrm{GCN}(A, H^{(N)}))$) and second child node characteristics (i.e. $\mathrm{GCN}(A, H^{(N)})$), and the third candidate structure in the N-th feature extraction layer performs feature extraction on the input third node characteristics to obtain the third child node characteristics (i.e. $H^{(N)}$). According to the determined first, second and third selection weights, the first, second and third child node characteristics are aggregated to obtain the first target node characteristics (i.e. $H^{(N+1)}$).
Step S104-6, classifying the target object according to the first target node characteristics to obtain a first classification result;
As shown in fig. 4, the network to be trained includes a fifth graph convolution layer, and accordingly, step S104-6 may include: classifying the target objects according to the first target node characteristics through the fifth graph convolution layer to obtain the first classification result. For the specific classification process, reference may be made to the related art; the present application is not particularly limited in this respect.
And step S104-8, if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model.
In one or more embodiments of the present application, determining the target structure of the j-th feature extraction layer may include: determining the maximum selection weight among the selection weights of the M candidate structures included in the current j-th feature extraction layer, wherein 1 ≤ j ≤ N; and determining the candidate structure corresponding to the maximum selection weight as the target structure of the j-th feature extraction layer.
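Deriving the target structures then reduces to an argmax over the learned selection weights; a sketch, assuming the FeatureExtractionLayer above and noise-free weights at derivation time:

```python
import torch

def derive_target_structures(layers):
    """For each feature extraction layer, pick the candidate with the
    maximum selection weight (equivalently, the maximum structural
    parameter, since Softmax is monotonic)."""
    targets = []
    for layer in layers:
        weights = torch.softmax(layer.theta, dim=-1)  # selection weights
        targets.append(int(weights.argmax()))  # index of the target structure
    return targets  # e.g. [0, 2, 1, ...] across the N layers
```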
Further, considering sample graph data of different scales, the time delays corresponding to the network to be trained are often different. In order to better reflect the overall performance of the network to be trained, two loss functions are provided in the embodiment of the application, and a target loss function matched with the sample graph data is selected from the two loss functions according to the scale of the sample graph data to perform loss calculation. Specifically, as shown in fig. 6, the following steps S104-72 and S104-74 may be further included after step S104-6:
step S104-72, determining a target loss function matched with the sample graph data in the first loss function and the second loss function;
The more nodes and edges the sample graph data include, the more complex the sample graph data are and the longer the corresponding latency will be. Based on the first number of nodes and the second number of edges included in the sample graph data, the target loss function corresponding to the sample graph data is determined in the embodiment of the present application. Specifically, steps S104-72 may include: determining a first number of nodes included in the sample graph data and a second number of edges included in the sample graph data; if the first number is smaller than a first preset number and the second number is smaller than a second preset number, determining the first loss function as the target loss function matched with the sample graph data; if the first number is not smaller than the first preset number and the second number is not smaller than the second preset number, determining the second loss function as the target loss function; if the first number is not smaller than the first preset number and the second number is smaller than the second preset number, determining the first loss function as the target loss function; and if the first number is smaller than the first preset number and the second number is not smaller than the second preset number, determining the second loss function as the target loss function. The first preset number and the second preset number can be set as required in practical applications, for example, a first preset number of 10 and a second preset number of 20.
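The selection logic between the two loss functions can be sketched as follows; the preset numbers are the illustrative values named above, and, as the four cases show, the choice reduces to a test on the number of edges:

```python
def select_target_loss(num_nodes, num_edges,
                       first_preset=10, second_preset=20):
    """Choose the target loss function based on the scale of the
    sample graph data, per the four cases above."""
    # num_nodes is listed for completeness; under the four cases above
    # it does not change the outcome.
    if num_edges < second_preset:
        return "first_loss"   # cross-entropy only
    return "second_loss"      # cross-entropy plus latency term
```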
Step S104-74, determining the target loss according to the target loss function and the first classification result.
Specifically, if the target loss function is a first loss function, determining target loss according to the first loss function, the first classification result and the label of each node in the sample graph data; if the target loss function is the second loss function, determining the target loss according to the second loss function, the first classification result, the label of each node in the sample graph data and the time delay corresponding to the current network to be trained.
More specifically, according to the target loss function, a loss calculation process is performed based on the first classification result, so as to obtain the target loss. If the target loss function is a first loss function, carrying out loss calculation processing according to the first loss function based on the first classification result and the label of each node in the sample graph data to obtain target loss; and if the target loss function is a second loss function, performing loss calculation processing according to the second loss function based on the first classification result, the label of each node in the sample graph data and the time delay corresponding to the current network to be trained, so as to obtain the target loss.
In some embodiments, the first loss function may be a cross entropy loss function, i.e. the first loss function may be expressed as: L(c, w_c) = CE(c, w_c), where L(c, w_c) represents the target loss of the network to be trained under network structure c with network parameters w_c, and CE(c, w_c) represents the cross entropy loss function, i.e. it measures, under the current network structure c, the difference between the predicted label of each node in the first classification result (the category the node is predicted to belong to) and its real label (the category the node actually belongs to).
In some embodiments, the second loss function adds a latency term to the first loss function, i.e. the second loss function can be expressed as L(c, w_c) = CE(c, w_c) + γ·LAT(c), where LAT(c) represents the latency of the current network structure c on the target device (i.e., the device where the network to be trained is located), and γ adjusts the magnitude of the latency term and can be set as needed in practical applications. The latency may be determined by estimating the overall time consumption of the network, i.e. in some embodiments LAT(c) = Σ_j Σ_i k_i^(j)·LAT(o_i^(j)), where the sums run over the N feature extraction layers and the M candidate structures in each layer, k_i^(j) represents the selection weight of the i-th candidate structure in the j-th feature extraction layer in the network structure c, o_i^(j) represents the i-th candidate structure in the j-th feature extraction layer in the network structure c, and LAT(o_i^(j)) is a constant that can be measured and determined in advance in practical applications.
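As a minimal sketch under the assumptions above (per-layer softmax selection weights and pre-measured per-candidate latency constants, all held as tensors), the second loss could be computed as follows; the function and argument names, and the default γ, are illustrative:

```python
import torch.nn.functional as F

def latency_aware_loss(logits, labels, selection_weights, latency_table, gamma=0.1):
    """Second loss L(c, w_c) = CE(c, w_c) + gamma * LAT(c).

    selection_weights: list of N tensors, each holding the M softmax
        selection weights k_i^(j) of one feature extraction layer.
    latency_table: matching list of tensors holding the pre-measured
        per-candidate latency constants LAT(o_i^(j)).
    """
    ce = F.cross_entropy(logits, labels)
    # LAT(c): expected latency of the soft architecture, summed over layers
    lat = sum((k * t).sum() for k, t in zip(selection_weights, latency_table))
    return ce + gamma * lat
```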
Since LAT(c) is differentiable with respect to the selection weights k and the structure parameters θ, the first loss function and the second loss function are likewise differentiable with respect to the selection weights k and the structure parameters θ. Therefore, in order to increase the search speed of the optimal network structure, in one or more embodiments of the present application, SGD (Stochastic Gradient Descent) may be used to efficiently optimize the loss function.
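A sketch of this joint optimization follows, building on the latency_aware_loss sketch above. It assumes a model whose forward pass returns both the logits and the per-layer selection weights, and that network, structure_params, loader and latency_table are defined elsewhere; the variable names and learning rates are assumptions for illustration only:

```python
import torch

# structure_params: the learnable structure parameters theta of the candidates,
# assumed here to be kept outside network.parameters() to avoid duplication
optimizer = torch.optim.SGD(
    [{"params": network.parameters()},
     {"params": structure_params, "lr": 3e-3}],
    lr=1e-2,
)

for adj, feats, labels in loader:                     # sample graph batches
    logits, selection_weights = network(adj, feats)   # soft forward pass
    loss = latency_aware_loss(logits, labels, selection_weights, latency_table)
    optimizer.zero_grad()
    loss.backward()                                   # gradients w.r.t. both
    optimizer.step()                                  # the weights and theta
```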
Thus, when the scale of the sample graph data is smaller than the preset scale, training focuses on model quality, i.e., the target loss is calculated with the first loss function; when the scale of the sample graph data is not smaller than the preset scale, the time delay of the model is taken into account alongside model quality, so that the overall performance of the finally obtained object classification model can be improved.
It should be noted that, in combination with the above description and fig. 4, the number of graph convolution layers in the network to be trained provided by the embodiments of the present application is greater than two; that is, an object classification model comprising more than two graph convolution layers is implemented, which can meet the requirements of classification tasks. By providing the embedding layer and generating the first target adjacency matrix through it, the over-smoothing problem of the prior art is avoided. When the loss is calculated, the scale of the sample graph data is taken into account, which further improves the overall performance of the obtained object classification model. In addition, during training the network parameters and the structure parameters of the candidate structures are trained simultaneously rather than separately, which greatly improves training efficiency. Furthermore, it can be understood that, as training iterates, the structure parameters of the candidate structures become increasingly accurate and reach their optimum when the training ending condition is met; the target structure determined based on the optimal structure parameters has high stability, so the obtained object classification model has high stability, and classifying objects based on such a highly stable object classification model ensures the accuracy of the classification results. It should also be noted that fig. 4 is only for illustration and not for limitation; the number of candidate structures included in each feature extraction layer and the composition of each candidate structure may be set as needed in practical applications.
Corresponding to the foregoing steps S104-72 and S104-74, as shown in FIG. 6, the foregoing step S104-8 may include the following step S104-82:
Step S104-82, if the target loss is smaller than the preset loss, determining that the training ending condition is met; and determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model.
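One way to derive the target structures once the training ending condition is met, consistent with claim 6 below, is to keep for each feature extraction layer the candidate whose selection weight is maximal. The sketch below assumes the final selection weights are available as tensors; all names are illustrative:

```python
import torch

def derive_target_structures(selection_weights, candidate_layers):
    """For each of the N feature extraction layers, keep the candidate
    structure whose final selection weight is maximal."""
    return [layer[int(torch.argmax(k))]
            for k, layer in zip(selection_weights, candidate_layers)]
```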
Considering that the overall performance of the network to be trained is already substantially stable once training has run for a certain number of rounds, in some embodiments, determining that the training ending condition is met may include: if the current total number of training rounds reaches a preset number of rounds, determining that the training ending condition is met. The preset number of training rounds can be set as needed in practical applications.
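Combining both embodiments, a stopping check might look as follows; the threshold values are illustrative placeholders, not values from the application:

```python
def training_should_end(target_loss: float, current_round: int,
                        preset_loss: float = 0.05,
                        preset_rounds: int = 200) -> bool:
    """End condition: the target loss falls below the preset loss, or
    the total number of training rounds reaches the preset number."""
    return target_loss < preset_loss or current_round >= preset_rounds
```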
In one or more embodiments of the present application, a network to be trained is proposed that comprises N feature extraction layers, each comprising M candidate structures. In the process of training the network, the target objects corresponding to the sample graph data are classified according to the first initial adjacency matrix of the sample graph data, the first node features and the structure parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, a target structure of each feature extraction layer is determined according to the structure parameters at that time, to obtain an object classification model. With this training approach, because the structure parameters of the candidate structures are parameters to be trained, they can be continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model, so that the object classification model is robust to disturbance by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
Corresponding to the training method of the object classification model described above, one or more embodiments of the present application further provide an object classification method based on the same technical concept. FIG. 7 is a flow diagram of a method of classifying objects according to one or more embodiments of the application, where the method of FIG. 7 can be performed by an object classification device; the object classification device may be provided in an electronic apparatus. The electronic device may be a terminal device or a server. The terminal equipment can be a mobile phone, a tablet computer, a desktop computer, a portable notebook computer and the like; the server may be an independent server or a server cluster composed of a plurality of servers. As shown in fig. 7, the method includes the steps of:
Step S202, acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the plurality of nodes in the graph data to be processed are in one-to-one correspondence with a plurality of objects to be classified;
The method for obtaining the second initial adjacency matrix is the same as the method for obtaining the first initial adjacency matrix; reference may be made to the foregoing related description, and the repetition is omitted here. The fourth node feature may include the same content as the first node feature, or a part of the content included in the first node feature.
Step S204, inputting the second initial adjacency matrix and the fourth node characteristics into an object classification model for classification processing to obtain a second classification result.
Specifically, the second initial adjacency matrix and the fourth node characteristics are input into the object classification model; the object classification model determines second target node characteristics of the plurality of nodes of the graph data to be processed according to the second initial adjacency matrix and the fourth node characteristics, and classifies the objects to be classified according to the second target node characteristics to obtain a second classification result. The object classification model is obtained through training according to the training method of the object classification model provided by the foregoing embodiments.
Taking the network shown in fig. 4 as the network to be trained as an example, more specifically, step S204 may include: inputting the second initial adjacency matrix and the fourth node characteristics into the embedding layer of the object classification model, and generating, through the embedding layer, a second target adjacency matrix of the graph data to be processed according to the second initial adjacency matrix and the model parameters of the object classification model; generating fifth node characteristics of the graph data to be processed according to the second target adjacency matrix and the fourth node characteristics through the first graph convolution layer; performing feature extraction processing on the fifth node characteristics through the N feature extraction layers to obtain second target node characteristics; and classifying the objects to be classified according to the second target node characteristics through the fifth graph convolution layer to obtain a second classification result. For the specific implementation of each step, reference may be made to the foregoing related description, which is not repeated here.
It should be noted that, since each feature extraction layer of the object classification model is a trained target structure, the foregoing aggregation of the sub-node features of the individual candidate structures according to the selection weights is no longer required. That is, each feature extraction layer simply performs feature extraction processing on its input data to obtain output data; the input data of the first feature extraction layer are the fifth node characteristics and the second target adjacency matrix, and the input data of the n-th feature extraction layer are the sixth node characteristics output by the (n-1)-th feature extraction layer and the second target adjacency matrix, where 2 ≤ n ≤ N; the N-th feature extraction layer outputs the second target node characteristics.
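The inference path could therefore be sketched as below, with each feature extraction layer reduced to its single trained target structure. The class and layer names, and the (features, adjacency) call signature of the layers, are illustrative assumptions, not the patent's implementation:

```python
import torch.nn as nn

class ObjectClassifier(nn.Module):
    """Sketch of the inference path of step S204 (cf. fig. 4)."""

    def __init__(self, embedding_layer, first_conv, target_structures, out_conv):
        super().__init__()
        self.embedding_layer = embedding_layer   # builds the target adjacency matrix
        self.first_conv = first_conv             # first graph convolution layer
        self.extractors = nn.ModuleList(target_structures)  # N trained target structures
        self.out_conv = out_conv                 # final classification layer

    def forward(self, init_adj, node_feats):
        target_adj = self.embedding_layer(init_adj)      # second target adjacency matrix
        h = self.first_conv(node_feats, target_adj)      # fifth node characteristics
        for extractor in self.extractors:                # no weighted aggregation:
            h = extractor(h, target_adj)                 # one structure per layer
        return self.out_conv(h, target_adj)              # second classification result
```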
In the embodiments of the present application, when graph data to be processed is obtained, a second initial adjacency matrix of the graph data to be processed and the fourth node characteristics of each node in the graph data to be processed are acquired, and the second initial adjacency matrix and the fourth node characteristics are input into the object classification model for classification processing to obtain a second classification result. The object classification model used for classifying the graph data to be processed is obtained by training a network to be trained comprising N feature extraction layers, each comprising M candidate structures, in which the structure parameters of the candidate structures are parameters to be trained, so that the structure parameters can be continuously and automatically learned and optimized during iterative training and are optimal when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model; classifying the graph data to be processed with such a highly stable object classification model avoids disturbance by various factors and ensures the accuracy of the classification results.
Corresponding to the above-described training method of the object classification model, one or more embodiments of the present application further provide a training device of the object classification model based on the same technical concept. FIG. 8 is a schematic block diagram of a training apparatus for object classification models according to one or more embodiments of the present application, where the apparatus includes:
An obtaining module 301, configured to obtain a first initial adjacency matrix of sample graph data, and first node features of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
The training module 302 is configured to perform iterative training on a network to be trained by using the first initial adjacency matrix and the first node feature, so as to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the target object according to the first initial adjacency matrix, the first node characteristics and the structure parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
When training a network to be trained that comprises N feature extraction layers, each comprising M candidate structures, the training device for the object classification model provided by the embodiments of the present application classifies the target objects corresponding to the sample graph data according to the first initial adjacency matrix of the sample graph data, the first node features and the structure parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, determines a target structure of each feature extraction layer according to the structure parameters at that time, to obtain an object classification model. With this training approach, because the structure parameters of the candidate structures are parameters to be trained, they can be continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model, so that the object classification model is robust to disturbance by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
It should be noted that, the embodiment of the training device related to the object classification model in the present application and the embodiment of the training method related to the object classification model in the present application are based on the same inventive concept, so the specific implementation of the embodiment may refer to the implementation of the foregoing corresponding training method of the object classification model, and the repetition is omitted.
The respective modules in the training apparatus of the object classification model may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in, or independent of, a processor of the terminal device or server in the form of hardware, or may be stored in the form of software in a memory of the terminal device or server, so that the processor can call and execute the operations corresponding to the above modules.
Further, corresponding to the above-described object classification method, one or more embodiments of the present application further provide an object classification device based on the same technical concept. Fig. 9 is a schematic block diagram of an object classification device according to one or more embodiments of the present application, where, as shown in fig. 9, the device includes:
An obtaining module 401, configured to obtain a second initial adjacency matrix of graph data to be processed and a fourth node characteristic of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
The classification module 402 is configured to input the second initial adjacency matrix and the fourth node feature into an object classification model for classification processing, so as to obtain a second classification result; the object classification model is obtained by training according to the training method of the object classification model provided by any embodiment.
When the object classification device provided by the embodiments of the present application obtains graph data to be processed, it acquires a second initial adjacency matrix of the graph data to be processed and the fourth node characteristics of each node in the graph data to be processed, and inputs the second initial adjacency matrix and the fourth node characteristics into the object classification model for classification processing to obtain a second classification result. The object classification model used for classifying the graph data to be processed is obtained by training a network to be trained comprising N feature extraction layers, each comprising M candidate structures, in which the structure parameters of the candidate structures are parameters to be trained, so that the structure parameters can be continuously and automatically learned and optimized during iterative training and are optimal when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model; classifying the graph data to be processed with such a highly stable object classification model avoids disturbance by various factors and ensures the accuracy of the classification results.
It should be noted that, the embodiment of the object classification device according to the present application and the embodiment of the object classification method according to the present application are based on the same inventive concept, so that the specific implementation of the embodiment may refer to the implementation of the corresponding object classification method, and the repetition is omitted.
The respective modules in the above-described object classification apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in, or independent of, a processor of the terminal device or server in the form of hardware, or may be stored in the form of software in a memory of the terminal device or server, so that the processor can call and execute the operations corresponding to the above modules.
Further, according to the training method and the object classification method of the object classification model described above, based on the same technical concept, one or more embodiments of the present application further provide an electronic device, where the electronic device is configured to execute the training method and the object classification method of the object classification model described above, and fig. 10 is a schematic structural diagram of an electronic device provided by one or more embodiments of the present application.
As shown in fig. 10, the electronic device may vary considerably depending on its configuration or performance, and may include one or more processors 501 and a memory 502, where the memory 502 may store one or more applications or data. The memory 502 may be transient or persistent storage. The application programs stored in the memory 502 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the electronic device. Still further, the processor 501 may be configured to communicate with the memory 502 and to execute, on the electronic device, the series of computer-executable instructions in the memory 502. The electronic device may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input/output interfaces 505, one or more keyboards 506, and the like.
In a particular embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the electronic device; the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for:
Acquiring a first initial adjacency matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
Performing iterative training on the network to be trained by using the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the target object according to the first initial adjacency matrix, the first node characteristics and the structure parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
When the electronic device provided by one or more embodiments of the present application trains a network to be trained that comprises N feature extraction layers, each comprising M candidate structures, it classifies the target objects corresponding to the sample graph data according to the first initial adjacency matrix of the sample graph data, the first node features and the structure parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, determines a target structure of each feature extraction layer according to the structure parameters at that time, to obtain an object classification model. With this training approach, because the structure parameters of the candidate structures are parameters to be trained, they can be continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model, so that the object classification model is robust to disturbance by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
In another particular embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the electronic device; the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for:
Acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; a plurality of nodes in the graph data to be processed are in one-to-one correspondence with objects to be classified;
Inputting the second initial adjacency matrix and the fourth node characteristic into an object classification model for classification treatment to obtain a second classification result; the object classification model is obtained by training according to the training method of the object classification model provided by any embodiment.
When the electronic device provided by the embodiments of the present application acquires graph data to be processed, it acquires a second initial adjacency matrix of the graph data to be processed and the fourth node characteristics of each node in the graph data to be processed, and inputs the second initial adjacency matrix and the fourth node characteristics into the object classification model for classification processing to obtain a second classification result. The object classification model used for classifying the graph data to be processed is obtained by training a network to be trained comprising N feature extraction layers, each comprising M candidate structures, in which the structure parameters of the candidate structures are parameters to be trained, so that the structure parameters can be continuously and automatically learned and optimized during iterative training and are optimal when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model; classifying the graph data to be processed with such a highly stable object classification model avoids disturbance by various factors and ensures the accuracy of the classification results.
It should be noted that, in the embodiment of the present application related to the electronic device and the embodiment of the present application related to the training method and the object classification method of the object classification model are based on the same inventive concept, so the specific implementation of the embodiment may refer to the implementation of the foregoing corresponding training method and the object classification method of the object classification model, and the repetition is not repeated.
Further, corresponding to the training method and the object classification method of the object classification model described above, and based on the same technical concept, one or more embodiments of the present application further provide a storage medium for storing computer-executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer-executable instructions stored therein implement the following flow when executed by a processor:
Acquiring a first initial adjacency matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
Performing iterative training on the network to be trained by using the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the target object according to the first initial adjacency matrix, the first node characteristics and the structure parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
When the computer-executable instructions stored in the storage medium provided by one or more embodiments of the present application are executed by the processor, a network to be trained comprising N feature extraction layers, each comprising M candidate structures, is trained; during training, the target objects corresponding to the sample graph data are classified according to the first initial adjacency matrix of the sample graph data, the first node features and the structure parameters currently to be trained of each candidate structure, to obtain a first classification result; and when it is determined that the training ending condition is met, a target structure of each feature extraction layer is determined according to the structure parameters at that time, to obtain an object classification model. With this training approach, because the structure parameters of the candidate structures are parameters to be trained, they can be continuously and automatically learned and optimized during iterative training and reach their optimum when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model, so that the object classification model is robust to disturbance by various factors. That is, the performance of the object classification model is improved, and the accuracy of its classification results is further improved.
In another specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer-executable instructions stored therein implement the following flow when executed by a processor:
acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
Inputting the second initial adjacency matrix and the fourth node characteristic into an object classification model for classification treatment to obtain a second classification result; the object classification model is obtained by training according to the training method of the object classification model provided by any embodiment.
When the computer-executable instructions stored in the storage medium provided by one or more embodiments of the present application are executed by the processor, upon obtaining graph data to be processed, a second initial adjacency matrix of the graph data to be processed and the fourth node characteristics of each node in the graph data to be processed are acquired, and the second initial adjacency matrix and the fourth node characteristics are input into the object classification model for classification processing to obtain a second classification result. The object classification model used for classifying the graph data to be processed is obtained by training a network to be trained comprising N feature extraction layers, each comprising M candidate structures, in which the structure parameters of the candidate structures are parameters to be trained, so that the structure parameters can be continuously and automatically learned and optimized during iterative training and are optimal when the training ending condition is met; the object classification model comprises the target structures determined based on the optimal structure parameters, which ensures the stability of the target structures and hence of the object classification model; classifying the graph data to be processed with such a highly stable object classification model avoids disturbance by various factors and ensures the accuracy of the classification results.
It should be noted that, in the embodiment of the present application related to the storage medium and the embodiment of the training method and the object classification method related to the object classification model in the present application are based on the same inventive concept, so the specific implementation of the embodiment may refer to the implementation of the foregoing corresponding training method and the object classification method of the object classification model, and the repetition is not repeated.
The foregoing describes certain embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as a hardware improvement (e.g., an improvement to a circuit structure such as a diode, transistor or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures: designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented with hardware entity modules. For example, a Programmable Logic Device (PLD) (e.g., a Field Programmable Gate Array (FPGA)) is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring the chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually manufacturing integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled is written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained merely by slightly logic-programming the method flow into an integrated circuit using one of the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller implements the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and so on. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing the various functions may also be regarded as structures within the hardware component. Or even the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each unit may be implemented in the same piece or pieces of software and/or hardware when implementing the embodiments of the present application.
It will be appreciated by those skilled in the art that one or more embodiments of the application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.
One or more embodiments of the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments of the present application are described in a progressive manner, and the same and similar parts of the embodiments are all referred to each other, and each embodiment is mainly described in the differences from the other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing description is by way of example only and is not intended to limit the present disclosure. Various modifications and changes may occur to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. that fall within the spirit and principles of the present document are intended to be included within the scope of the claims of the present document.
Claims (15)
1. A method of training an object classification model, comprising:
Acquiring a first initial adjacency matrix of sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
Performing iterative training on the network to be trained by using the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the plurality of target objects according to the first initial adjacency matrix, the first node characteristics and the structural parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
2. The method of claim 1, wherein classifying the plurality of target objects according to the first initial adjacency matrix, the first node characteristics, and the structural parameters currently to be trained for each of the candidate structures to obtain a first classification result comprises:
Generating second node features of the plurality of nodes according to the first initial adjacency matrix and the first node features;
Generating first target node characteristics of the plurality of nodes according to the second node characteristics and the structure parameters to be trained currently of each candidate structure;
and classifying the plurality of target objects according to the first target node characteristics to obtain a first classification result.
3. The method of claim 2, wherein generating a second node characteristic of the plurality of nodes from the first initial adjacency matrix and the first node characteristic comprises:
determining a first target adjacency matrix of the sample graph data according to the first initial adjacency matrix and the current network parameters of the network to be trained;
and generating second node characteristics of the plurality of nodes according to the first node characteristics and the first target adjacency matrix.
4. A method according to claim 3, wherein said generating a first target node characteristic of said plurality of nodes based on said second node characteristic and a structural parameter currently to be trained for each of said candidate structures comprises:
determining the selection weight of each candidate structure based on the structure parameters to be trained currently of each candidate structure;
And generating first target node characteristics of the nodes according to the second node characteristics and the selection weights.
5. The method of claim 4, wherein generating a first target node feature of the plurality of nodes based on the second node feature and the selection weight comprises:
Performing feature extraction processing through the N feature extraction layers to obtain first target node features of the plurality of nodes; the input data of the first feature extraction layer are the second node feature and the first target adjacency matrix, and the input data of the n-th feature extraction layer are the third node feature output by the (n-1)-th feature extraction layer and the first target adjacency matrix, wherein n is more than or equal to 2 and less than or equal to N;
the feature extraction processing is performed by the j-th feature extraction layer, including:
Inputting input data into M candidate structures of a j-th feature extraction layer, and performing feature extraction processing on the input data through the M candidate structures to obtain M sub-node features, wherein j is more than or equal to 1 and less than or equal to N;
and determining output data of the j-th feature extraction layer according to the selection weights of M candidate structures of the j-th feature extraction layer and the M sub-node features.
6. The method of claim 4, wherein determining the target structure of the j-th feature extraction layer comprises:
Determining the maximum selection weight among the selection weights of the M candidate structures included in the current j-th feature extraction layer, wherein j is more than or equal to 1 and less than or equal to N;
And determining the candidate structure corresponding to the maximum selection weight as a target structure of the j-th feature extraction layer.
7. The method of claim 6, wherein determining that the end training condition is met based on the first classification result comprises:
determining a target loss function matched with the sample graph data in the first loss function and the second loss function;
determining target loss according to the target loss function and the first classification result;
and if the target loss is smaller than the preset loss, determining that the training ending condition is met.
8. The method of claim 7, wherein determining a target loss function of the first and second loss functions that matches the sample map data comprises:
Determining a first number of nodes included in the sample graph data and a second number of edges included in the sample graph data;
If the first number is smaller than a first preset number and the second number is smaller than a second preset number, determining the first loss function as a target loss function matched with the sample graph data;
if the first number is not smaller than a first preset number and the second number is not smaller than a second preset number, determining the second loss function as a target loss function matched with the sample graph data;
If the first number is not smaller than a first preset number and the second number is smaller than a second preset number, determining the first loss function as a target loss function matched with the sample graph data;
And if the first quantity is smaller than a first preset quantity and the second quantity is not smaller than a second preset quantity, determining the second loss function as a target loss function matched with the sample graph data.
9. The method of claim 7, wherein the objective loss function is the first loss function, and wherein determining the objective loss based on the objective loss function and the first classification result comprises: determining target loss according to the first loss function, the first classification result and the label of each node in the sample graph data;
the target loss function is the second loss function, and the determining the target loss according to the target loss function and the first classification result includes: and determining target loss according to the second loss function, the first classification result, the label of each node in the sample graph data and the time delay corresponding to the current network to be trained.
10. An object classification method, comprising:
acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
Inputting the second initial adjacency matrix and the fourth node characteristic into an object classification model for classification treatment to obtain a second classification result; the object classification model is trained according to the training method of the object classification model of any one of claims 1-9.
11. The method of claim 10, wherein the performing a classification process to obtain a second classification result comprises:
determining second target node characteristics of a plurality of nodes of the graph data to be processed according to the second initial adjacency matrix and the fourth node characteristics;
and classifying the object to be classified according to the second target node characteristics to obtain a second classification result.
12. A training device for an object classification model, comprising:
the acquisition module is used for acquiring a first initial adjacency matrix of the sample graph data and first node characteristics of a plurality of nodes in the sample graph data; the nodes are in one-to-one correspondence with the target objects;
the training module is used for carrying out iterative training on the network to be trained by utilizing the first initial adjacency matrix and the first node characteristics to obtain an object classification model;
The network to be trained comprises N feature extraction layers, wherein each feature extraction layer comprises M candidate structures; the iterative training includes: classifying the plurality of target objects according to the first initial adjacency matrix, the first node characteristics and the structural parameters to be trained currently of each candidate structure to obtain a first classification result; if the training ending condition is determined to be met, determining a target structure of each feature extraction layer according to the structural parameters when the training ending condition is met, and obtaining an object classification model; determining that the training ending condition is met comprises determining that the training ending condition is met according to the first classification result; n and M are integers greater than 1.
13. An object classification apparatus, comprising:
The acquisition module is used for acquiring a second initial adjacency matrix of the graph data to be processed and fourth node characteristics of a plurality of nodes in the graph data to be processed; the nodes in the graph data to be processed are in one-to-one correspondence with the objects to be classified;
the classification module is used for inputting the second initial adjacency matrix and the fourth node characteristic into an object classification model for classification processing to obtain a second classification result; the object classification model is trained according to the training method of any one of claims 1-9.
14. An electronic device, comprising:
A processor; and
A memory arranged to store computer-executable instructions configured to be executed by the processor, the executable instructions, when executed, performing the training method of the object classification model according to any one of claims 1-9, or performing the object classification method according to any one of claims 10-11.
15. A computer-readable storage medium storing computer-executable instructions that, when executed, cause a computer to perform the training method of the object classification model according to any one of claims 1-9, or the object classification method according to any one of claims 10-11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311255275.5A CN117953258A (en) | 2023-09-26 | 2023-09-26 | Training method of object classification model, object classification method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117953258A (en) | 2024-04-30
Family
ID=90800124
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311255275.5A Pending CN117953258A (en) | 2023-09-26 | 2023-09-26 | Training method of object classification model, object classification method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117953258A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118193797A (en) * | 2024-05-17 | 2024-06-14 | 之江实验室 | Method and device for executing service, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||