CN116402108A - Model training and graph data processing method, device, medium and equipment - Google Patents


Info

Publication number
CN116402108A
Authority
CN
China
Prior art keywords
model
loss
training sample
training
target model
Prior art date
Legal status
Pending
Application number
CN202310264181.8A
Other languages
Chinese (zh)
Inventor
许轶珂
吴若凡
田胜
但家旺
王宝坤
孟昌华
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202310264181.8A
Publication of CN116402108A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks

Abstract

The specification discloses a model training method and a graph data processing method, together with a corresponding apparatus, medium and device. The generative adversarial network includes a target model and an adversarial model. Training samples determined from graph data are input into the target model to obtain the target model's prediction for each training sample, and the loss of the target model in predicting the training samples is determined from the predictions and the annotations of the training samples as a first loss. The training samples are also input into the adversarial model, which simulates an adjustment of the distribution of the training samples. Based on the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution is estimated as a second loss. The target model is trained with the goal of minimizing the second loss, and the adversarial model is trained with the goal of maximizing the second loss. Through the adversarial training between the target model and the adversarial model in the generative adversarial network, the worst-case performance of the target model is maximized and the robustness of the target model is improved.

Description

Model training and graph data processing method, device, medium and equipment
Technical Field
The present disclosure relates to the field of machine learning technologies, and in particular, to a model training and graph data processing method, apparatus, medium, and device.
Background
Currently, users are increasingly concerned about their own private data. In the field of machine learning, the training data used to train a model is often different from the task data on which the trained model is applied when performing tasks.
If this difference leads to a distribution shift between the training data and the task data, the performance of the model may degrade when it makes predictions on the task data.
This is especially the case for graph-structured data: more factors can cause differences between graphs, so a distribution shift between the training data and the task data is more likely to occur, and the performance of the model is harder to maintain.
Therefore, how to guarantee the performance of a model when predicting on graph data with a different distribution is a problem to be solved.
Disclosure of Invention
The present disclosure provides a model training method, a graph data processing method, and a corresponding apparatus, medium and device, so as to at least partially solve the above problems in the prior art.
The technical solution adopted in this specification is as follows:
the present specification provides a model training method for a generative adversarial network comprising a target model and an adversarial model, the method comprising:
determining training samples according to graph data, and obtaining annotations of the training samples;
inputting the training samples into the target model to obtain the target model's prediction for each training sample, and determining, according to the predictions and the annotations of the training samples, the loss of the target model in predicting the training samples as a first loss;
inputting the training samples into the adversarial model, so that the adversarial model simulates an adjustment of the distribution of the training samples;
estimating, according to the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as a second loss;
and training the target model with the goal of minimizing the second loss, and training the adversarial model with the goal of maximizing the second loss.
The specification provides a graph data processing method, comprising:
receiving graph data to be processed;
inputting the graph data to be processed into a trained target model, the target model being the target model in a generative adversarial network trained by the above model training method;
and obtaining the processing result of the graph data to be processed output by the target model.
The present specification provides a model training apparatus for a generative adversarial network comprising a target model and an adversarial model, the apparatus comprising:
a sample determining module, configured to determine training samples according to graph data and obtain annotations of the training samples;
a prediction module, configured to input the training samples into the target model to obtain the target model's prediction for each training sample, and to determine, according to the predictions and the annotations of the training samples, the loss of the target model in predicting the training samples as a first loss;
a simulation module, configured to input the training samples into the adversarial model, so that the adversarial model simulates an adjustment of the distribution of the training samples;
an estimation module, configured to estimate, according to the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as a second loss;
and a training module, configured to train the target model with the goal of minimizing the second loss, and to train the adversarial model with the goal of maximizing the second loss.
The present specification provides a graph data processing apparatus, comprising:
a receiving module, configured to receive graph data to be processed;
a processing module, configured to input the graph data to be processed into a trained target model, the target model being the target model in a generative adversarial network trained by the above model training method;
and a prediction module, configured to obtain the processing result of the graph data to be processed output by the target model.
The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the above model training or graph data processing method.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above-described method of model training or graph data processing when executing the program.
At least one of the technical solutions adopted in this specification can achieve the following beneficial effects:
In the model training method provided in this specification, the generative adversarial network includes a target model and an adversarial model. Training samples determined from graph data are input into the target model to obtain the target model's prediction for each training sample, and the loss of the target model in predicting the training samples is determined from the predictions and the annotations of the training samples as a first loss. The training samples are also input into the adversarial model, which simulates an adjustment of the distribution of the training samples. Based on the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution is estimated as a second loss. Finally, the target model is trained with the goal of minimizing the second loss, and the adversarial model is trained with the goal of maximizing the second loss.
As can be seen from the above method, through the adversarial training between the target model and the adversarial model in the generative adversarial network, in which the two models work against each other, the adversarial model can simulate adjusting the distribution of the training samples toward the distribution under which the target model performs worst, while the target model learns to maximize the accuracy of its predictions even when its input follows that worst-case distribution. That is, the worst-case performance of the target model is maximized, and the robustness of the target model is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate the exemplary embodiments of the present specification and, together with the description, serve to explain them; they are not intended to limit the specification unduly. In the drawings:
FIG. 1 is a schematic flow chart of a model training method in the present specification;
FIG. 2 is a schematic diagram of a model training process provided herein;
FIG. 3 is a schematic flow chart of a method for processing graph data provided in the present disclosure;
FIG. 4 is a schematic diagram of a model training apparatus provided herein;
FIG. 5 is a schematic diagram of a processing device for graph data provided in the present specification;
fig. 6 is a schematic structural diagram of an electronic device provided in the present specification.
Detailed Description
In this specification, the target model is a model trained by the model training method provided herein. The model training method can be applied to risk control scenarios, and the target model can be used to output risk control results.
In order to at least partially solve the above problems and to improve the performance of the trained target model when it faces data with a different distribution, this specification trains the target model with a distributionally robust optimization approach, so that the lower bound of the trained target model's performance is raised.
In this specification, the target model may be the classifier in the generative adversarial network. The generative adversarial network also includes an adversary (hereinafter referred to as the adversarial model). When the target model is trained, it is trained against the adversarial model with opposing training objectives. That is, this specification trains the target model and the adversarial model against each other, so that each model can be trained to perform well and remain robust even when the other model interferes with it.
Specifically, the performance degradation of existing models arises as follows: the distribution of the training samples is fixed when the model is trained, while the distribution of the input data changes when the model is applied for prediction. If training samples with different distributions could be used when training the model, the performance of the model could be improved.
Therefore, in order to improve the performance of the target model, this specification changes the distribution of the training samples while training the target model, so that the target model can be trained to perform as well as possible on training samples whose distribution makes the target model itself perform worst.
In order to reduce training complexity and computation cost, this specification does not prepare multiple groups of training samples with different predetermined distributions to train the target model separately. Instead, the distribution of the training samples is adjusted through simulation by the adversarial model: the adversarial model is trained with the goal of simulating the distribution under which the target model performs worst, and the target model is trained with the goal of minimizing its prediction loss on training samples under that simulated worst-case distribution. In this way, the lower bound of the target model's performance can be maximized and the robustness of the target model improved. A compact formulation of this objective is given below.
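For illustration only, the above min-max training objective can be written compactly as follows. The notation is introduced here and is not taken from the specification: f_θ denotes the target model, ℓ the per-sample first loss (e.g., cross entropy), g_φ the adversarial model's perturbation score, w_φ the normalized perturbation weight, and (x_i, y_i), i = 1, ..., N, the training samples with their annotations.

```latex
\min_{\theta}\;\max_{\phi}\;
L_{2}(\theta,\phi)
= \sum_{i=1}^{N} w_{\phi}(x_i)\,\ell\bigl(f_{\theta}(x_i),\,y_i\bigr),
\qquad
w_{\phi}(x_i) = \frac{\exp\bigl(g_{\phi}(x_i)\bigr)}{\sum_{j=1}^{N}\exp\bigl(g_{\phi}(x_j)\bigr)}.
```

Here L_2 corresponds to the second loss: the adversarial model raises it by shifting weight toward samples the target model predicts poorly, while the target model lowers it, which amounts to minimizing a worst-case re-weighted loss.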
In this way, the error of the target model when predicting data distributed differently from the training samples can be minimized, and the accuracy of the target model on data distributed differently from the training samples can be improved.
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a model training method in this specification. The generative adversarial network includes a target model and an adversarial model, and the method specifically includes the following steps:
S100: determine training samples according to the graph data, and obtain annotations of the training samples.
In this specification, the model training method may be performed by a server. The server may be a single server or may be a distributed system formed by a plurality of servers, which is not limited herein.
In one or more embodiments of the present disclosure, the server may train, by the model training method, a target model used for risk control.
The target model may be a graph neural network, and the target model may predict a risk control result based on graph data. For example, the target model may predict, based on the graph data, whether the object corresponding to the graph data (such as a user or a service) is at risk, or it may predict a risk control policy, as the risk control result. The following description takes the case where the risk control result indicates whether a user is at risk as an example; an illustrative sketch of such a target model is given below.
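As an illustration only (the specification does not fix a particular architecture), such a target model could be a small graph neural network that classifies a user-centered subgraph as risky or not. The sketch below uses PyTorch Geometric; the class name, layer choices and sizes are assumptions.

```python
# Illustrative sketch of a subgraph-level risk classifier (assumed architecture).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool


class TargetModel(torch.nn.Module):
    """Graph neural network that scores a user-centered subgraph for risk."""

    def __init__(self, in_dim: int, hidden_dim: int = 64, num_classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.classifier = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index, batch):
        # Two rounds of message passing over the nodes of the subgraphs.
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        # Pool node embeddings into one embedding per subgraph (i.e., per user).
        g = global_mean_pool(h, batch)
        # Logits over, e.g., {no risk, risk}.
        return self.classifier(g)
```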
Thus, the server can determine each training sample from the graph data and obtain annotations for each training sample.
In one or more embodiments of the present disclosure, when determining the training samples, the server may first obtain the graph data and, for each node in the graph data, determine the subgraph corresponding to that node in the graph data. After determining the subgraph corresponding to the node, the server may determine a training sample according to that subgraph.
In one or more embodiments of the present description, each user in the graph data may be a node. For each user in the graph data, the server can determine, within a preset range, the other users that have a direct or indirect business relationship with that user, and determine the subgraph corresponding to the user from the user and the determined other users.
One user corresponds to one training sample. For each user, the server may use whether the user has historically been at risk as the annotation of the training sample corresponding to that user.
For each user, the subgraph corresponding to the user may be obtained by sampling or clipping the graph data within a preset range centered on the user. That is, the subgraph may consist of the user and at least some of the user's neighbor nodes in the graph data. The user's neighbor nodes are not limited to 1-hop neighbors, i.e., nodes connected to the user within one order (hop); they may also include nodes connected to the user within 2 orders, 3 orders, and so on, which can be set as needed and is not limited in this specification.
In one or more embodiments of the present description, an edge between nodes may represent a historical service between the nodes. Taking sampling as an example, for each user, the other users within the preset range that are directly and/or indirectly connected to the user through a target service can be determined, centered on that user, as target users. The subgraph corresponding to the user is then determined according to the user and the target users.
In one or more embodiments of the present specification, a service of a specified type may be regarded as the target service; for example, the type of service with the highest probability of risk among the historically executed services may be taken as the target service.
Alternatively, a service that has historically been identified as a risky service may also be regarded as a target service.
In one or more embodiments of the present disclosure, when determining a training sample according to the subgraph corresponding to the node, the server may determine, from that subgraph, the node features of the node and of each of its neighbor nodes, and use these node features as the training sample.
Alternatively, in one or more embodiments of the present disclosure, the server may determine, from the subgraph corresponding to the node, the node features of the node and of its neighbor nodes as well as the edges between the node and its neighbor nodes, and use the node features together with the edges as the training sample. That is, the subgraph of the node may also be used as the training sample.
The node features may include attribute data of the user represented by the node, for example the age of the user, whether the user has historically been identified as a risky user, the time at which the user was identified as a risky user, whether the user is a blacklisted user, whether the user is a whitelisted user, a reputation score on the business platform, and so on. A sketch of this sample-construction step is given below.
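The following is a minimal sketch of building one training sample per user node, assuming the graph data is held in a NetworkX graph whose node attributes carry the features and the historical-risk annotation; the helper name, attribute keys and the 2-hop range are assumptions, not taken from the specification.

```python
# Illustrative sketch: build one training sample per user node from the graph data.
import networkx as nx


def build_training_samples(graph: nx.Graph, hops: int = 2):
    samples = []
    for user in graph.nodes:
        # Subgraph centered on the user, covering neighbors within `hops` orders.
        subgraph = nx.ego_graph(graph, user, radius=hops)
        sample = {
            "center": user,
            "nodes": list(subgraph.nodes),
            "edges": list(subgraph.edges),  # historical services between users
            "features": {n: graph.nodes[n].get("features") for n in subgraph.nodes},
        }
        # Annotation: whether the user has historically been at risk.
        label = graph.nodes[user].get("historically_risky", 0)
        samples.append((sample, label))
    return samples
```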
S102: and inputting each training sample into the target model to obtain a prediction result of the target model on each training sample, and determining the loss of the target model for predicting each training sample according to the prediction result and the labels of each training sample as a first loss.
In one or more embodiments of the present disclosure, the server may input each training sample into the target model to obtain a predicted result of the target model for each training sample.
And then, according to the prediction result and the labels of the training samples, determining the loss of the target model for predicting the training samples as the first loss.
In the present specification, the manner of determining the first loss is not limited, and for example, the first loss may be determined by a cross entropy loss function. Of course, the first loss may be determined in other ways, and may be set as desired.
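A minimal sketch of the first loss, assuming a PyTorch implementation and a batched form of the subgraph samples from the earlier sketches; computing the loss per sample (rather than averaged) is an assumption that makes the later re-weighting step straightforward.

```python
import torch
import torch.nn.functional as F


def compute_first_loss(target_model: torch.nn.Module, batch, labels: torch.Tensor) -> torch.Tensor:
    """Per-sample prediction loss of the target model (the first loss)."""
    # `batch` is assumed to be a torch_geometric.data.Batch of user-centered subgraphs.
    logits = target_model(batch.x, batch.edge_index, batch.batch)
    # reduction="none" keeps one loss value per training sample, so the
    # perturbation weights can reweight the losses later.
    return F.cross_entropy(logits, labels, reduction="none")
```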
S104: inputting the training samples into a countermeasure model to adjust the distribution of the training samples through the countermeasure model simulation.
In one or more embodiments of the present disclosure, the server may further input each training sample into the challenge model to adjust the distribution of each training sample through the challenge model simulation.
S106: and predicting the predicted loss of each training sample after the simulated adjustment distribution by the target model according to the first loss, and taking the predicted loss as a second loss.
Since the first loss is a loss of predicting the original training sample by the target model, and the adjusted training sample is adjusted based on the original training sample, in one or more embodiments of the present disclosure, the server may predict, as the second loss, a loss of predicting each training sample after the adjusted distribution is simulated by the target model according to the first loss.
S108: and training the target model by taking the minimum second loss as a target, and training the countermeasure model by taking the maximum second loss as a target.
In order to enable the challenge model to adjust to a distribution that makes the target model perform worst (hereinafter simply referred to as the worst distribution for convenience of description), the server may train the challenge model with the second loss maximum as a target.
In order to minimize the prediction error of the target model for the worst distributed training samples, the server may train the target model with the second loss minimized.
Based on the model training method shown in Fig. 1, the generative adversarial network includes a target model and an adversarial model. Training samples determined from graph data are input into the target model to obtain the target model's prediction for each training sample, and the loss of the target model in predicting the training samples is determined from the predictions and the annotations of the training samples as a first loss. The training samples are also input into the adversarial model, which simulates an adjustment of the distribution of the training samples. Based on the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution is estimated as a second loss. Finally, the target model is trained with the goal of minimizing the second loss, and the adversarial model is trained with the goal of maximizing the second loss.
As can be seen from the above method, through the adversarial training between the target model and the adversarial model in the generative adversarial network, in which the two models work against each other, the adversarial model can simulate adjusting the distribution of the training samples toward the distribution under which the target model performs worst, while the target model maximizes the accuracy of its predictions even when the data input to it follows that worst-case distribution. That is, the worst-case performance of the target model is maximized, and the generalization and robustness of the target model under distribution shift of the input data are improved.
As described above, the target model may be used to output risk control results. A risk control result may be a conclusion on whether there is a risk, or a risk control policy. For example, taking a node as a user, if the subgraph corresponding to the user is input into the target model, a result indicating whether the user is at risk, or a risk control policy for the user, can be obtained.
In step S104 of the present specification, when the adversarial model simulates adjusting the distribution of the training samples, the server may specifically have the adversarial model output a perturbation distribution used to adjust the distribution of the training samples.
Further, in step S106, when estimating, according to the first loss, the loss the target model would incur on the training samples under the adjusted distribution, the server may specifically estimate that loss according to the first loss and the perturbation distribution.
In one or more embodiments of the present specification, specifically, when the adversarial model outputs the perturbation distribution used to adjust the distribution of the training samples, the server may, for each training sample, have the adversarial model output a perturbation weight used to adjust the probability of that training sample occurring among the training samples.
The server can then obtain the perturbation distribution used to adjust the distribution of the training samples according to the perturbation weights corresponding to the training samples.
In one or more embodiments of the present disclosure, when estimating, according to the first loss and the perturbation distribution, the loss the target model would incur on the training samples under the adjusted distribution, the server may, for each training sample, weight the first loss of that training sample by the perturbation weight corresponding to it in the perturbation distribution, obtaining the weighted loss of the training sample. The server may then estimate, according to the weighted losses of the training samples, the loss the target model would incur on the training samples under the simulated adjusted distribution.
In one or more embodiments of the present disclosure, the server may specifically sum the weighted losses of the training samples to obtain the estimated loss of the target model on the training samples under the simulated adjusted distribution.
In one or more embodiments of the present disclosure, the server may further normalize the perturbation weights corresponding to the training samples, and then obtain the perturbation distribution used to adjust the distribution of the training samples according to the normalized perturbation weights corresponding to the training samples.
Further, for each training sample, the server may weight the first loss of that training sample by the normalized perturbation weight corresponding to it in the perturbation distribution, obtaining the weighted loss of the training sample.
The perturbation weight corresponds to the likelihood ratio between the adjusted distribution of the training samples and the original distribution.
In one or more embodiments of the present disclosure, the manner of normalization is not limited; for example, the weights may be normalized by a normalized exponential (softmax) function. A sketch of this weighting step follows.
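A minimal sketch of the weighting step, assuming the per-sample first losses from the earlier sketch and an adversarial model that outputs one perturbation score per training sample; softmax normalization is used as one possible choice, and the function name is an assumption.

```python
import torch


def second_loss(adversary_scores: torch.Tensor, first_losses: torch.Tensor) -> torch.Tensor:
    """Estimated loss of the target model under the simulated (re-weighted) distribution."""
    # Normalize the perturbation weights, e.g. with a softmax over the samples.
    weights = torch.softmax(adversary_scores, dim=0)
    # Weight each training sample's first loss and sum over all samples.
    return torch.sum(weights * first_losses)
```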
In one or more embodiments of the present description, the adversarial model may be used to perform multiple tasks. For example, the adversarial model may be used to perform a first task of outputting an adjustment strategy that adjusts the distribution of the training samples, and a second task of estimating the second loss.
In one or more embodiments of the present disclosure, the server may then input the training samples into the adversarial model and have it output an adjustment strategy that simulates adjusting the distribution of the training samples. The server can then input the first loss and the adjustment strategy into the adversarial model to estimate the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as the second loss.
That is, when the adversarial model simulates adjusting the distribution of the training samples, training samples with the adjusted distribution do not actually have to be generated.
In one or more embodiments of the present description, the adjustment strategy may be the perturbation distribution.
Alternatively, in one or more embodiments of the present description, the server may input the training samples into the adversarial model, output the adjustment strategy through the adversarial model, and then actually adjust the training samples according to the adjustment strategy to obtain training samples with the adjusted distribution. The server may then input the adjusted training samples, the unadjusted training samples, and the first loss of the unadjusted training samples into the adversarial model, to obtain an estimate of the loss the target model would incur in predicting the adjusted training samples, as the second loss.
Alternatively, a pre-trained estimation model may output the estimated loss the target model would incur in predicting the training samples under the simulated adjusted distribution.
In that case, the server can input the adjusted training samples, the unadjusted training samples, and the first loss of the unadjusted training samples into the pre-trained estimation model, to obtain the estimated loss the target model would incur in predicting the adjusted training samples, as the second loss.
The simulated adjusted distribution is another distribution within the uncertainty set around the distribution of the original training samples. A sketch of a multi-task adversarial model of the kind described above is given below.
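A minimal sketch of such a multi-task adversarial model, assuming a shared graph encoder with one head that outputs the adjustment strategy (a perturbation score per training sample) and a second head that estimates the second loss from the sample embeddings and the first losses; the architecture, layer sizes and names are illustrative assumptions only.

```python
import torch
from torch_geometric.nn import GCNConv, global_mean_pool


class AdversarialModel(torch.nn.Module):
    """Two-task adversarial model: an adjustment-strategy head and a second-loss head."""

    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = GCNConv(in_dim, hidden_dim)
        self.strategy_head = torch.nn.Linear(hidden_dim, 1)   # perturbation score per sample
        self.loss_head = torch.nn.Linear(hidden_dim + 1, 1)   # per-sample second-loss estimate

    def forward(self, x, edge_index, batch, first_losses):
        h = torch.relu(self.encoder(x, edge_index))
        g = global_mean_pool(h, batch)                 # one embedding per training sample
        # First task: the adjustment strategy (perturbation scores).
        scores = self.strategy_head(g).squeeze(-1)
        # Second task: estimate the second loss from the embeddings and the first losses.
        per_sample = self.loss_head(torch.cat([g, first_losses.unsqueeze(-1)], dim=-1))
        return scores, per_sample.sum()
```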
In step S108 of the present specification, when training the target model with the goal of minimizing the second loss and training the adversarial model with the goal of maximizing the second loss, the server may fix the parameters of the adversarial model and adjust the parameters of the target model so as to minimize the second loss, and fix the parameters of the target model and adjust the parameters of the adversarial model so as to maximize the second loss.
Then, the server can continue to train the adversarial model and the target model according to the training samples and their annotations until the convergence condition is determined to be satisfied. That is, in one or more embodiments of the present description, the server may repeat steps S100 to S108 until the convergence condition is determined to be satisfied.
That is, over the training rounds from the start of training until the convergence condition is satisfied, in each round the server may adjust the parameters of the target model with the goal of minimizing the second loss of that round, and adjust the parameters of the adversarial model with the goal of maximizing the second loss of that round.
Alternatively, the adversarial model and the target model may be trained alternately: one training round may only adjust the parameters of the target model with the goal of minimizing the second loss, or only adjust the parameters of the adversarial model with the goal of maximizing the second loss, with two adjacent rounds adjusting different models.
Alternatively, after the adversarial model has been trained to convergence, the distribution of the training samples may be adjusted according to the trained adversarial model in order to train the target model.
The present specification also provides the schematic illustration of the model training process shown in Fig. 2. As shown in Fig. 2, after the training samples are input into the target model and the adversarial model respectively, the prediction output by the target model and the perturbation weights output by the adversarial model are obtained. The first loss is obtained according to the predictions and the annotations of the training samples, and the perturbation distribution is obtained according to the perturbation weights. From the first loss and the perturbation distribution, the second loss can be obtained. Then, according to the second loss, the parameters of the target model can be adjusted by gradient descent, and the parameters of the adversarial model can be adjusted by gradient ascent. A condensed sketch of such a training loop follows.
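The following is a condensed sketch of the training loop in Fig. 2, reusing the names from the earlier sketches (TargetModel, AdversarialModel, compute_first_loss, second_loss); the optimizer choice, learning rate and data loader are assumptions, not taken from the specification.

```python
import torch


def train(target_model, adversary, loader, rounds: int = 10, lr: float = 1e-3):
    """Alternating min-max training of the target model and the adversarial model."""
    target_opt = torch.optim.Adam(target_model.parameters(), lr=lr)
    adv_opt = torch.optim.Adam(adversary.parameters(), lr=lr)

    for _ in range(rounds):
        for batch, labels in loader:  # batches of subgraph samples with annotations
            first_losses = compute_first_loss(target_model, batch, labels)
            scores, _ = adversary(batch.x, batch.edge_index, batch.batch,
                                  first_losses.detach())

            # Fix the adversarial model: gradient descent on the target model,
            # minimizing the second loss (perturbation scores detached).
            target_opt.zero_grad()
            second_loss(scores.detach(), first_losses).backward()
            target_opt.step()

            # Fix the target model: gradient ascent on the adversarial model,
            # maximizing the second loss (first losses detached).
            adv_opt.zero_grad()
            (-second_loss(scores, first_losses.detach())).backward()
            adv_opt.step()
```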
In the present specification, the adversarial model may be a graph neural network.
The above is the model training method provided by one or more embodiments of the present disclosure. In addition, the present disclosure further provides a graph data processing method, as shown in Fig. 3.
Fig. 3 is a flow chart of the graph data processing method provided in the present specification, including the following steps:
S300: receive the graph data to be processed.
The graph data processing method can be executed by a server.
First, the server may receive the graph data to be processed.
S302: input the graph data to be processed into the trained target model.
After receiving the graph data to be processed, the server may input the graph data to be processed into the trained target model.
The target model may be the target model in a generative adversarial network trained by the method described in Fig. 1.
S304: obtain the processing result of the graph data to be processed output by the target model.
The server can then obtain the processing result of the graph data to be processed output by the target model.
In one or more embodiments of the present disclosure, the graph data to be processed is the graph data corresponding to a user subject to risk control. The processing result can be the risk control result predicted for that user according to the graph data to be processed. For example, the processing result obtained may be a conclusion on whether the user is at risk, or a risk control policy. A usage sketch follows.
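A minimal usage sketch of the trained target model on graph data to be processed, assuming the batched subgraph form and class indices from the earlier sketches; the function name and the mapping of class indices to risk conclusions are assumptions.

```python
import torch


def process_graph_data(target_model: torch.nn.Module, batch) -> torch.Tensor:
    """Run the trained target model on the graph data to be processed and return
    the risk control result (one class index per user-centered subgraph)."""
    target_model.eval()
    with torch.no_grad():
        logits = target_model(batch.x, batch.edge_index, batch.batch)
    # e.g. 0 = no risk, 1 = risk
    return logits.argmax(dim=-1)
```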
Based on the same idea as the model training method corresponding to Fig. 1, the present disclosure also provides a corresponding model training apparatus, as shown in Fig. 4.
Fig. 4 is a schematic diagram of the model training apparatus provided in the present specification, for a generative adversarial network comprising a target model and an adversarial model, the apparatus comprising:
a sample determining module 200, configured to determine training samples according to graph data and obtain annotations of the training samples;
a prediction module 201, configured to input the training samples into the target model to obtain the target model's prediction for each training sample, and to determine, according to the predictions and the annotations of the training samples, the loss of the target model in predicting the training samples as a first loss;
a simulation module 202, configured to input the training samples into the adversarial model, so that the adversarial model simulates an adjustment of the distribution of the training samples;
an estimation module 203, configured to estimate, according to the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as a second loss;
and a training module 204, configured to train the target model with the goal of minimizing the second loss, and to train the adversarial model with the goal of maximizing the second loss.
Optionally, the sample determining module 200 is specifically configured to obtain graph data; for each node in the graph data, determine the subgraph corresponding to the node in the graph data; and determine a training sample according to the subgraph corresponding to the node.
Optionally, the simulation module 202 is specifically configured to output, through the adversarial model, a perturbation distribution for adjusting the distribution of the training samples; and the estimation module 203 is specifically configured to estimate, according to the first loss and the perturbation distribution, the loss the target model would incur on the training samples under the adjusted distribution.
Optionally, the simulation module 202 is specifically configured to output, through the adversarial model and for each training sample, a perturbation weight for adjusting the probability of that training sample occurring among the training samples; and to obtain the perturbation distribution for adjusting the distribution of the training samples according to the perturbation weights corresponding to the training samples.
Optionally, the estimation module 203 is specifically configured to, for each training sample, weight the first loss of that training sample by the perturbation weight corresponding to it in the perturbation distribution to obtain the weighted loss of the training sample; and to estimate, according to the weighted losses of the training samples, the loss the target model would incur on the training samples under the simulated adjusted distribution.
Optionally, the training module 204 is configured to fix the parameters of the adversarial model and adjust the parameters of the target model with the goal of minimizing the second loss, and to fix the parameters of the target model and adjust the parameters of the adversarial model with the goal of maximizing the second loss.
Based on the same idea as the graph data processing method corresponding to Fig. 3, the present specification also provides a corresponding graph data processing apparatus, as shown in Fig. 5. The apparatus comprises:
a receiving module 400, configured to receive graph data to be processed;
a processing module 401, configured to input the graph data to be processed into a trained target model, the target model being the target model in a generative adversarial network trained by the method shown in Fig. 1;
and a prediction module 402, configured to obtain the processing result of the graph data to be processed output by the target model.
The present specification also provides a computer readable storage medium storing a computer program operable to perform the method provided in fig. 1 or 3 above.
The present specification also provides a schematic structural diagram of the electronic device shown in Fig. 6. At the hardware level, as shown in Fig. 6, the electronic device includes a processor, an internal bus, a memory, and a non-volatile memory, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to implement the method provided in Fig. 1 or Fig. 3 described above.
It should be noted that, in this specification, all actions of acquiring signals, information or data are performed under the condition of conforming to the corresponding data protection rule policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.
Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.
In the 90 s of the 20 th century, improvements to one technology could clearly be distinguished as improvements in hardware (e.g., improvements to circuit structures such as diodes, transistors, switches, etc.) or software (improvements to the process flow). However, with the development of technology, many improvements of the current method flows can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain corresponding hardware circuit structures by programming improved method flows into hardware circuits. Therefore, an improvement of a method flow cannot be said to be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the programming of the device by a user. A designer programs to "integrate" a digital system onto a PLD without requiring the chip manufacturer to design and fabricate application-specific integrated circuit chips. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented by using "logic compiler" software, which is similar to the software compiler used in program development and writing, and the original code before the compiling is also written in a specific programming language, which is called hardware description language (Hardware Description Language, HDL), but not just one of the hdds, but a plurality of kinds, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), lava, lola, myHDL, PALASM, RHDL (Ruby Hardware Description Language), etc., VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.
The controller may be implemented in any suitable manner, for example, the controller may take the form of, for example, a microprocessor or processor and a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), programmable logic controllers, and embedded microcontrollers, examples of which include, but are not limited to, the following microcontrollers: ARC625D, atmel AT91SAM, microchip PIC18F26K20, and Silicone Labs C8051F320, the memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller may thus be regarded as a kind of hardware component, and means for performing various functions included therein may also be regarded as structures within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (16)

1. A model training method for a generative adversarial network comprising a target model and an adversarial model, the method comprising:
determining training samples according to graph data, and obtaining annotations of the training samples;
inputting the training samples into the target model to obtain the target model's prediction for each training sample, and determining, according to the predictions and the annotations of the training samples, the loss of the target model in predicting the training samples as a first loss;
inputting the training samples into the adversarial model, so that the adversarial model simulates an adjustment of the distribution of the training samples;
estimating, according to the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as a second loss;
and training the target model with the goal of minimizing the second loss, and training the adversarial model with the goal of maximizing the second loss.
2. The method of claim 1, wherein determining training samples according to graph data specifically comprises:
obtaining graph data;
for each node in the graph data, determining the subgraph corresponding to the node in the graph data;
and determining a training sample according to the subgraph corresponding to the node.
3. The method of claim 1, wherein the adversarial model simulating an adjustment of the distribution of the training samples specifically comprises:
outputting, through the adversarial model, a perturbation distribution for adjusting the distribution of the training samples;
and estimating, according to the first loss, the loss the target model would incur on the training samples under the adjusted distribution specifically comprises:
estimating, according to the first loss and the perturbation distribution, the loss the target model would incur on the training samples under the adjusted distribution.
4. The method of claim 3, wherein outputting, through the adversarial model, a perturbation distribution for adjusting the distribution of the training samples specifically comprises:
outputting, through the adversarial model and for each training sample, a perturbation weight for adjusting the probability of that training sample occurring among the training samples;
and obtaining the perturbation distribution for adjusting the distribution of the training samples according to the perturbation weights corresponding to the training samples.
5. The method of claim 4, wherein estimating, according to the first loss and the perturbation distribution, the loss the target model would incur on the training samples under the adjusted distribution specifically comprises:
for each training sample, weighting the first loss of that training sample by the perturbation weight corresponding to it in the perturbation distribution, to obtain the weighted loss of the training sample;
and estimating, according to the weighted losses of the training samples, the loss the target model would incur on the training samples under the simulated adjusted distribution.
6. The method of claim 1, wherein training the target model with the goal of minimizing the second loss and training the adversarial model with the goal of maximizing the second loss specifically comprises:
fixing the parameters of the adversarial model and adjusting the parameters of the target model with the goal of minimizing the second loss, and fixing the parameters of the target model and adjusting the parameters of the adversarial model with the goal of maximizing the second loss.
7. A graph data processing method, comprising:
receiving graph data to be processed;
inputting the graph data to be processed into a trained target model; wherein the target model is the target model in a generative adversarial network trained by the method of any one of claims 1 to 6;
and obtaining the processing result of the graph data to be processed output by the target model.
8. A model training apparatus for a generative adversarial network comprising a target model and an adversarial model, the apparatus comprising:
a sample determining module, configured to determine training samples according to graph data and obtain annotations of the training samples;
a prediction module, configured to input the training samples into the target model to obtain the target model's prediction for each training sample, and to determine, according to the predictions and the annotations of the training samples, the loss of the target model in predicting the training samples as a first loss;
a simulation module, configured to input the training samples into the adversarial model, so that the adversarial model simulates an adjustment of the distribution of the training samples;
an estimation module, configured to estimate, according to the first loss, the loss the target model would incur in predicting the training samples under the simulated adjusted distribution, as a second loss;
and a training module, configured to train the target model with the goal of minimizing the second loss, and to train the adversarial model with the goal of maximizing the second loss.
9. The apparatus of claim 8, wherein the sample determining module is configured to obtain graph data; for each node in the graph data, determine a subgraph corresponding to the node; and determine a training sample according to the subgraph corresponding to the node.
10. The apparatus of claim 8, wherein the simulation module is configured to output, through the countermeasure model, a disturbance distribution for adjusting the distribution of the training samples; and the estimating module is specifically configured to predict, according to the first loss and the disturbance distribution, the loss of the target model on the training samples after the simulated distribution adjustment.
11. The apparatus of claim 10, wherein the simulation module is configured to, for each training sample, output, through the countermeasure model, a disturbance weight for adjusting the occurrence probability of the training sample among the training samples; and to obtain, according to the disturbance weights corresponding to the training samples, a disturbance distribution for adjusting the distribution of the training samples.
12. The apparatus of claim 11, wherein the estimating module is specifically configured to, for each training sample, weight the first loss of the target model on the training sample by the disturbance weight corresponding to the training sample in the disturbance distribution, to obtain a weighted loss of the training sample; and to predict, according to the weighted losses of the training samples, the loss of the target model on the training samples after the simulated distribution adjustment.
13. The apparatus of claim 8, wherein the training module is configured to fix the parameters of the countermeasure model and adjust the parameters of the target model with the goal of minimizing the second loss, and to fix the parameters of the target model and adjust the parameters of the countermeasure model with the goal of maximizing the second loss.
14. A graph data processing apparatus comprising:
a receiving module configured to receive graph data to be processed;
a processing module configured to input the graph data to be processed into a trained target model, wherein the target model is the target model in a generative countermeasure network trained by the method of any one of claims 1 to 6;
and a prediction module configured to obtain a processing result, output by the target model, for the graph data to be processed.
15. A computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
16. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the program.
CN202310264181.8A 2023-03-10 2023-03-10 Model training and graph data processing method, device, medium and equipment Pending CN116402108A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310264181.8A CN116402108A (en) 2023-03-10 2023-03-10 Model training and graph data processing method, device, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310264181.8A CN116402108A (en) 2023-03-10 2023-03-10 Model training and graph data processing method, device, medium and equipment

Publications (1)

Publication Number Publication Date
CN116402108A true CN116402108A (en) 2023-07-07

Family

ID=87015092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310264181.8A Pending CN116402108A (en) 2023-03-10 2023-03-10 Model training and graph data processing method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN116402108A (en)

Similar Documents

Publication Publication Date Title
CN113297396B (en) Method, device and equipment for updating model parameters based on federal learning
CN116304720B (en) Cost model training method and device, storage medium and electronic equipment
CN111325444A (en) Risk prevention and control decision method, device, system and equipment
CN116049761A (en) Data processing method, device and equipment
CN115712866A (en) Data processing method, device and equipment
CN115618748B (en) Model optimization method, device, equipment and storage medium
CN116308738B (en) Model training method, business wind control method and device
CN117093862A (en) Model training method and device, electronic equipment and storage medium
CN113792889B (en) Model updating method, device and equipment
CN116402108A (en) Model training and graph data processing method, device, medium and equipment
CN115618964A (en) Model training method and device, storage medium and electronic equipment
CN114120273A (en) Model training method and device
CN115204395A (en) Data processing method, device and equipment
CN116109008B (en) Method and device for executing service, storage medium and electronic equipment
CN117576522B (en) Model training method and device based on mimicry structure dynamic defense
CN116501852B (en) Controllable dialogue model training method and device, storage medium and electronic equipment
CN117036870B (en) Model training and image recognition method based on integral gradient diversity
CN115841335B (en) Data processing method, device and equipment
CN117350351B (en) Training method of user response prediction system, user response prediction method and device
CN117933707A (en) Wind control model interpretation method and device, storage medium and electronic equipment
CN117576522A (en) Model training method and device based on mimicry structure dynamic defense
CN114528931A (en) Model training method and device
CN116386894A (en) Information tracing method and device, storage medium and electronic equipment
CN116842570A (en) Model training method and business wind control method and device
CN117592102A (en) Service execution method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination