CN115470927A - Automatic extraction method of surrogate model, terminal and storage medium - Google Patents


Info

Publication number
CN115470927A
Authority
CN
China
Prior art keywords
model
data
surrogate
task
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210987308.4A
Other languages
Chinese (zh)
Inventor
刘洋
罗基
王轩
张伟哲
陈斌
蒋琳
吴宇琳
漆舒汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202210987308.4A
Publication of CN115470927A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 — Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic extraction method of a surrogate model, a terminal and a storage medium. The method comprises the following steps: acquiring the task and task data information of a target model, determining the architecture of the surrogate model according to the task of the target model, and constructing, according to the characteristics of model extraction, a model extraction framework that improves the extraction process; performing dimensionality reduction on the collected task data of the target model, and screening the reduced data to obtain a training data set for the surrogate model; measuring the classification confidence of the surrogate model according to preset indexes, and classifying the training data set according to the classification confidence to obtain query sample data; and training the surrogate model within the model extraction framework through a supervised learning algorithm and a consistency regularization algorithm to obtain the trained surrogate model. The method obtains the decision-making capability of the target model, so that the surrogate model approaches or even surpasses the performance of the target model on the test data set and achieves satisfactory usability.

Description

Automatic extraction method of surrogate model, terminal and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an automatic extraction method of a surrogate model, a terminal and a storage medium.
Background
Machine learning as a service (MLaaS) helps clients benefit from machine learning: developers can build quickly and efficiently with MLaaS products and access pre-built algorithms and models, without the cost, time and risk of building an in-house machine learning team.
Existing work shows that, by accessing a model on such a platform and using information such as the model's output, model imitation learning can be performed to obtain the target model's capability on its specific task. In this setting, the task of the target model and the data it uses are roughly known, but its architecture, training method and specific training data are unknown; the model's output can only be obtained by querying the target model with a large number of samples, from which a surrogate-model training data set is constructed to train the surrogate model. In theory, more queries yield a better surrogate model, but a large number of queries increases the cost charged by the platform. How to select more representative samples with which to query the target model, reducing the number of queries while guaranteeing the usability of the surrogate model, is a current research hotspot.
Therefore, the prior art has yet to be improved.
Disclosure of Invention
The technical problem to be solved by the present invention is, in view of the defects of the prior art, to provide an automatic extraction method of a surrogate model, a terminal and a storage medium, so as to solve the technical problem that traditional surrogate-model construction cannot guarantee the usability of the surrogate model at low cost.
The technical scheme adopted by the invention for solving the technical problem is as follows:
in a first aspect, the present invention provides a method for automatically extracting a surrogate model, including:
acquiring the task and task data information of a target model, determining the architecture of a surrogate model according to the task of the target model, and constructing, according to the characteristics of model extraction, a model extraction framework that improves the extraction process;
collecting task data according to the task data information of the target model, performing dimensionality reduction on the collected task data of the target model, and screening the dimensionality reduced data to obtain a training data set of a surrogate model;
measuring the classification confidence of the surrogate model according to a preset index, and classifying the training data set according to the classification confidence to obtain query sample data;
and training the surrogate model through a supervised learning algorithm and a consistency regularization algorithm in the model extraction framework to obtain the trained surrogate model.
In an implementation manner, the acquiring the task and task data information oriented by the target model includes:
and acquiring a task oriented by the target model, and acquiring task data information of the target model according to the task.
In one implementation, the determining a framework of a surrogate model according to the task oriented by the target model, and setting a model extraction framework for improving an extraction process according to the characteristics of model extraction includes:
determining a framework of a substitute model according to the task oriented by the target model;
determining the characteristics of model extraction and improving the extraction process;
and constructing the model extraction framework according to the characteristics of model extraction so as to improve the extraction process.
In an implementation manner, the performing dimension reduction on the collected task data of the target model, and screening the dimension-reduced data to obtain a training data set of a surrogate model includes:
performing dimensionality reduction on the collected task data of the target model through an autoencoder, and clustering the dimensionality reduced data;
and selecting the clustered data through a preset algorithm, and removing repeated or similar data to obtain the training data set.
In one implementation, the performing, by an auto-encoder, dimensionality reduction on the collected task data of the target model, and clustering the dimensionality reduced data includes:
optimizing the objective function through a clustering algorithm and the number of class centers of a given data set to obtain an optimized objective function;
performing dimensionality reduction processing on the collected task data of the target model through the self-encoder to obtain intermediate data;
and clustering the intermediate data according to the optimized objective function to obtain the clustered data.
In one implementation, the measuring the classification confidence of the surrogate model according to a preset index, and classifying the training data set according to the classification confidence to obtain query sample data includes:
measuring the classification confidence of the substitution model according to the preset index;
inputting the training data set into the surrogate model, and classifying the training data set according to the classification confidence to obtain a first confidence data set and a second confidence data set;
and taking the second confidence data set as the query sample data.
In one implementation, the obtaining the first confidence data set and the second confidence data set further includes:
taking the data in the first confidence data set as unlabeled data;
and introducing and modifying a consistency regularization algorithm in semi-supervised learning according to the label-free data.
In one implementation, the training the surrogate model by a supervised learning algorithm and a consistency regularization algorithm in the model extraction framework to obtain a trained surrogate model includes:
dividing the training process into a plurality of substitution cycles, and extracting partial data in each substitution cycle to label the substitution model;
respectively obtaining a first confidence data set and a second confidence data set of the substitution model, and inquiring the target model through the second confidence data set;
first confidence data fed back by the target model and a corresponding label set are reserved;
and training the surrogate model through the supervised learning algorithm and the consistency regularization algorithm to obtain the trained surrogate model.
In a second aspect, the present invention further provides a terminal, including: a processor, and a memory storing a surrogate model auto-extraction program for implementing the operations of the surrogate model auto-extraction method according to the first aspect when executed by the processor.
In a third aspect, the present invention also provides a storage medium, which is a computer-readable storage medium, and the storage medium stores a surrogate model automatic extracting program, and the surrogate model automatic extracting program is used for implementing the operation of the surrogate model automatic extracting method according to the first aspect when executed by a processor.
The invention adopts the technical scheme and has the following effects:
the invention can utilize a larger data set in the platform to the maximum extent and greatly reduce the data volume queried by the platform; moreover, by using the pre-training model and the target model task related data as the surrogate model framework and the training data, the performance of the surrogate model is improved, and by using the consistency regularization training surrogate model of semi-supervised learning, learning can be performed from the label-free data without the query target model; according to the method, the substitution model is labeled before the target model is queried, only the low-confidence data of the substitution model is queried, so that the query work is reduced, and after the output of the target model is obtained, the substitution model is trained only by using the high-confidence data of the target model, so that the training efficiency and the output performance of the substitution model are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from the structures shown without creative effort.
FIG. 1 is a flow diagram of a method for automatic extraction of surrogate models in one implementation of the invention.
Fig. 2 is a schematic diagram of a self-encoder structure in an implementation manner of the invention.
FIG. 3 is a schematic diagram of query sample selection in one implementation of the invention.
FIG. 4 is a diagram of a surrogate model extraction framework in one implementation of the invention.
FIG. 5 is a schematic diagram of a surrogate model extraction algorithm in one implementation of the invention.
Fig. 6 is a functional schematic of a terminal in one implementation of the invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Exemplary method
As shown in fig. 1, an embodiment of the present invention provides an automatic extraction method for a surrogate model, including the following steps:
and S100, acquiring a task and task data information oriented to a target model, determining a framework of a substitution model according to the task oriented to the target model, and setting a model extraction framework for an improved extraction process according to the characteristics of model extraction.
In this embodiment, the method for automatically extracting a surrogate model is applied to a terminal, which includes but is not limited to: computers, mobile terminals, and the like.
In this embodiment, the surrogate model is created and trained locally to obtain the function of the target model. For the surrogate model to perform well on its task, to have good usability, and to perform comparably to the target model, its performance on the specific task must be improved. This can be approached from two directions: on one hand, selecting a better architecture and training data for the local surrogate model; on the other hand, finding a more efficient way to improve the performance of the trained model.
In this embodiment, for the surrogate-model problem, the model extraction cost is reduced and the model performance is improved under a more practical setting, yielding a complete model extraction framework whose contributions mainly include the following points:
1. A general preprocessing step for model extraction is provided. When plenty of data can be found but only a small amount may be queried, this step makes maximum use of the large data set while greatly reducing the volume of queried data.
2. A pre-trained model and target-model task-related data are proposed as better choices for the surrogate-model architecture and training data, reducing the query volume and improving the performance of the surrogate model. The surrogate model can be trained more efficiently with the consistency regularization of semi-supervised learning, which can learn from unlabeled data without querying the target model.
3. Before the target model is queried, the surrogate model labels the data, and the query volume is reduced by querying only the surrogate model's low-confidence data; after the target model's output is obtained, efficiency and performance are improved by training the surrogate model using only the target model's high-confidence data.
Specifically, in one implementation manner of the present embodiment, the step S100 includes the following steps:
and S101, acquiring a task oriented to the target model, and acquiring task data information of the target model according to the task.
In this embodiment, the goal is to obtain the capability of a target model on a cloud platform by training a surrogate model locally to replace it, such that the two models perform comparably on test data. The target model can be accessed, the task it faces is known, and its training data is roughly known; by interacting with the target model through an API, the output returned by the target model is obtained, and the surrogate model is built using that output.
In this embodiment, the corresponding model information is obtained from the output returned by the target model in order to build the surrogate model. When building the surrogate model, a data set for querying the target model is needed; that is, task data related to the target model's task is beneficial for building a better surrogate model. Such related task data may come from public data sources or from more realistic data in production.
In this embodiment, only with some knowledge of the target model can data related to its task be found. Using data related to the target model's task avoids querying irrelevant data, thereby reducing the amount of query data. Once the target model is understood, data can be collected on the web or public data sets can be used; alternatively, some data actually used in practice can be obtained, such as data from real business. Here, Google was used to search for task-related data so as to approximate reality.
Specifically, in an implementation manner of this embodiment, step S100 further includes the following steps:
step S102, determining a framework of a substitution model according to the task oriented by the target model;
step S103, determining the characteristics of model extraction and improving the extraction process;
and step S104, setting the model extraction framework for improving the extraction process according to the characteristics of the model extraction.
In this embodiment, after information about the target model is obtained, it can be used both to select a better architecture for the surrogate model — namely, a pre-trained model suited to the target model's task — and to search for corresponding data. There is no need to build a surrogate model from scratch, which would require large amounts of data and resources and a complex architecture; instead, this embodiment can begin from a pre-trained model that has already been trained on a large data set and can predict various classes.
In this embodiment, the pre-trained model may be obtained from an existing machine learning framework; many frameworks (e.g., PyTorch, TensorFlow, and MXNet) provide pre-trained models for tasks on images, text, and speech. For example, for an image classification task, a pre-trained image classification network that has already learned to extract powerful, information-rich features from natural images may be used as the starting point for learning a new task. Most pre-trained networks are trained on a subset of the ImageNet database used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). These networks have been trained on over a million images covering 1000 object classes, such as keyboard, coffee cup, pencil, and various animals. In general, transfer learning with a pre-trained network is faster and easier than training a network from scratch.
In this embodiment, in the model extraction process, a matched pre-training model may be selected as the framework of the surrogate model according to the task of the target model.
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the method for automatically extracting a surrogate model further includes the following steps:
and S200, collecting task data according to the task data information of the target model, performing dimension reduction processing on the collected task data of the target model, and screening the dimension reduced data to obtain a training data set of the surrogate model.
In this embodiment, the data may be reduced by instance selection, which decides which instances in the training set should be kept for further use in the learning process; this shrinks the training set and helps reduce the run time of training. Applying instance selection to the surrogate-model extraction process can effectively relieve the problem of excessive queries.
Specifically, in one implementation manner of the present embodiment, the step S200 includes the following steps:
step S201, performing dimension reduction processing on the collected task data of the target model through a self-encoder, and clustering the dimension-reduced data;
and S202, selecting the clustered data through a preset algorithm, and removing repeated or similar data to obtain the training data set.
In this embodiment, the image data is reduced in dimensionality by an autoencoder and then clustered; finally, a well-designed algorithm selects from the clustered data, minimizing repeated or similar data and improving query efficiency. The structure of the autoencoder is shown in fig. 2.
Specifically, in one implementation manner of the present embodiment, the step S201 includes the following steps:
step S201a, optimizing an objective function through a clustering algorithm and the number of class centers of a given data set to obtain an optimized objective function;
step S201b, performing dimensionality reduction processing on the collected task data of the target model through the self-encoder to obtain intermediate data;
step S201c, clustering the intermediate data according to the optimized objective function to obtain the clustered data.
In this embodiment, the preset algorithm is the Mini-Batch K-Means algorithm, a variant of K-Means, which is used to cluster the intermediate representation of the autoencoder. Given the number of class centers for a data set X, the K-Means algorithm seeks to optimize the objective function
min_C ∑_{x∈X} ‖f(C, x) − x‖²    (3-1)
where C is the set of class centers and f(C, x) returns the class center c ∈ C nearest to x. After the initial centroids are selected, the algorithm repeats two steps: each sample is assigned to its nearest centroid, and each new centroid is computed as the mean of the samples assigned to it. The algorithm stops when the difference between the old and new centroids falls below a threshold.
The Mini-Batch K-Means algorithm follows similar steps and still optimizes the same objective function, but it randomly samples a small batch at each training iteration. This not only greatly reduces the computation and time required for convergence, but also processes large data sets faster and is potentially more robust to statistical noise. In practice, the difference in quality between its results and those of the standard algorithm is typically very small.
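The preprocessing pipeline above — reduce, cluster, select — can be sketched with scikit-learn. PCA stands in here for the autoencoder's intermediate representation purely so the sketch is self-contained, and the selection rule (keep the sample nearest each centroid) is an assumption, since the patent does not spell out its selection algorithm.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 64))  # stand-in for collected task data

# Step 1: dimensionality reduction. The patent uses an autoencoder's
# intermediate representation; PCA is a self-contained stand-in.
Z = PCA(n_components=8, random_state=0).fit_transform(X)

# Step 2: cluster the reduced data with Mini-Batch K-Means, which
# optimizes the K-Means objective (eq. 3-1) on random small batches.
km = MiniBatchKMeans(n_clusters=50, random_state=0, n_init=3).fit(Z)

# Step 3: keep one representative per cluster (the sample nearest its
# centroid) to drop repeated or similar data — an assumed selection
# rule, since the patent's algorithm is not given in detail.
selected = []
for c in range(km.n_clusters):
    members = np.flatnonzero(km.labels_ == c)
    if members.size:
        d = np.linalg.norm(Z[members] - km.cluster_centers_[c], axis=1)
        selected.append(members[np.argmin(d)])
train_set = X[np.array(selected)]
```

The resulting `train_set` is the deduplicated pool from which query samples are later drawn, so its size directly bounds the number of target-model queries.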
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the method for automatically extracting a surrogate model further includes the following steps:
and S300, measuring the classification confidence of the surrogate model according to a preset index, and classifying the training data set according to the classification confidence to obtain query sample data.
In this embodiment, the surrogate model queries only low-confidence data, which reduces the cost budget of the query process. Therefore, in the surrogate model's classification task, the surrogate model's confidence in classifying a sample can be used to measure whether it has acquired the ability to classify that sample. A method of selecting query samples is obtained by applying a softmax operation to the output of the surrogate model.
Specifically, in one implementation manner of the present embodiment, the step S300 includes the following steps:
step S301, measuring the classification confidence of the substitution model according to the preset index;
step S302, inputting the training data set into the surrogate model, and classifying the training data set according to the classification confidence to obtain a first confidence data set and a second confidence data set;
step S303, using the second confidence data set as the query sample data.
In this embodiment, the first confidence data set is a high confidence data set and the second confidence data set is a low confidence data set.
As shown in fig. 3, when query sample data is selected, the owned data is input into the surrogate model, which classifies it. Samples that the surrogate model classifies with high confidence need not be labeled by the target model, since the surrogate model can be considered to have acquired the ability to classify them; thus only the samples about which the surrogate model is not confident (i.e., the low-confidence samples) are used to query the target model, reducing the number of queries.
After the target model labels the samples about which the surrogate model is not confident, it returns an output result, which serves as the classification target for the surrogate model. However, not all labels returned by the target model are used: only the samples, and their results, that the target model classifies with high confidence. The point of surrogate-model extraction is to learn the target model's ability to classify with high confidence, not its ability to classify without confidence.
In this embodiment, the classification confidence of the surrogate model needs to be measured, so three indexes are proposed:
(Equations (3-2), (3-3) and (3-4), which define the three confidence indexes, are given only as images in the original filing and are not reproduced here.)
where k_i is the ith most confident class and τ is the confidence threshold. Equation (3-4) is inspired by the concept of entropy, which is a measure of uncertainty: the larger the value of equation (3-4), the more certain the model, so it can be used as an indicator of the model's classification confidence. The appropriate index should be chosen from the three according to the specific task and data characteristics.
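Since the three index formulas appear only as images, the sketch below uses plausible stand-ins consistent with the surrounding description — top-1 softmax probability, the margin between the two most confident classes k_1 and k_2, and negative entropy (larger = more certain, matching the description of (3-4)) — together with a threshold split into high- and low-confidence sets. These are assumptions, not the patent's exact formulas.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def confidence_indexes(logits):
    """Three stand-in confidence measures: top-1 probability, top-1/
    top-2 margin, and negative entropy (larger means more certain)."""
    p = softmax(logits)
    top2 = np.sort(p, axis=1)[:, -2:]   # [p of k_2, p of k_1]
    top1 = top2[:, 1]
    margin = top2[:, 1] - top2[:, 0]
    neg_entropy = np.sum(p * np.log(p + 1e-12), axis=1)
    return top1, margin, neg_entropy

def split_by_confidence(logits, tau=0.9):
    """Split samples by the threshold tau: high-confidence samples are
    kept for consistency training, low-confidence ones query the
    target model."""
    top1, _, _ = confidence_indexes(logits)
    high = np.flatnonzero(top1 >= tau)
    low = np.flatnonzero(top1 < tau)
    return high, low
```

Which of the three indexes to threshold on is a per-task choice, exactly as the description above suggests.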
Specifically, in one implementation manner of the present embodiment, the step S302 is followed by the following steps:
step S302a, taking the data in the first confidence data set as non-label data;
and step S302b, introducing and modifying a consistency regularization algorithm in semi-supervised learning according to the non-label data.
In this embodiment, a method for training the surrogate model is provided that trains the two types of data in two different ways at the same time, so that the model's training process better matches the characteristics of model extraction.
In the first training mode, after the surrogate model performs classification, the low-confidence data it outputs is handed to the target model for labeling; once the result returned by the target model is obtained, this data is used to train the surrogate model under cross-entropy loss supervision.
in the second training mode, since the high-confidence data after the classification of the surrogate model is not labeled by the target model, the later supervision training cannot be performed, but the data still has use value. The surrogate model can continue to learn from this portion of data, improving the performance of the model, so the surrogate model is still trained with high confidence data. And regarding the data as label-free data, and introducing and modifying consistency regularization in semi-supervised learning so as to fully utilize the label-free data to train a surrogate model.
Specifically, for all the high confidence data of the surrogate models that participated in the t-th round of training, the loss of these data was calculated:
ℓ_unsup = w(t) · (1 / (C·|B|)) · ∑_{x∈B} ‖ẑ_x − z_x‖²    (3-5)
Z = αZ + (1 − α)z    (3-6)
ẑ = Z / (1 − α^t)    (3-7)
where C is the number of classes, B is the set of all high-confidence data participating in the current training round, and z is the prediction vector given by the model. So that the supervised and unsupervised loss terms can later be combined, this part of the loss is scaled by a time-dependent weighting function w(t). At the same time, this approach aggregates multiple predictions previously made by the network into an ensemble prediction Z, where α is a momentum term controlling how far the aggregated output reaches into the training history; Z therefore contains a weighted average of earlier network predictions. Such ensemble predictions can be expected to predict the unknown labels better than the network output of the most recent training round, and can therefore serve as training targets.
They may also be less noisy than using only a single previous prediction. To generate the training target, the start-up bias in Z must be corrected by dividing by the factor (1 − α^t); similar bias corrections are used, for example, in Adam and in mean-only batch normalization. In the first training round, Z and the training target are zero, since no predictions are available from previous rounds.
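The ensemble-target bookkeeping described above can be sketched as follows. The exponential-moving-average update and the 1 − α^t bias correction follow directly from the text; the Gaussian ramp-up for w(t) and the mean-squared consistency loss are the standard temporal-ensembling forms and are assumptions where the patent gives only images.

```python
import numpy as np

def ramp_up(t, T=80):
    """Time-dependent weight w(t): a Gaussian ramp-up, assumed here
    since the patent does not give w(t) explicitly."""
    t = min(t, T)
    return float(np.exp(-5.0 * (1.0 - t / T) ** 2))

class EnsembleTargets:
    """Aggregated predictions Z = alpha*Z + (1-alpha)*z with start-up
    bias correction by (1 - alpha**t). Z starts at zero in round 1."""
    def __init__(self, n_samples, n_classes, alpha=0.6):
        self.Z = np.zeros((n_samples, n_classes))
        self.alpha = alpha
        self.t = 0

    def update(self, z):
        self.t += 1
        self.Z = self.alpha * self.Z + (1.0 - self.alpha) * z
        # Bias-corrected training target (cf. Adam's correction).
        return self.Z / (1.0 - self.alpha ** self.t)

def consistency_loss(z, z_hat, w_t):
    """Unsupervised loss on high-confidence data: w(t) times the mean
    squared difference between predictions z and targets z_hat,
    averaged over the batch B and the C classes."""
    B, C = z.shape
    return w_t * np.sum((z - z_hat) ** 2) / (C * B)
```

During a substitution cycle, `consistency_loss` would be added to the supervised cross-entropy term computed on the target-labeled data.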
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the method for automatically extracting a surrogate model further includes the following steps:
and S400, training the surrogate model through a supervised learning algorithm and a consistency regularization algorithm in the model extraction frame to obtain the trained surrogate model.
In this embodiment, the surrogate-model extraction process is realized by the model extraction framework. The above methods are combined and the process is optimized to obtain a model extraction framework that fully accounts for the characteristics of model extraction: it reduces the number of queries to the target model and thus the query budget, while making full use of the training data; the two parts of the training data use different training modes, and the surrogate model can be trained in one pass.
As shown in FIG. 4, FIG. 4 is a description of the model extraction framework and its steps.
As shown in fig. 5, pseudo code for the framework algorithm is given in this embodiment in order to express its details more precisely.
Specifically, in one implementation manner of the present embodiment, the step S400 includes the following steps:
step S401, dividing the training process into a plurality of surrogate cycles, and in each surrogate cycle extracting part of the data for the surrogate model to label;
step S402, respectively obtaining a first confidence data set and a second confidence data set of the surrogate model, and querying the target model with the second confidence data set;
step S403, retaining the first confidence data fed back by the target model and the corresponding label set;
and S404, training the surrogate model through the supervised learning algorithm and the consistency regularization algorithm to obtain the trained surrogate model.
In this embodiment, first, the collected task-related data is processed using the idea of instance selection, and a subset of the samples is chosen. Second, the training process is divided into a number of surrogate cycles; in each cycle, part of the data is extracted for the surrogate model to label. Third, a high-confidence data set and a low-confidence data set of the surrogate model are obtained; only the low-confidence data set is used to query the target model, and the target model's high-confidence data and label set are retained. Finally, the surrogate model is trained simultaneously through supervised learning and consistency regularization, yielding a surrogate model with capability comparable to that of the target model.
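The confidence split in the second and third phases can be sketched as follows. This is a hedged sketch: the max-softmax-probability criterion, the 0.9 threshold, and the function name are illustrative assumptions, not values fixed by the embodiment.

```python
import numpy as np

def split_by_confidence(probs, threshold=0.9):
    """Split surrogate predictions into high- and low-confidence index sets.

    probs : (N, C) array of surrogate softmax outputs.
    A sample is high-confidence when its top class probability reaches
    `threshold`; only low-confidence samples are forwarded as queries
    to the target model, which keeps the query budget small.
    """
    conf = probs.max(axis=1)
    high_idx = np.where(conf >= threshold)[0]  # labeled locally by the surrogate
    low_idx = np.where(conf < threshold)[0]    # sent as queries to the target
    return high_idx, low_idx

probs = np.array([[0.95, 0.03, 0.02],   # confident -> keep surrogate label
                  [0.40, 0.35, 0.25],   # uncertain -> query target
                  [0.10, 0.85, 0.05]])  # uncertain -> query target
high, low = split_by_confidence(probs)
print(len(high), len(low))  # 1 2
```

In this sketch only 1 of the 3 samples would cost a query, which is the mechanism by which the framework trades surrogate self-labeling against query budget.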
In this embodiment, the purpose of model extraction is defined as acquiring as much of the target model's decision capability as possible, i.e. the surrogate model should approach or even exceed the target model on the test data set. This goal is motivated by practice: the extracted model should have satisfactory usability. The ratio of the test accuracy of the surrogate model to that of the target model is therefore used as the evaluation index, as shown in the following formula:
extraction rate = Acc_test(f̂) / Acc_test(f)
where f̂ is the surrogate model and f is the target model.
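The evaluation index above is simply a ratio of test accuracies; a minimal sketch (the function name and the example accuracy values are our own, except the 94.7% target accuracy reported later for the Intel Image target model):

```python
def extraction_rate(surrogate_acc: float, target_acc: float) -> float:
    """Ratio of surrogate to target test accuracy.

    A value near (or above) 1.0 means the surrogate matches
    (or exceeds) the target model on the test set.
    """
    if target_acc <= 0.0:
        raise ValueError("target accuracy must be positive")
    return surrogate_acc / target_acc

# e.g. a hypothetical surrogate reaching 89.0% against the 94.7% target:
print(round(extraction_rate(0.890, 0.947), 3))  # 0.94
```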
In this embodiment, the image classifier model may be built, deployed, and refined using Microsoft's Azure Custom Vision image recognition service. The image classifier applies labels to an image based on the detected visual features, each label representing a class or object. 5000 images are randomly extracted from the training sets of the Intel Image dataset and the Fashion MNIST dataset respectively, keeping the amount of data per class balanced, and two models are trained with this service to serve as the target models. Each target model takes tens of minutes to train on the platform; its structure, weights, and hyper-parameters are unknown throughout this process, it cannot be changed after training, and it can only be queried, returning an output vector. The platform also reports the accuracy of the trained target models, which is 94.7% and 91% on the two datasets, respectively.
In this embodiment, the surrogate model is based on SqueezeNet, a lightweight model that reduces network parameters and maximizes computation speed without losing network performance. Because data related to the target model's task is used, in the ideal case the surrogate model's training data is taken from part of the Intel Image validation set and the Fashion MNIST test set. This is reasonable: it ensures that the surrogate and target training data sets differ, consistent with the assumption that no knowledge of the specific data used by the target model is available, and with the assumption that data related to the target model's task can be used. To reflect the data actually available in practice, Google is also used to search for images related to the target model's task, and the images are downloaded as surrogate training data.
To reduce queries to the greatest extent, the surrogate model is trained over 5 surrogate cycles. In each cycle, the surrogate model is trained for 60 epochs on the updated data set. Other training settings include choosing an appropriate optimizer and learning rate.
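The 5-cycle, 60-epoch schedule can be sketched as the following training skeleton. This is a sketch under stated assumptions: `train_one_epoch` and `update_dataset` are hypothetical stand-ins for the surrogate training step and for the confidence-split/query step described above.

```python
NUM_CYCLES = 5         # surrogate cycles, chosen to limit queries to the target
EPOCHS_PER_CYCLE = 60  # surrogate training epochs per cycle

def run_extraction(dataset, train_one_epoch, update_dataset):
    """Skeleton of the cyclic training schedule.

    train_one_epoch(dataset, epoch) -> None : one surrogate training epoch
    update_dataset(dataset) -> dataset      : re-label and re-split the data;
                                              the target model is queried only here
    Returns the total number of epochs run.
    """
    epochs_run = 0
    for cycle in range(NUM_CYCLES):
        dataset = update_dataset(dataset)   # confidence split + target queries
        for epoch in range(EPOCHS_PER_CYCLE):
            train_one_epoch(dataset, epoch)
            epochs_run += 1
    return epochs_run

total = run_extraction([], lambda d, e: None, lambda d: d)
print(total)  # 300
```

The point of the structure is that target queries happen once per cycle rather than once per epoch, so 300 training epochs cost only 5 query rounds.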
The embodiment achieves the following technical effects through the technical scheme:
this embodiment can make maximum use of the larger data set on the platform while greatly reducing the amount of data queried from the platform. Using a pre-trained model as the surrogate architecture and task-related data as training data improves surrogate performance, and the consistency regularization of semi-supervised learning lets the surrogate model learn from unlabeled data without querying the target model. The surrogate model labels data before the target model is queried, and only the surrogate's low-confidence data is sent as queries, reducing query work; after the target model's outputs are obtained, only the target's high-confidence data is used for training, improving both the training efficiency and the final performance of the surrogate model.
Exemplary device
Based on the above embodiment, the present invention further provides a terminal, including: a processor, a memory, an interface, a display screen, and a communication module connected through a system bus; the processor provides computing and control capabilities; the memory comprises a storage medium and an internal memory; the storage medium stores an operating system and a computer program; the internal memory provides an environment for running the operating system and computer program in the storage medium; the interface is used to connect external devices, such as mobile terminals and computers; the display screen displays corresponding information; and the communication module communicates with a cloud server or a mobile terminal.
The computer program is operable when executed by the processor to perform the operations of a surrogate model auto-extraction method.
It will be understood by those skilled in the art that the block diagram of fig. 6 shows only part of the structure associated with the inventive arrangements and does not limit the terminals to which they may be applied; a particular terminal may include more or fewer components than shown, combine certain components, or arrange components differently.
In one embodiment, a terminal is provided, which includes: a processor and a memory storing a surrogate model auto-extracting program for implementing the operations of the surrogate model auto-extracting method as described above when executed by the processor.
In one embodiment, a storage medium is provided, wherein the storage medium stores a surrogate model automatic extracting program which, when executed by a processor, implements the operations of the surrogate model automatic extraction method described above.
It will be understood by those skilled in the art that all or part of the processes of the method embodiments described above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory.
In summary, the present invention provides an automatic extraction method for a surrogate model, a terminal, and a storage medium. The method comprises: acquiring the task and task data information of the target model, determining the surrogate model architecture according to the task, and setting a model extraction framework that improves the extraction process according to the characteristics of model extraction; performing dimensionality reduction on the collected task data of the target model and screening the reduced data to obtain the surrogate model's training data set; measuring the surrogate model's classification confidence according to a preset index and classifying the training data set by confidence to obtain query sample data; and training the surrogate model with a supervised learning algorithm and a consistency regularization algorithm in the model extraction framework to obtain the trained surrogate model. The method acquires the decision capability of the target model, so that the surrogate model approaches or even surpasses the target model's performance on the test data set and offers satisfactory usability.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (10)

1. An automatic extraction method of a surrogate model is characterized by comprising the following steps:
acquiring a task and task data information oriented to a target model, determining a framework of a substitution model according to the task oriented to the target model, and setting a model extraction framework for an improved extraction process according to the characteristics of model extraction;
collecting task data according to the task data information of the target model, performing dimension reduction processing on the collected task data of the target model, and screening the dimension-reduced data to obtain a training data set of a surrogate model;
measuring the classification confidence of the surrogate model according to a preset index, and classifying the training data set according to the classification confidence to obtain query sample data;
and training the surrogate model through a supervised learning algorithm and a consistency regularization algorithm in the model extraction framework to obtain the trained surrogate model.
2. The method according to claim 1, wherein the obtaining of the task and task data information oriented by the target model comprises:
and acquiring a task oriented by the target model, and acquiring task data information of the target model according to the task.
3. The method for automatically extracting a surrogate model according to claim 1, wherein the determining a surrogate model architecture according to the task oriented by the target model and setting a model extraction framework for improving an extraction process according to the characteristics of model extraction comprises:
determining the architecture of the surrogate model according to the task oriented by the target model;
determining the characteristics of model extraction and improving the extraction process;
and setting the model extraction framework according to the characteristics of the model extraction and for improving the extraction process.
4. The method for automatically extracting a surrogate model according to claim 1, wherein the performing dimension reduction on the collected task data of the target model and screening the dimension-reduced data to obtain a training data set of the surrogate model includes:
performing dimensionality reduction on the collected task data of the target model through an autoencoder, and clustering the dimensionality reduced data;
and selecting the clustered data through a preset algorithm, and removing repeated or similar data to obtain the training data set.
5. The automatic extraction method of the surrogate model according to claim 4, wherein the performing, by the autoencoder, the dimensionality reduction on the collected task data of the target model and clustering the dimensionality reduced data comprises:
optimizing the objective function through a clustering algorithm and the number of class centers of a given data set to obtain an optimized objective function;
performing dimensionality reduction processing on the collected task data of the target model through the autoencoder to obtain intermediate data;
and clustering the intermediate data according to the optimized objective function to obtain the clustered data.
6. The method of claim 1, wherein the measuring classification confidence of the surrogate model according to a preset index, and classifying the training data set according to the classification confidence to obtain query sample data includes:
measuring the classification confidence of the surrogate model according to the preset index;
inputting the training data set into the surrogate model, and classifying the training data set according to the classification confidence to obtain a first confidence data set and a second confidence data set;
and taking the second confidence data set as the query sample data.
7. The surrogate model auto-extraction method as recited in claim 6, wherein the obtaining a first confidence data set and a second confidence data set further comprises:
taking the data in the first confidence data set as non-label data;
and introducing and modifying a consistency regularization algorithm in semi-supervised learning according to the label-free data.
8. The method according to claim 1, wherein the training the surrogate model by a supervised learning algorithm and a consistency regularization algorithm in the model extraction framework to obtain a trained surrogate model comprises:
dividing the training process into a plurality of surrogate cycles, and in each surrogate cycle extracting part of the data for the surrogate model to label;
respectively obtaining a first confidence data set and a second confidence data set of the surrogate model, and querying the target model with the second confidence data set;
first confidence data fed back by the target model and a corresponding label set are reserved;
and training the surrogate model through the supervised learning algorithm and the consistency regularization algorithm to obtain the trained surrogate model.
9. A terminal, comprising: a processor, and a memory storing a surrogate model auto-extraction program for implementing operations of the surrogate model auto-extraction method according to any one of claims 1-8 when executed by the processor.
10. A storage medium, characterized in that the storage medium is a computer-readable storage medium, and the storage medium stores a surrogate model automatic extracting program for implementing an operation of the surrogate model automatic extracting method according to any one of claims 1 to 8 when the surrogate model automatic extracting program is executed by a processor.
CN202210987308.4A 2022-08-17 2022-08-17 Automatic extraction method of surrogate model, terminal and storage medium Pending CN115470927A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210987308.4A CN115470927A (en) 2022-08-17 2022-08-17 Automatic extraction method of surrogate model, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210987308.4A CN115470927A (en) 2022-08-17 2022-08-17 Automatic extraction method of surrogate model, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN115470927A true CN115470927A (en) 2022-12-13

Family

ID=84367132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210987308.4A Pending CN115470927A (en) 2022-08-17 2022-08-17 Automatic extraction method of surrogate model, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN115470927A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117496118A (en) * 2023-10-23 2024-02-02 浙江大学 Method and system for analyzing steal vulnerability of target detection model
CN117496118B (en) * 2023-10-23 2024-06-04 浙江大学 Method and system for analyzing steal vulnerability of target detection model


Similar Documents

Publication Publication Date Title
CN110852447B (en) Meta learning method and apparatus, initializing method, computing device, and storage medium
CN111507768B (en) Potential user determination method and related device
CN109993102B (en) Similar face retrieval method, device and storage medium
CN112765477B (en) Information processing method and device, information recommendation method and device, electronic equipment and storage medium
Wu et al. Modeling users’ preferences and social links in social networking services: a joint-evolving perspective
CN109471978B (en) Electronic resource recommendation method and device
CN112395487B (en) Information recommendation method and device, computer readable storage medium and electronic equipment
CN110955831A (en) Article recommendation method and device, computer equipment and storage medium
CN115033801B (en) Article recommendation method, model training method and electronic equipment
WO2024067373A1 (en) Data processing method and related apparatus
CN112905897A (en) Similar user determination method, vector conversion model, device, medium and equipment
CN113033458A (en) Action recognition method and device
CN112883265A (en) Information recommendation method and device, server and computer readable storage medium
CN117726884B (en) Training method of object class identification model, object class identification method and device
CN113360788A (en) Address recommendation method, device, equipment and storage medium
CN113836388A (en) Information recommendation method and device, server and storage medium
CN116910357A (en) Data processing method and related device
CN115470927A (en) Automatic extraction method of surrogate model, terminal and storage medium
CN114625967A (en) User information mining method based on big data service optimization and artificial intelligence system
JP2016014990A (en) Moving image search method, moving image search device, and program thereof
CN112868048A (en) Image processing learning program, image processing program, information processing device, and image processing system
CN114519593A (en) Resource recall model updating method and device, electronic equipment and storage medium
CN116679981B (en) Software system configuration optimizing method and device based on transfer learning
US20240022726A1 (en) Obtaining video quality scores from inconsistent training quality scores
CN113298448B (en) Lease index analysis method and system based on Internet and cloud platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination