CN117932457B - Model fingerprint identification method and system based on error classification - Google Patents
- Publication number: CN117932457B
- Application number: CN202410331647.6A
- Authority: CN (China)
- Prior art keywords: sample, fingerprint, samples, model, error
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services
- G06Q50/184—Intellectual property management
Abstract
The present invention provides a model fingerprint identification method and system based on misclassification, and relates to the field of model copyright protection. The method first finds samples that both the target model and a pirated model (a modified version of the target model) misclassify. Then, without changing the parameters of the target model, a GAN enhances the classification features of these error samples to generate fingerprint samples that are classified correctly, while keeping the fingerprint samples natural and close to the original samples. Finally, the error samples and the fingerprint samples together form the query set, and model ownership is verified by comparing the predicted labels of the error samples and of the fingerprint samples. The method greatly improves the concealment of the fingerprint samples and also improves robustness to attacks such as model fine-tuning, pruning, and noise addition.
Description
Technical Field
The present invention relates to the technical field of model copyright protection, and in particular to a model fingerprint identification method and system based on misclassification.
Background
With the rapid development of deep learning (DL), deep neural networks have achieved great success in many areas of artificial intelligence, such as image recognition, visual understanding, and natural language processing. Companies such as Microsoft, Google, and Baidu have deployed DL models in their commercial products to provide higher-quality, more intelligent services. Although deep neural networks outperform traditional methods, designing and training a high-performance deep model is not easy: it usually requires large-scale labeled training data, substantial computing resources, and the expertise to design a good architecture and a suitable learning strategy, so the development cost is beyond what most individuals can bear. Because high-performance deep models carry great commercial value, malicious users may steal a model through a proxy attack that queries the target model's API, or may steal the model's structure and parameters and then modify the model. The intellectual property of deep models therefore needs to be protected against piracy.
Model watermarking is a common method of protecting model intellectual property: watermark information is embedded into the model, for example by modifying its parameters. Existing research shows, however, that watermark-based protection inevitably affects model performance, and in critical fields such as medicine and finance even a 1% loss in accuracy is intolerable. Researchers have therefore proposed model fingerprinting. Model fingerprinting does not modify the training process or fine-tune the model parameters; it protects the model's intellectual property by finding features that are unique to the model. A fingerprinting method first finds samples near the classification boundary of the target model and then turns them into fingerprint samples, for example with adversarial-example techniques. The fingerprint samples and their predicted labels then serve as the fingerprint of the target model. For a suspicious classifier, the model owner remotely accesses its API and submits the fingerprint sample set to obtain the labels. By comparing the labels that the suspicious classifier and the target classifier predict for the fingerprint samples, the model owner verifies whether the suspicious classifier was pirated from the target classifier.
Although existing classification-boundary-based fingerprinting methods do protect the intellectual property of a model, samples on the decision boundary are not robust to model attacks. In addition, fingerprint samples generated as adversarial examples look unnatural, have low concealment, and are easily detected.
Summary of the Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the present invention provides a model fingerprint identification method and system based on misclassification. It addresses the problem that existing classification-boundary-based fingerprinting methods, although they protect the intellectual property of a model, rely on decision-boundary samples that are not robust to model attacks.
(II) Technical solution
To achieve the above objective, the present invention is implemented through the following technical solutions:
In a first aspect, a model fingerprint identification method based on misclassification is provided, comprising:
inputting a public dataset D_m, repeatedly querying the target model with D_m to obtain predicted labels for D_m, using the labeled public dataset D_m as the original training set D_train, and training a pirated model on D_train;
screening out from the original training set D_train the samples Z that both the target model and the pirated model misclassify;
finding, in each class of the training set D_train, the sample with the smallest cumulative distance to the other samples of the same class, and using it as the centroid sample D_s;
selecting from the misclassified samples Z a batch of samples farthest from the centroid sample as the error samples Z_e of the query set, and recording the label set of the error samples Z_e;
inputting the error samples Z_e into a pre-built GAN, guiding them to be classified correctly, and generating natural fingerprint samples Z_r;
selecting from the GAN-generated fingerprint samples Z_r a batch of samples closest to the centroid sample as the fingerprint samples Z_w of the query set, and recording the label set of the fingerprint samples;
inputting the error samples Z_e and the fingerprint samples Z_w respectively into a pre-built suspicious model to obtain the suspicious model's label set for the error samples and its label set for the fingerprint samples.
Preferably, the centroid sample D_s, i.e. the sample in each class of the training set D_train with the smallest cumulative distance to the other samples of the same class, is found by the following formula: [formula not reproduced in the source text]
where N denotes the number of samples in class k and ‖·‖ denotes the length of a vector.
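The formula image for the centroid sample is missing from this text. One form consistent with the verbal definition (the sample with the smallest cumulative distance to the other samples of its class) is sketched below. The symbols x_i, x_j and D_train^k (the subset of D_train belonging to class k) are notation assumed here for illustration and may differ from the patent's own.

```latex
% Assumed reconstruction: a medoid per class k, not the patent's verbatim formula.
D_s^{k} \;=\; \arg\min_{x_i \in D_{train}^{k}} \; \sum_{j=1}^{N} \lVert x_i - x_j \rVert
```

Under this reading the centroid sample is always an actual member of the training set, not an averaged point.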
Preferably, inputting the error samples Z_e into the pre-built GAN, guiding them to be classified correctly, and generating natural fingerprint samples Z_r specifically comprises:
inputting the error samples Z_e into the generator G of the GAN to obtain the fingerprint samples Z_r, and inputting the fingerprint samples Z_r into the target model so as to guide them toward correct classification;
training the GAN with a weighted combination of the classification loss and the discrimination loss L_d, where the weight is a hyperparameter that balances the quality of the error samples and the fingerprint samples;
inputting the fingerprint samples Z_r into the discriminator, and guiding the generation of natural fingerprint samples by computing the discrimination loss L_d;
computing the total loss L, backpropagating to minimize the total loss function L, and iteratively updating the parameters of the GAN to obtain natural fingerprint samples.
Preferably, the classification loss is given by the following formula: [formula not reproduced in the source text]
where the target model F applies the SoftMax function to the fingerprint sample Z_r, Y is the label toward which the classification of the fingerprint sample is guided, and the loss function applied is the Carlini-Wagner loss.
Preferably, the Carlini-Wagner loss is given by the following formula: [formula not reproduced in the source text]
where the definition of Z is not reproduced in the source text, and the parameter k encourages the GAN to generate high-confidence samples classified as class Y.
Preferably, the discrimination loss L_d is given by the following formula: [formula not reproduced in the source text]
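The loss formulas are missing from this text, so the following is only a hedged sketch of how the pieces described above could fit together. It assumes that the weight (written λ here) multiplies the discrimination loss, that the classification loss (written L_c here) is an expectation over the error samples, and that the Carlini-Wagner term takes the standard targeted margin form from the literature, with Z denoting the target model's output vector for Z_r. The symbols L_c, λ, ℓ and Z_i are introduced for illustration and are not confirmed by the source. The form of L_d is not given in the source and is left unspecified.

```latex
% Assumed forms for illustration; not the patent's verbatim formulas.
L = L_c + \lambda L_d, \qquad
L_c = \mathbb{E}_{Z_e}\!\left[\ell\big(F(Z_r),\, Y\big)\right], \quad Z_r = G(Z_e), \qquad
\ell(Z, Y) = \max\!\Big(\max_{i \neq Y} Z_i - Z_Y,\; -k\Big)
```

In this form the margin term stops decreasing once class Y leads every other class by at least k, which matches the stated role of k as a confidence parameter.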
Preferably, after the error samples Z_e and the fingerprint samples Z_w are respectively input into the pre-built suspicious model and the corresponding label sets are obtained, it is determined whether E_i' = E_i and W_i' = W_i hold, where E_i' denotes the label set of the error samples output by the suspicious model, E_i denotes the recorded label set of the error samples, W_i' denotes the label set of the fingerprint samples output by the suspicious model, and W_i denotes the recorded label set of the fingerprint samples. The matching rate S is calculated by the following formula: [formula not reproduced in the source text]
If the matching rate S is greater than 95%, the suspicious model is regarded as a stolen model.
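The matching-rate formula is missing from this text. A plausible reading, stated here as an assumption, is the fraction of query-set labels on which the suspicious model agrees with the recorded labels, pooled over the error samples and the fingerprint samples. The counts n_e and n_w (the numbers of error and fingerprint samples) are notation introduced for illustration.

```latex
% Assumed reconstruction of the matching rate; the patent's exact formula may differ.
S \;=\; \frac{\sum_{i=1}^{n_e} \mathbb{1}\!\left[E_i' = E_i\right] \;+\; \sum_{i=1}^{n_w} \mathbb{1}\!\left[W_i' = W_i\right]}{n_e + n_w}
```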
In a second aspect, a model fingerprint identification system based on misclassification is provided, comprising:
a preprocessing module, configured to input a public dataset D_m, repeatedly query the target model with D_m to obtain predicted labels for D_m, use the labeled public dataset D_m as the original training set D_train, and train a pirated model on D_train;
a first screening module, configured to screen out from the original training set D_train the samples Z that both the target model and the pirated model misclassify;
an extraction module, configured to find, in each class of the training set D_train, the sample with the smallest cumulative distance to the other samples of the same class, and use it as the centroid sample D_s;
a recording module, configured to select from the misclassified samples Z a batch of samples farthest from the centroid sample as the error samples Z_e of the query set, and record the label set of the error samples Z_e;
a generation module, configured to input the error samples Z_e into a pre-built GAN, guide them to be classified correctly, and generate natural fingerprint samples Z_r;
a second screening module, configured to select from the GAN-generated fingerprint samples Z_r a batch of samples closest to the centroid sample as the fingerprint samples Z_w of the query set, and record the label set of the fingerprint samples;
a processing and output module, configured to input the error samples Z_e and the fingerprint samples Z_w respectively into a pre-built suspicious model to obtain the suspicious model's label set for the error samples and its label set for the fingerprint samples.
In a third aspect, a computer-readable storage medium storing one or more programs is provided, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the method of the first aspect.
In a fourth aspect, a computing device is provided, comprising:
one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs comprise instructions for performing the method of the first aspect.
(III) Beneficial effects
Unlike other fingerprinting methods based on classification boundaries, the misclassification-based model fingerprint identification method of the present invention first looks for error samples in the region that both the target model and the stolen model misclassify, and then uses a GAN to enhance the classification features of those error samples so as to generate fingerprint samples that are classified correctly. Generating the fingerprint samples with a GAN greatly improves their concealment. The method also simulates the modifications an attacker could make to the model and uses the centroid samples to select the error samples and fingerprint samples, which considerably improves robustness to a variety of model attacks.
Brief Description of the Drawings
FIG. 1 is a flowchart of the misclassification-based model fingerprint identification method of the present invention;
FIG. 2 is a schematic diagram of the process of training the pirated model in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the process of screening error samples in an embodiment of the present invention;
FIG. 4 is a schematic diagram of constructing the centroid sample in an embodiment of the present invention;
FIG. 5 is a structural diagram of the GAN that generates fingerprint samples in an embodiment of the present invention;
FIG. 6 is a flowchart of verifying model ownership in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment
As shown in FIG. 1, an embodiment of the present invention provides a model fingerprint identification method based on misclassification, comprising:
inputting a public dataset D_m, repeatedly querying the target model with D_m to obtain predicted labels for D_m, using the labeled public dataset D_m as the original training set D_train, and training a pirated model on D_train;
screening out from the original training set D_train the samples Z that both the target model and the pirated model misclassify;
finding, in each class of the training set D_train, the sample with the smallest cumulative distance to the other samples of the same class, and using it as the centroid sample D_s;
selecting from the misclassified samples Z a batch of samples farthest from the centroid sample as the error samples Z_e of the query set, and recording the label set of the error samples Z_e;
inputting the error samples Z_e into a pre-built GAN, guiding them to be classified correctly, and generating natural fingerprint samples Z_r;
selecting from the GAN-generated fingerprint samples Z_r a batch of samples closest to the centroid sample as the fingerprint samples Z_w of the query set, and recording the label set of the fingerprint samples;
inputting the error samples Z_e and the fingerprint samples Z_w respectively into a pre-built suspicious model to obtain the suspicious model's label set for the error samples and its label set for the fingerprint samples.
Specifically, the process of training the pirated model is described in detail with reference to FIG. 2. Depending on the attacker's level of access to the model, the pirated model is obtained in one of two ways. In the first, the attacker queries the target model to obtain labels for a dataset and uses the labeled dataset to train a proxy model that is functionally similar to the target model, thereby stealing it. In the second, the attacker directly modifies the target model by means such as fine-tuning, pruning, or noise addition. The target model uses the WideResNet architecture, and the proxy model uses the PreActResNet architecture.
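As a rough illustration of the second route (direct modification), the sketch below applies magnitude pruning and Gaussian noise to a flat list of weights. It is a toy stand-in, not the patent's procedure: the functions `prune_weights` and `add_noise`, the pruning ratio, and the noise scale are all hypothetical, and a real model would hold its weights in tensors rather than a Python list.

```python
import random

def prune_weights(weights, ratio):
    """Zero out the `ratio` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * ratio)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def add_noise(weights, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each weight."""
    rng = random.Random(seed)
    return [w + rng.gauss(0.0, sigma) for w in weights]

# Hypothetical weights of a tiny model (illustrative values only).
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_weights(weights, ratio=0.5)
noisy = add_noise(weights, sigma=0.01)
print(pruned)
```

A fingerprint that survives such modifications is what the later selection steps aim for.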
The process of selecting error samples is described in detail below with reference to FIG. 3. After the pirated model has been trained, the training set D_train is used to screen out the samples Z that both the target model and the pirated model misclassify. CIFAR-10 is chosen as the training set D_train; the dataset consists of 60,000 color images, comprising 50,000 training images (10 classes, 5,000 images per class) and 10,000 test images (10 classes, 1,000 images per class).
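The screening step amounts to keeping the samples on which both models disagree with the reference label. The sketch below assumes ground-truth labels are available (as they are for CIFAR-10) and uses two hypothetical one-dimensional threshold classifiers in place of the target and pirated networks; `select_common_errors` and both classifiers are illustrative names, not part of the patent.

```python
def select_common_errors(samples, true_labels, target_predict, pirated_predict):
    """Return (sample, label) pairs that BOTH models misclassify."""
    common = []
    for x, y in zip(samples, true_labels):
        if target_predict(x) != y and pirated_predict(x) != y:
            common.append((x, y))
    return common

# Toy stand-ins for the two models: thresholds on a scalar feature.
target_predict = lambda x: int(x > 0.5)
pirated_predict = lambda x: int(x > 0.4)

samples = [0.1, 0.45, 0.55, 0.9, 0.3]
true_labels = [0, 1, 0, 1, 1]

Z = select_common_errors(samples, true_labels, target_predict, pirated_predict)
print(Z)
```

The sample 0.45 is excluded because only the target model gets it wrong; the set Z keeps only errors shared by both models.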
Further, because the means of attacking a model are complex and varied, the samples Z_e farthest from the centroid sample are additionally selected to enhance the robustness of the fingerprint. FIG. 4 is a schematic diagram of constructing the centroid sample; the centroid sample is the training-set sample closest to the center of a class's decision region. In each class of the training set D_train, the sample with the smallest cumulative distance to the other samples of the same class is taken as the centroid sample D_s, according to the following formula: [formula not reproduced in the source text]
where N denotes the number of samples in class k and ‖·‖ denotes the length of a vector.
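The centroid sample as described is what is usually called a medoid: the class member minimizing the summed distance to the other members. The sketch below computes it with Euclidean distance and then picks the error samples farthest from it. The choice of Euclidean distance, the function names, the toy 2-D points, and the use of a single class's centroid are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def centroid_sample(class_samples):
    """Return the member with the smallest cumulative distance to the others."""
    return min(
        class_samples,
        key=lambda x: sum(euclidean(x, other) for other in class_samples),
    )

def farthest_from_centroid(candidates, centroid, batch_size):
    """Return the `batch_size` candidates with the largest distance to the centroid."""
    ranked = sorted(candidates, key=lambda x: euclidean(x, centroid), reverse=True)
    return ranked[:batch_size]

# Toy 2-D feature vectors for one class (illustrative values only).
class_k = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (3.0, 3.0)]
centroid = centroid_sample(class_k)

# Hypothetical misclassified samples Z; the two farthest become Z_e.
Z = [(0.5, 0.5), (2.0, 2.0), (0.3, 0.6), (1.5, 0.2)]
errors_far = farthest_from_centroid(Z, centroid, batch_size=2)
print(centroid, errors_far)
```

Selecting the fingerprint samples Z_w closest to the centroid would use the same ranking with `reverse=False`. The pairwise computation is quadratic in the class size, so for 5,000 images per class a vectorized implementation would be preferable in practice.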
Further, the process by which the GAN generates fingerprint samples is described in detail below with reference to FIG. 5. Inputting the error samples Z_e into the pre-built GAN, guiding them to be classified correctly, and generating natural fingerprint samples Z_r specifically comprises:
inputting the error samples Z_e into the generator G of the GAN to obtain the fingerprint samples Z_r, and inputting the fingerprint samples Z_r into the target model so as to guide them toward correct classification;
training the GAN with a weighted combination of the classification loss and the discrimination loss L_d, where the weight is a hyperparameter that balances the quality of the error samples and the fingerprint samples;
inputting the fingerprint samples Z_r into the discriminator, and guiding the generation of natural fingerprint samples by computing the discrimination loss L_d;
computing the total loss L, backpropagating to minimize the total loss function L, and iteratively updating the parameters of the GAN to obtain natural fingerprint samples.
Further, the classification loss is given by the following formula: [formula not reproduced in the source text]
where the target model F applies the SoftMax function to the fingerprint sample Z_r, Y is the label toward which the classification of the fingerprint sample is guided, and the loss function applied is the Carlini-Wagner loss.
Further, the Carlini-Wagner loss is given by the following formula: [formula not reproduced in the source text]
where the definition of Z is not reproduced in the source text, and the parameter k encourages the GAN to generate high-confidence samples classified as class Y.
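Since the Carlini-Wagner formula itself is missing here, the sketch below implements the standard targeted margin form known from the adversarial-example literature, on the assumption that the patent uses something equivalent. It treats Z as the target model's output vector (one score per class) for a fingerprint sample; the function name `cw_loss` and the example scores are hypothetical.

```python
def cw_loss(scores, target, k=0.0):
    """Targeted Carlini-Wagner margin: max(max_{i != target} Z_i - Z_target, -k).

    The loss is positive while some other class outscores `target`, and it
    bottoms out at -k once `target` leads every other class by at least k.
    """
    other_max = max(z for i, z in enumerate(scores) if i != target)
    return max(other_max - scores[target], -k)

# Class 0 is the guiding label Y in all three cases.
still_wrong = cw_loss([0.5, 2.0, 1.0], target=0, k=1.0)   # class 1 still wins
just_enough = cw_loss([2.0, 0.5, 1.0], target=0, k=1.0)   # margin exactly k
very_confident = cw_loss([5.0, 0.0, 0.0], target=0, k=1.0)  # clamped at -k
print(still_wrong, just_enough, very_confident)
```

Because of the clamp at -k, the generator gains nothing from pushing the margin beyond k, so a larger k asks for more confident classification as class Y before the classification term stops contributing.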
Further, the discrimination loss L_d is given by the following formula: [formula not reproduced in the source text]
Further, the process of verifying model ownership is described in detail below with reference to FIG. 6. After the error samples Z_e and the fingerprint samples Z_w are respectively input into the pre-built suspicious model and the corresponding label sets are obtained, it is determined whether E_i' = E_i and W_i' = W_i hold, where E_i' denotes the label set of the error samples output by the suspicious model, E_i denotes the recorded label set of the error samples, W_i' denotes the label set of the fingerprint samples output by the suspicious model, and W_i denotes the recorded label set of the fingerprint samples. The matching rate S is calculated by the following formula: [formula not reproduced in the source text]
If the matching rate S is greater than 95%, the suspicious model is regarded as a stolen model.
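The verification step can be sketched as a label comparison followed by a threshold test. Because the matching-rate formula is missing from this text, the sketch assumes S is the fraction of query-set labels (error and fingerprint samples pooled) on which the suspicious model agrees with the recorded labels; the function names and the example labels are hypothetical.

```python
def matching_rate(E, E_suspect, W, W_suspect):
    """Fraction of query-set labels on which the suspicious model matches the record."""
    if len(E) != len(E_suspect) or len(W) != len(W_suspect):
        raise ValueError("recorded and observed label sets must have equal sizes")
    total = len(E) + len(W)
    if total == 0:
        raise ValueError("query set is empty")
    matches = sum(a == b for a, b in zip(E, E_suspect))
    matches += sum(a == b for a, b in zip(W, W_suspect))
    return matches / total

def is_stolen(rate, threshold=0.95):
    """The source requires the rate to be strictly greater than 95%."""
    return rate > threshold

# Recorded labels for 10 error samples and 10 fingerprint samples (toy values).
E = [3, 3, 5, 1, 7, 2, 2, 9, 0, 4]
W = [1, 6, 8, 0, 4, 4, 2, 7, 5, 3]

rate_same = matching_rate(E, list(E), W, list(W))      # every label matches
W_one_off = [1, 6, 8, 0, 4, 4, 2, 7, 5, 9]             # last label differs
rate_one_off = matching_rate(E, list(E), W, W_one_off)
print(rate_same, is_stolen(rate_same), rate_one_off, is_stolen(rate_one_off))
```

With 20 query samples a single mismatch gives a rate of exactly 95%, which does not exceed the threshold, so small query sets make the test strict.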
Another embodiment of the present invention provides a model fingerprint identification system based on misclassification, comprising:
a preprocessing module, configured to input a public dataset D_m, repeatedly query the target model with D_m to obtain predicted labels for D_m, use the labeled public dataset D_m as the original training set D_train, and train a pirated model on D_train;
a first screening module, configured to screen out from the original training set D_train the samples Z that both the target model and the pirated model misclassify;
an extraction module, configured to find, in each class of the training set D_train, the sample with the smallest cumulative distance to the other samples of the same class, and use it as the centroid sample D_s;
a recording module, configured to select from the misclassified samples Z a batch of samples farthest from the centroid sample as the error samples Z_e of the query set, and record the label set of the error samples Z_e;
a generation module, configured to input the error samples Z_e into a pre-built GAN, guide them to be classified correctly, and generate natural fingerprint samples Z_r;
a second screening module, configured to select from the GAN-generated fingerprint samples Z_r a batch of samples closest to the centroid sample as the fingerprint samples Z_w of the query set, and record the label set of the fingerprint samples;
a processing and output module, configured to input the error samples Z_e and the fingerprint samples Z_w respectively into a pre-built suspicious model to obtain the suspicious model's label set for the error samples and its label set for the fingerprint samples.
本申请的实施例可提供为方法或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。本申请实施例中的方案可以采用各种计算机语言实现,例如,面向对象的程序设计语言Java和直译式脚本语言JavaScript等。The embodiments of the present application may be provided as methods or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes. The schemes in the embodiments of the present application may be implemented in various computer languages, for example, object-oriented programming language Java and literal scripting language JavaScript, etc.
本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present application is described with reference to the flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each process and/or box in the flowchart and/or block diagram, as well as the combination of the processes and/or boxes in the flowchart and/or block diagram, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to generate a machine, so that the instructions executed by the processor of the computer or other programmable data processing device generate a device for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Absent further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410331647.6A CN117932457B (en) | 2024-03-22 | 2024-03-22 | Model fingerprint identification method and system based on error classification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117932457A (en) | 2024-04-26 |
CN117932457B (en) | 2024-05-28 |
Family
ID=90757833
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410331647.6A Active CN117932457B (en) | 2024-03-22 | 2024-03-22 | Model fingerprint identification method and system based on error classification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117932457B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111240279A (en) * | 2019-12-26 | 2020-06-05 | 浙江大学 | An Adversarial Enhanced Fault Classification Method for Industrial Imbalanced Data |
CN113298184A (en) * | 2021-06-21 | 2021-08-24 | 哈尔滨工程大学 | Sample extraction and expansion method and storage medium for small sample image recognition |
CN114021670A (en) * | 2022-01-04 | 2022-02-08 | 深圳佑驾创新科技有限公司 | Classification model learning method and terminal |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12307377B2 (en) * | 2020-12-03 | 2025-05-20 | International Business Machines Corporation | Generating data based on pre-trained models using generative adversarial models |
Also Published As
Publication number | Publication date |
---|---|
CN117932457A (en) | 2024-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023070696A1 (en) | Feature manipulation-based attack and defense method for continuous learning ability system | |
CN110766598B (en) | Intelligent model watermark embedding and extracting method and system based on convolutional neural network | |
CN113360912A (en) | Malicious software detection method, device, equipment and storage medium | |
CN111881446B (en) | Industrial Internet malicious code identification method and device | |
CN112313645A (en) | Learning method and testing method of data embedding network by synthesizing original data and labeled data to generate labeled data, and learning device and testing device using the same | |
CN112232434A (en) | Method and device for cooperative defense against adversarial attack based on correlation analysis | |
CN114282258A (en) | Screen capture data desensitization method and device, computer equipment and storage medium | |
CN118709232A (en) | Edge IoT proxy method based on secure encryption chip | |
WO2025030635A1 (en) | Data processing method and apparatus, device and storage medium | |
CN115828239A (en) | Malicious code detection method based on multi-dimensional data decision fusion | |
CN118734314A (en) | Large language model prompt word injection attack detection method and device based on context learning | |
CN119026127B (en) | Malicious code detection method, system and equipment based on multi-level feature fusion | |
Cheng et al. | Topology-aware universal adversarial attack on 3D object tracking | |
CN110163163B (en) | A defense method and defense device against attacks with limited number of queries for a single face | |
CN114970809A (en) | Picture countermeasure sample generation method based on generation type countermeasure network | |
CN117932457B (en) | Model fingerprint identification method and system based on error classification | |
CN118736373A (en) | An adversarial training method combining image denoising and feature alignment | |
Zou et al. | Survey on AI-Generated Media Detection: From Non-MLLM to MLLM | |
CN118174918A (en) | Electric power Internet of things attack behavior detection method, system, device and medium | |
Xia et al. | Source Code Vulnerability Detection Based On SAR-GIN | |
CN109960934A (en) | A CNN-based malicious request detection method | |
CN113259369B (en) | A data set authentication method and system based on machine learning membership inference attack | |
CN115600202A (en) | A Malware Detection and Family Classification Method Based on Multiple Cropping Strategies and Deep Convolutional Generative Adversarial Networks | |
Peng et al. | MobileViT-FocR: MobileViT with fixed-one-centre loss and gradient reversal for generalised fake face detection | |
CN118395195B (en) | Model training method, video positioning method, system, equipment, product and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |