WO2021143478A1 - Method and apparatus for identifying adversarial sample to protect model security - Google Patents
- Publication number
- WO2021143478A1 (PCT/CN2020/138824)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Definitions
- One or more embodiments of this specification relate to the technical field of data computing security, and more particularly to a method and apparatus for identifying adversarial samples to protect model security.
- One or more embodiments of this specification describe a method and apparatus for identifying adversarial samples to protect model security, which can improve the training performance and prediction performance of a model.
- sampling multiple non-adversarial samples several times to obtain several control sample sets includes: using an enumeration method to sample the multiple non-adversarial samples multiple times to obtain multiple control sample sets; or using a stratified sampling method to sample the multiple non-adversarial samples several times to obtain the several control sample sets; or using a bootstrap sampling method to sample the multiple non-adversarial samples several times to obtain the several control sample sets.
- using several gain values determined based on the several control sample sets and the several experimental sample sets to determine whether the target sample is an adversarial sample includes: determining the mean of the several gain values and, when the mean is less than a set threshold, determining that the target sample is an adversarial sample; or determining the proportion of the several gain values that exceed the set threshold and, when that proportion is less than a first preset ratio, determining that the target sample is an adversarial sample.
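The ratio-based decision rule above can be sketched in Python; the function name and the default `first_preset_ratio` of 0.5 are illustrative assumptions, not values given in the text:

```python
def is_adversarial_by_ratio(gain_values, set_threshold=0.0, first_preset_ratio=0.5):
    """Judge the target sample adversarial when the share of gain values
    exceeding the set threshold falls below the first preset ratio."""
    above = sum(g > set_threshold for g in gain_values)
    ratio = above / len(gain_values)
    return ratio < first_preset_ratio

# Mostly negative gains -> the target sample degrades the model, flag it.
print(is_adversarial_by_ratio([-0.1, -0.2, 0.3]))  # True (ratio 1/3 < 0.5)
print(is_adversarial_by_ratio([0.1, 0.2, 0.3]))    # False (ratio 1.0)
```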
- determining whether the target sample is an adversarial sample further includes: averaging the control values of the several control sample sets for the preset evaluation index to obtain a control mean; and determining the product of the control mean and a second preset ratio as the set threshold.
- an apparatus for identifying adversarial samples to protect model security includes: a sampling unit configured to sample multiple non-adversarial samples several times to obtain several control sample sets; an adding unit configured to add the target sample to be detected to each of the several control sample sets to obtain several experimental sample sets; a first training unit configured to, for any first control sample set among the several control sample sets, train an initial machine learning model with the first control sample set to obtain a trained first control model; a first evaluation unit configured to evaluate the performance of the first control model with a test sample set to obtain a first control value for a preset evaluation index, the test sample set being determined based on the multiple non-adversarial samples; and a second training unit configured to train the initial machine learning model with the first experimental sample set obtained by adding the target sample to the first control sample set.
- a method for identifying adversarial privacy samples to protect privacy includes: sampling multiple non-adversarial privacy samples several times to obtain several control privacy sample sets; adding the target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets; for any first control privacy sample set among the several control privacy sample sets, training an initial machine learning model with the first control privacy sample set to obtain a trained first control model; evaluating the performance of the first control model with a test privacy sample set to obtain a first control value for a preset evaluation index, the test privacy sample set being determined based on the multiple non-adversarial privacy samples; for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, training the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model; evaluating the performance of the first experimental model with the test privacy sample set to obtain a first experimental value for the preset evaluation index; and determining the difference between the first experimental value and the first control value as the first gain value.
- an apparatus for identifying adversarial privacy samples to protect privacy includes: a sampling unit configured to sample multiple non-adversarial privacy samples several times to obtain several control privacy sample sets; an adding unit configured to add the target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets; a first training unit configured to, for any first control privacy sample set among the several control privacy sample sets, train an initial machine learning model with the first control privacy sample set to obtain a trained first control model; a first evaluation unit configured to evaluate the performance of the first control model with a test privacy sample set to obtain a first control value for a preset evaluation index, the test privacy sample set being determined based on the multiple non-adversarial privacy samples; and a second training unit configured to, for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, train the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model;
- a computing device including a memory and a processor, wherein the memory stores executable code and, when the processor executes the executable code, the method of the first aspect or the third aspect is implemented.
- Fig. 1 shows an implementation block diagram of a method for identifying adversarial samples according to an embodiment;
- Fig. 2 shows a flow chart of a method for identifying adversarial samples to protect model security according to an embodiment;
- Fig. 3 shows a sequence diagram of steps in identifying adversarial samples according to an embodiment;
- Fig. 4 shows a structural diagram of an apparatus for identifying adversarial samples to protect model security according to an embodiment;
- Fig. 6 shows a structural diagram of an apparatus for identifying adversarial privacy samples to protect privacy according to an embodiment.
- the training samples currently used for model training may come from different sources, such as manual labeling or crawling from websites and network platforms, into which adversarial samples can easily be mixed. As mentioned earlier, identifying adversarial samples is very important for ensuring model training performance and prediction performance, thereby protecting model security.
- these samples may be image samples, and accordingly, the initial machine learning model may be an image processing model.
- these samples may include face images, iris images, fingerprint images, etc., and the initial machine learning model may be an identity recognition model.
- these samples may be text samples, and accordingly, the initial machine learning model may be a text processing model.
- these samples may be speech samples, and accordingly, the initial machine learning model may be a speech processing model.
- the enumeration method may be used to perform multiple sampling to obtain multiple control sample sets.
- the enumeration method enumerates all possible subsets. Assuming that the multiple non-adversarial samples include 3 samples, denoted A, B, and C, the control sample sets obtained by enumeration include: {A}, {B}, {C}, {A,B}, {A,C}, {B,C}, and {A,B,C}.
- a bootstrap sampling method can also be used to perform several samplings to obtain several control sample sets. Specifically, for one sampling, assuming that the number of non-adversarial samples is M and the number of samples to be collected is m, one sample can be randomly selected from the M non-adversarial samples each time, added to the m collected samples, and then put back among the M non-adversarial samples, so that it can still be selected in the next draw. After this process is repeated m times, a control sample set including m samples is obtained.
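The enumeration and bootstrap sampling schemes described above can be sketched in Python (the function names are illustrative, not from the patent):

```python
from itertools import combinations
import random

def enumerate_control_sets(samples):
    """Enumerate every non-empty subset of the non-adversarial samples."""
    sets = []
    for k in range(1, len(samples) + 1):
        sets.extend(list(c) for c in combinations(samples, k))
    return sets

def bootstrap_control_set(samples, m):
    """Draw m samples with replacement (bootstrap sampling): each draw is
    put back, so the same sample may appear more than once."""
    return [random.choice(samples) for _ in range(m)]

# For samples A, B, C the enumeration yields the 7 subsets listed above.
control_sets = enumerate_control_sets(["A", "B", "C"])
assert len(control_sets) == 7
```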
- the initial machine learning model may be an initialization model, that is, a model that has not undergone any training, whose parameters are those assigned when the model was initialized.
- the initial machine learning model may also be a model trained using some non-adversarial samples other than the aforementioned multiple non-adversarial samples.
- the initial machine learning model can be, without limitation, a classification model, a regression model, a neural network model, etc.
- the aforementioned preset evaluation indicators may include: error rate, accuracy, recall rate, precision rate, and so on.
- the error rate refers to the ratio of the number of test samples with prediction errors to the total number of test samples.
- Accuracy refers to the proportion of the number of test samples whose predictions are correct to the total number of test samples.
- the precision represents, among the test samples predicted to be positive, the proportion whose labels are truly positive;
- the recall represents, among the test samples whose labels are positive, the proportion that are correctly predicted to be positive.
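The four evaluation indices above can be computed directly from predicted and true labels; a minimal sketch for binary labels (1 = positive), with illustrative function and key names:

```python
def evaluation_indices(y_true, y_pred):
    """Error rate, accuracy, precision, and recall for binary labels (1 = positive)."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == 1 for p in y_pred)                       # predicted positives
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)                     # labeled positives
    return {
        "error_rate": 1 - correct / n,
        "accuracy": correct / n,
        "precision": true_pos / pred_pos if pred_pos else 0.0,
        "recall": true_pos / actual_pos if actual_pos else 0.0,
    }
```

Note that error rate and accuracy always sum to 1, matching the two definitions above.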
- the preset evaluation index includes precision
- the first control value may include a precision of 0.88.
- the preset evaluation index includes an error rate
- the first control value may include an error rate of 0.16.
- step S250 and step S260 can refer to the above description of step S230 and step S240, and will not be repeated.
- the preset evaluation index includes precision
- the first experimental value may include a precision of 0.80 or 0.90.
- the preset evaluation index includes an error rate
- the first experimental value may include an error rate of 0.10 or 0.20.
- through steps S210 to S260, the first experimental value corresponding to any first experimental sample set can be obtained, and accordingly, several experimental values corresponding to the several experimental sample sets can be obtained.
- the execution order of steps S210 to S260 is not otherwise limited. Specifically, in one embodiment, step S210, step S230, step S220, step S250, step S240, and step S260 may be executed in that order. In another implementation, step S210, step S220, step S230, step S240, step S250, and step S260 may be executed in sequence.
- the gain value is used to characterize the optimization effect brought by the target sample to the model performance.
- the first gain value is the difference obtained by subtracting the first control value from the first experimental value.
- the preset evaluation index is the precision. If the first control value and the first experimental value are 0.88 and 0.80, respectively, the first gain value is -0.08; and if the first control value and the first experimental value are 0.88 and 0.90, respectively, the first gain value is 0.02.
- step S280 a number of gain values determined based on the number of control sample sets and the number of experimental sample sets are used to determine whether the target sample belongs to an adversarial sample.
- this step may include: determining the gain average of the several gain values; further, in the case that the gain average is less than a set threshold, determining that the target sample belongs to the adversarial sample, and In the case that the gain average value is not less than the set threshold, it is determined that the target sample does not belong to the adversarial sample.
- the set threshold may be a manually set threshold, such as 0 or 0.05.
- the set threshold may be determined based on the following steps: first, averaging the control values of the above control sample sets for the preset evaluation index to obtain a control mean; then, determining the product of the control mean and a second preset ratio as the set threshold.
- the second preset ratio can be set by business personnel based on expert experience or actual needs, for example, set to 0.05 or 0.02. In an example, assuming that the above-mentioned control mean value is 0.80 and the second preset ratio is 0.05, the set threshold may be determined to be 0.04.
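The mean-based decision rule, with the set threshold derived from the control mean as described above, can be sketched as follows (function and variable names are illustrative):

```python
def is_adversarial(gain_values, control_values, second_preset_ratio=0.05):
    """Flag the target sample as adversarial when the mean gain falls below
    a threshold computed as (mean of control values) * (second preset ratio)."""
    control_mean = sum(control_values) / len(control_values)
    threshold = control_mean * second_preset_ratio     # e.g. 0.80 * 0.05 = 0.04
    gain_mean = sum(gain_values) / len(gain_values)
    return gain_mean < threshold

# With the example numbers from the text: control mean 0.80 gives threshold 0.04,
# so negative average gains are judged adversarial.
print(is_adversarial([-0.08, -0.02], [0.80, 0.80]))  # True
print(is_adversarial([0.05, 0.07], [0.80, 0.80]))    # False
```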
- Fig. 3 shows a sequence diagram of steps in identifying adversarial samples according to an embodiment.
- the identification of adversarial samples includes the following steps: Step S31, sampling normal samples (that is, non-adversarial samples) to obtain a control sample set.
- Step S32 Use the control sample set to train the initial model, and use the test sample set to evaluate the performance of the trained model to obtain a control evaluation result.
- step S33 the sample to be tested is added to the control sample set to obtain an experimental sample set.
- Step S34 Use the experimental sample set to train the initial model, and use the test sample set to evaluate the performance of the trained model to obtain an experimental evaluation result.
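Steps S31 to S34, together with the gain computation, can be sketched end to end. This is a schematic only: `train` and `evaluate` are placeholders for the model-specific training and evaluation routines, and the bootstrap-style sampling, `m`, and `rounds` parameters are illustrative choices:

```python
import random

def gain_values_for_target(normal_samples, target, train, evaluate, test_set,
                           m=8, rounds=5):
    """One gain value per round: evaluation(experimental) - evaluation(control)."""
    gains = []
    for _ in range(rounds):
        # S31: sample a control set from the normal (non-adversarial) samples
        control_set = [random.choice(normal_samples) for _ in range(m)]
        # S32: train on the control set, evaluate on the test set
        control_value = evaluate(train(control_set), test_set)
        # S33: add the sample under test to form the experimental set
        experimental_set = control_set + [target]
        # S34: train on the experimental set, evaluate on the test set
        experimental_value = evaluate(train(experimental_set), test_set)
        gains.append(experimental_value - control_value)
    return gains
```

A harmful sample will tend to produce negative gains across rounds, which the downstream threshold test then detects.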
- the target samples to be detected are respectively added to the several control sample sets to obtain a number of experimental sample sets;
- the first training unit 430 is configured to, for any first control sample set among the several control sample sets, train an initial machine learning model with the first control sample set to obtain a trained first control model;
- the first evaluation unit 440 is configured to evaluate the performance of the first control model using a test sample set to obtain a first control value for a preset evaluation index, the test sample set being determined based on the multiple non-adversarial samples;
- the second training unit 450 is configured to, for the first experimental sample set obtained by adding the target sample to the first control sample set, train the initial machine learning model with the first experimental sample set to obtain a trained first experimental model;
- the second evaluation unit 460 is configured to evaluate the performance of the first experimental model using the test sample set to obtain a first experimental value for the preset evaluation index;
- the gain determining unit 470 is configured to determine the difference between the first experimental value and the first control value as a first gain value;
- the determination unit 480 is configured to determine the mean of several gain values and, when the mean is less than a set threshold, determine that the target sample is an adversarial sample; or to determine the proportion of the several gain values that exceed the set threshold and, when that proportion is less than a first preset ratio, determine that the target sample is an adversarial sample.
- the method includes the following steps: step S510, sampling multiple non-adversarial privacy samples several times to obtain several control privacy sample sets; step S520, adding the target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets;
- step S530, for any first control privacy sample set among the several control privacy sample sets, training the initial machine learning model with the first control privacy sample set to obtain a trained first control model;
- step S540, evaluating the performance of the first control model with a test privacy sample set to obtain a first control value for a preset evaluation index, the test privacy sample set being determined based on the multiple non-adversarial privacy samples; step S550, for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, training the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model; step S560, evaluating the performance of the first experimental model with the test privacy sample set to obtain a first experimental value for the preset evaluation index; step S570, determining the difference between the first experimental value and the first control value as the first gain value.
- Fig. 6 shows a structural diagram of an apparatus for identifying adversarial privacy samples to protect privacy according to an embodiment.
- the apparatus 600 may include: a sampling unit 610 configured to sample multiple non-adversarial privacy samples several times to obtain several control privacy sample sets; an adding unit 620 configured to add the target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets;
- the first training unit 630 is configured to, for any first control privacy sample set among the several control privacy sample sets, train the initial machine learning model with the first control privacy sample set to obtain a trained first control model;
- the first evaluation unit 640 is configured to evaluate the performance of the first control model using the test privacy sample set to obtain a first control value for the preset evaluation index, the test privacy sample set being determined based on the multiple non-adversarial privacy samples;
- the second training unit 650 is configured to, for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, train the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model;
- the second evaluation unit 660 is configured to evaluate the performance of the first experimental model using the test privacy sample set to obtain a first experimental value for the preset evaluation index; the gain determining unit 670 is configured to determine the difference between the first experimental value and the first control value as the first gain value;
- the determination unit 680 is configured to use several gain values determined based on the several control privacy sample sets and the several experimental privacy sample sets to determine whether the target privacy sample is an adversarial privacy sample.
- a computing device including a memory and a processor, wherein the memory stores executable code and, when the processor executes the executable code, the method described in conjunction with Fig. 1, Fig. 2, Fig. 3, or Fig. 5 is implemented.
Abstract
Description
Claims (16)
- A method for identifying adversarial samples to protect model security, comprising: sampling a plurality of non-adversarial samples several times to obtain several control sample sets; for any first control sample set among the several control sample sets, training an initial machine learning model with the first control sample set to obtain a trained first control model; adding a target sample to be detected to each of the several control sample sets to obtain several experimental sample sets; for a first experimental sample set obtained by adding the target sample to the first control sample set, training the initial machine learning model with the first experimental sample set to obtain a trained first experimental model; evaluating the performance of the first control model with a test sample set to obtain a first control value for a preset evaluation index, the test sample set being determined based on the plurality of non-adversarial samples; evaluating the performance of the first experimental model with the test sample set to obtain a first experimental value for the preset evaluation index; determining the difference between the first experimental value and the first control value as a first gain value; and determining, using several gain values determined based on the several control sample sets and the several experimental sample sets, whether the target sample is an adversarial sample.
- The method according to claim 1, wherein: the plurality of non-adversarial samples and the target sample are image samples, and the initial machine learning model is an image processing model; or the plurality of non-adversarial samples and the target sample are text samples, and the initial machine learning model is a text processing model; or the plurality of non-adversarial samples and the target sample are speech samples, and the initial machine learning model is a speech processing model.
- The method according to claim 1, wherein sampling a plurality of non-adversarial samples several times to obtain several control sample sets comprises: using an enumeration method to sample the plurality of non-adversarial samples multiple times to obtain multiple control sample sets; or using a stratified sampling method to sample the plurality of non-adversarial samples several times to obtain the several control sample sets; or using a bootstrap sampling method to sample the plurality of non-adversarial samples several times to obtain the several control sample sets.
- The method according to claim 1, wherein the preset evaluation index includes one or more of the following: error rate, accuracy, and recall.
- The method according to claim 1, wherein determining whether the target sample is an adversarial sample using several gain values determined based on the several control sample sets and the several experimental sample sets comprises: determining the mean of the several gain values and, when the mean is less than a set threshold, determining that the target sample is an adversarial sample; or determining the proportion of the several gain values that exceed the set threshold and, when that proportion is less than a first preset ratio, determining that the target sample is an adversarial sample.
- The method according to claim 5, wherein determining whether the target sample is an adversarial sample further comprises: averaging the control values of the several control sample sets for the preset evaluation index to obtain a control mean; and determining the product of the control mean and a second preset ratio as the set threshold.
- An apparatus for identifying adversarial samples to protect model security, comprising: a sampling unit configured to sample a plurality of non-adversarial samples several times to obtain several control sample sets; an adding unit configured to add a target sample to be detected to each of the several control sample sets to obtain several experimental sample sets; a first training unit configured to, for any first control sample set among the several control sample sets, train an initial machine learning model with the first control sample set to obtain a trained first control model; a first evaluation unit configured to evaluate the performance of the first control model with a test sample set to obtain a first control value for a preset evaluation index, the test sample set being determined based on the plurality of non-adversarial samples; a second training unit configured to, for the first experimental sample set obtained by adding the target sample to the first control sample set, train the initial machine learning model with the first experimental sample set to obtain a trained first experimental model; a second evaluation unit configured to evaluate the performance of the first experimental model with the test sample set to obtain a first experimental value for the preset evaluation index; a gain determining unit configured to determine the difference between the first experimental value and the first control value as a first gain value; and a determining unit configured to use several gain values determined based on the several control sample sets and the several experimental sample sets to determine whether the target sample is an adversarial sample.
- The apparatus according to claim 7, wherein: the plurality of non-adversarial samples and the target sample are image samples, and the initial machine learning model is an image processing model; or the plurality of non-adversarial samples and the target sample are text samples, and the initial machine learning model is a text processing model; or the plurality of non-adversarial samples and the target sample are speech samples, and the initial machine learning model is a speech processing model.
- The apparatus according to claim 7, wherein the sampling unit is configured to: use an enumeration method to sample the plurality of non-adversarial samples multiple times to obtain multiple control sample sets; or use a stratified sampling method to sample the plurality of non-adversarial samples several times to obtain the several control sample sets; or use a bootstrap sampling method to sample the plurality of non-adversarial samples several times to obtain the several control sample sets.
- The apparatus according to claim 7, wherein the preset evaluation index includes one or more of the following: error rate, accuracy, and recall.
- The apparatus according to claim 7, wherein the determining unit is configured to: determine the mean of the several gain values and, when the mean is less than a set threshold, determine that the target sample is an adversarial sample; or determine the proportion of the several gain values that exceed the set threshold and, when that proportion is less than a first preset ratio, determine that the target sample is an adversarial sample.
- The apparatus according to claim 11, wherein the determining unit is further configured to: average the control values of the several control sample sets for the preset evaluation index to obtain a control mean; and determine the product of the control mean and a second preset ratio as the set threshold.
- A method for identifying adversarial privacy samples to protect privacy, comprising: sampling a plurality of non-adversarial privacy samples several times to obtain several control privacy sample sets; adding a target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets; for any first control privacy sample set among the several control privacy sample sets, training an initial machine learning model with the first control privacy sample set to obtain a trained first control model; evaluating the performance of the first control model with a test privacy sample set to obtain a first control value for a preset evaluation metric, the test privacy sample set being determined based on the plurality of non-adversarial privacy samples; for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, training the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model; evaluating the performance of the first experimental model with the test privacy sample set to obtain a first experimental value for the preset evaluation metric; determining the difference between the first experimental value and the first control value as a first gain value; and determining, based on several gain values determined from the several control privacy sample sets and the several experimental privacy sample sets, whether the target privacy sample is an adversarial privacy sample.
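The claimed procedure lends itself to a compact sketch. The following is a minimal illustration, with several assumptions not fixed by the claim text: a toy nearest-centroid classifier stands in for the "initial machine learning model", accuracy is used as the preset evaluation metric, and the final decision rule averages the gain values and compares the mean against a threshold (the claim leaves the exact rule open). The names `train_centroid_model`, `evaluate`, and `adversarial_gain_test` are illustrative, not from the patent.

```python
import numpy as np

def train_centroid_model(X, y):
    # Toy stand-in for the "initial machine learning model":
    # a nearest-centroid classifier, one centroid per class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def evaluate(model, X_test, y_test):
    # Preset evaluation metric (assumed here): accuracy on the test sample set.
    preds = [min(model, key=lambda c: np.linalg.norm(x - model[c])) for x in X_test]
    return float(np.mean(np.array(preds) == y_test))

def adversarial_gain_test(X, y, x_target, y_target, X_test, y_test,
                          n_rounds=5, subset_frac=0.7, threshold=-0.05, seed=0):
    """Return (is_adversarial, mean_gain) for a target sample.

    Each round: sample a control set and train a control model; add the target
    sample to form the experimental set and train again; the gain value is the
    experimental evaluation value minus the control evaluation value.
    """
    rng = np.random.default_rng(seed)
    n, k = len(X), int(len(X) * subset_frac)
    gains = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=k, replace=False)          # control sample set
        ctrl_score = evaluate(train_centroid_model(X[idx], y[idx]), X_test, y_test)
        X_exp = np.vstack([X[idx], x_target])               # experimental sample set
        y_exp = np.append(y[idx], y_target)
        exp_score = evaluate(train_centroid_model(X_exp, y_exp), X_test, y_test)
        gains.append(exp_score - ctrl_score)                # gain value
    mean_gain = float(np.mean(gains))
    # Assumed decision rule: a sample whose average gain is clearly negative
    # (it consistently degrades test performance) is flagged as adversarial.
    return mean_gain < threshold, mean_gain

# Demo on synthetic two-cluster data (illustrative only).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.8, (20, 2)), rng.normal(3.0, 0.8, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
X_test = np.vstack([rng.normal(0.0, 0.8, (10, 2)), rng.normal(3.0, 0.8, (10, 2))])
y_test = np.array([0] * 10 + [1] * 10)

benign = np.array([0.1, 0.1])       # plausible class-0 sample
poisoned = np.array([50.0, 50.0])   # outlier mislabeled as class 0

flag_b, gain_b = adversarial_gain_test(X, y, benign, 0, X_test, y_test)
flag_p, gain_p = adversarial_gain_test(X, y, poisoned, 0, X_test, y_test)
```

The repeated sampling is what makes the scheme robust: a single control/experimental pair could attribute a performance drop to an unlucky subset, whereas a consistently negative gain across several independently sampled control sets points at the target sample itself.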
- An apparatus for identifying adversarial privacy samples to protect privacy, comprising: a sampling unit configured to sample a plurality of non-adversarial privacy samples several times to obtain several control privacy sample sets; an adding unit configured to add a target privacy sample to be detected to each of the several control privacy sample sets to obtain several experimental privacy sample sets; a first training unit configured to, for any first control privacy sample set among the several control privacy sample sets, train an initial machine learning model with the first control privacy sample set to obtain a trained first control model; a first evaluation unit configured to evaluate the performance of the first control model with a test privacy sample set to obtain a first control value for a preset evaluation metric, the test privacy sample set being determined based on the plurality of non-adversarial privacy samples; a second training unit configured to, for the first experimental privacy sample set obtained by adding the target privacy sample to the first control privacy sample set, train the initial machine learning model with the first experimental privacy sample set to obtain a trained first experimental model; a second evaluation unit configured to evaluate the performance of the first experimental model with the test privacy sample set to obtain a first experimental value for the preset evaluation metric; a gain determination unit configured to determine the difference between the first experimental value and the first control value as a first gain value; and a determination unit configured to determine, based on several gain values determined from the several control privacy sample sets and the several experimental privacy sample sets, whether the target privacy sample is an adversarial privacy sample.
- A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is executed in a computer, the computer is caused to perform the method of any one of claims 1-6 and 13.
- A computing device, comprising a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the method of any one of claims 1-6 and 13 is implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010040234.4A CN110852450B (en) | 2020-01-15 | 2020-01-15 | Method and device for identifying countermeasure sample to protect model security |
CN202010040234.4 | 2020-01-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021143478A1 (en) | 2021-07-22 |
Family
ID=69610734
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/138824 WO2021143478A1 (en) | 2020-01-15 | 2020-12-24 | Method and apparatus for identifying adversarial sample to protect model security |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110852450B (en) |
WO (1) | WO2021143478A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852450B (en) * | 2020-01-15 | 2020-04-14 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying countermeasure sample to protect model security |
CN113449097A (en) * | 2020-03-24 | 2021-09-28 | 百度在线网络技术(北京)有限公司 | Method and device for generating countermeasure sample, electronic equipment and storage medium |
CN111340008B (en) * | 2020-05-15 | 2021-02-19 | 支付宝(杭州)信息技术有限公司 | Method and system for generation of counterpatch, training of detection model and defense of counterpatch |
CN111860698B (en) * | 2020-08-05 | 2023-08-11 | 中国工商银行股份有限公司 | Method and device for determining stability of learning model |
CN113012153A (en) * | 2021-04-30 | 2021-06-22 | 武汉纺织大学 | Aluminum profile flaw detection method |
CN114140670A (en) * | 2021-11-25 | 2022-03-04 | 支付宝(杭州)信息技术有限公司 | Method and device for model ownership verification based on exogenous features |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109543760A (en) * | 2018-11-28 | 2019-03-29 | 上海交通大学 | Confrontation sample testing method based on image filters algorithm |
US20190206057A1 (en) * | 2016-09-13 | 2019-07-04 | Ohio State Innovation Foundation | Systems and methods for modeling neural architecture |
CN110363243A (en) * | 2019-07-12 | 2019-10-22 | 腾讯科技(深圳)有限公司 | The appraisal procedure and device of disaggregated model |
CN110852450A (en) * | 2020-01-15 | 2020-02-28 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying countermeasure sample to protect model security |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304858B (en) * | 2017-12-28 | 2022-01-04 | 中国银联股份有限公司 | Generation method, verification method and system of confrontation sample recognition model |
CN108710892B (en) * | 2018-04-04 | 2020-09-01 | 浙江工业大学 | Cooperative immune defense method for multiple anti-picture attacks |
CN109902798A (en) * | 2018-05-31 | 2019-06-18 | 华为技术有限公司 | The training method and device of deep neural network |
CN108932527A (en) * | 2018-06-06 | 2018-12-04 | 上海交通大学 | Using cross-training model inspection to the method for resisting sample |
CN110674856A (en) * | 2019-09-12 | 2020-01-10 | 阿里巴巴集团控股有限公司 | Method and device for machine learning |
2020
- 2020-01-15 CN CN202010040234.4A patent/CN110852450B/en active Active
- 2020-12-24 WO PCT/CN2020/138824 patent/WO2021143478A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
WANG, JIA: "Research on Adversarial Examples in Deep Learning based on Image Recognition Problems", COMPUTER KNOWLEDGE AND TECHNOLOGY, vol. 15, 31 October 2019 (2019-10-31), CN, pages 222 - 223, XP009529271, ISSN: 1009-3044, DOI: 10.14004/j.cnki.ckt.2019.3617 * |
Also Published As
Publication number | Publication date |
---|---|
CN110852450A (en) | 2020-02-28 |
CN110852450B (en) | 2020-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021143478A1 (en) | Method and apparatus for identifying adversarial sample to protect model security | |
CN109214360B (en) | Construction method and application of face recognition model based on Parasoft Max loss function | |
CN107609493B (en) | Method and device for optimizing human face image quality evaluation model | |
WO2021026805A1 (en) | Adversarial example detection method and apparatus, computing device, and computer storage medium | |
US8797140B2 (en) | Biometric authentication method and biometric authentication apparatus | |
WO2021027336A1 (en) | Authentication method and apparatus based on seal and signature, and computer device | |
WO2021056746A1 (en) | Image model testing method and apparatus, electronic device and storage medium | |
WO2021036014A1 (en) | Federated learning credit management method, apparatus and device, and readable storage medium | |
WO2019136990A1 (en) | Network data detection method, apparatus, computer device and storage medium | |
US11915311B2 (en) | User score model training and calculation | |
CN111783505A (en) | Method and device for identifying forged faces and computer-readable storage medium | |
WO2020082734A1 (en) | Text emotion recognition method and apparatus, electronic device, and computer non-volatile readable storage medium | |
CN111340144B (en) | Risk sample detection method and device, electronic equipment and storage medium | |
CN105335719A (en) | Living body detection method and device | |
US10423817B2 (en) | Latent fingerprint ridge flow map improvement | |
US20200210459A1 (en) | Method and apparatus for classifying samples | |
WO2017075913A1 (en) | Mouse behaviors based authentication method | |
WO2021190046A1 (en) | Training method for gesture recognition model, gesture recognition method, and apparatus | |
US11232182B2 (en) | Open data biometric identity validation | |
JP2020184331A (en) | Liveness detection method and apparatus, face authentication method and apparatus | |
CN114817933A (en) | Method and device for evaluating robustness of business prediction model and computing equipment | |
CN111803956B (en) | Method and device for determining game plug-in behavior, electronic equipment and storage medium | |
US20220215271A1 (en) | Detection device, detection method and detection program | |
CN111368644B (en) | Image processing method, device, electronic equipment and storage medium | |
CN117275076B (en) | Method for constructing face quality assessment model based on characteristics and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20913811 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20913811 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.05.2023) |
|