CN110851835A - Image model detection method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN110851835A
CN110851835A (application CN201910901599.9A)
Authority
CN
China
Prior art keywords
image
model
detected
sample
original image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910901599.9A
Other languages
Chinese (zh)
Inventor
王健宗 (Wang Jianzong)
黄章成 (Huang Zhangcheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910901599.9A priority Critical patent/CN110851835A/en
Priority to PCT/CN2019/118027 priority patent/WO2021056746A1/en
Publication of CN110851835A publication Critical patent/CN110851835A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577Assessing vulnerabilities and evaluating computer system security
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)

Abstract

A method of image model detection, the method comprising: obtaining an original image sample; inputting the original image sample into a trained mainstream image classification model; performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image; acquiring a first recognition result obtained after the image model to be detected recognizes the original image sample, and acquiring a second recognition result obtained after the image model to be detected recognizes the adversarial image; judging whether the first recognition result is consistent with the second recognition result; and if the first recognition result is consistent with the second recognition result, determining that the image model to be detected successfully recognizes the adversarial image. The invention also provides an image model detection device, an electronic device, and a storage medium. The invention can detect the security of a deep neural network model.

Description

Image model detection method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of intelligent terminals, in particular to an image model detection method and device, electronic equipment and a storage medium.
Background
At present, artificial intelligence is applied in many fields, such as face recognition and voiceprint recognition scenarios, and its core technology is based on machine learning or deep learning.
In practice, although artificial intelligence brings great convenience, potential hidden dangers remain. In the field of image classification, for example, if a picture is maliciously tampered with, the model may misidentify it, which can pose a safety risk to users. This indicates that modern deep neural networks are highly vulnerable to adversarial samples: samples perturbed so slightly that the human visual system cannot detect the perturbation (the picture looks almost the same), yet the perturbation can cause the neural network to completely change its classification of the picture, resulting in recognition errors.
It can be seen that how to detect the security of the deep neural network model is a technical problem to be solved urgently.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an image model detection method, an image model detection apparatus, an electronic device, and a storage medium, which can detect the security of a deep neural network model.
A first aspect of the present invention provides an image model detection method, including:
obtaining an original image sample;
inputting the original image sample into a trained mainstream image classification model;
performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image;
acquiring a first recognition result obtained after the image model to be detected recognizes the original image sample, and acquiring a second recognition result obtained after the image model to be detected recognizes the adversarial image;
judging whether the first recognition result is consistent with the second recognition result;
and if the first recognition result is consistent with the second recognition result, determining that the image model to be detected successfully recognizes the adversarial image.
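The consistency check at the heart of the method above can be sketched as follows (a minimal illustration; the predicate name and the stub classifiers are hypothetical, not from the patent):

```python
def recognizes_adversarial_image(model_predict, original_image, adversarial_image):
    """Return True when the model to be detected gives the same recognition
    result for the clean sample and its adversarial counterpart, i.e. the
    adversarial image was successfully recognized."""
    first_result = model_predict(original_image)      # first recognition result
    second_result = model_predict(adversarial_image)  # second recognition result
    return first_result == second_result
```

A robust model returns the same label for both inputs; a model that is fooled by the perturbation returns different labels, which the patent counts as a misjudgment.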
In a possible implementation manner, the performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image includes:
calculating a perturbation quantity by using the mainstream image classification model and the momentum-based gradient iterative algorithm;
performing convolution smoothing processing on the perturbation quantity;
and adding the processed perturbation quantity to the original image sample to obtain an adversarial image.
In a possible implementation manner, before acquiring the first recognition result obtained after the image model to be detected recognizes the original image sample and acquiring the second recognition result obtained after the image model to be detected recognizes the adversarial image, the method further includes:
acquiring, from a user-end device, the image model to be detected on which model detection needs to be performed;
installing the image model to be detected;
and respectively inputting the original image sample and the adversarial image into the image model to be detected.
In a possible implementation manner, after the performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image, the method further includes:
sending a recognition request carrying the original image sample and the adversarial image to a user-end device, wherein the image model to be detected is installed on the user-end device, and the image model to be detected on the user-end device recognizes the original image sample to obtain a first recognition result and recognizes the adversarial image to obtain a second recognition result.
In one possible implementation, the method further includes:
if the first recognition result is inconsistent with the second recognition result, determining that the image model to be detected is misjudged;
counting the number of misjudgments of the image model to be detected;
calculating the accuracy of the image model to be detected according to the number and the total number of the original image samples;
and determining the safety level of the image model to be detected according to the accuracy.
In one possible implementation, before the acquiring the original image sample, the method further includes:
acquiring a training sample from user equipment needing model detection;
extracting sample characteristics of the training sample;
and inputting the sample characteristics into an open source model frame for training to obtain a trained mainstream image classification model.
In one possible implementation, after the acquiring the original image sample, the method further includes:
and carrying out picture enhancement processing on the original image sample.
The inputting of the original image sample into the trained mainstream image classification model comprises:
and inputting the processed original image sample into a trained mainstream image classification model.
A second aspect of the present invention provides an image model detection apparatus, comprising:
the first acquisition module is used for acquiring an original image sample;
the input module is used for inputting the original image sample into a trained mainstream image classification model;
the generation module is used for performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image;
the second acquisition module is used for acquiring a first recognition result obtained after the image model to be detected recognizes the original image sample, and acquiring a second recognition result obtained after the image model to be detected recognizes the adversarial image;
the judging module is used for judging whether the first recognition result is consistent with the second recognition result;
and the determining module is used for determining that the image model to be detected successfully recognizes the adversarial image if the first recognition result is consistent with the second recognition result.
A third aspect of the invention provides an electronic device comprising a processor and a memory, the processor being configured to implement the image model detection method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image model detection method.
According to the above technical solution, the method can obtain an original image sample, input the original image sample into a trained mainstream image classification model, perform an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image, then acquire a first recognition result obtained after the image model to be detected recognizes the original image sample and a second recognition result obtained after the image model to be detected recognizes the adversarial image, and judge whether the first recognition result is consistent with the second recognition result; if they are consistent, it is determined that the image model to be detected successfully recognizes the adversarial image. In this way, an adversarial attack can be performed on the original image sample through the trained mainstream image classification model to generate an adversarial image, and the image model to be detected is then tested with that adversarial image. The security performance of the image model to be detected can thus be evaluated, making it convenient to improve the image model to be detected according to the detection result and to strengthen its anti-interference capability.
Drawings
FIG. 1 is a flowchart illustrating an image model detection method according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of an image model detection apparatus according to a preferred embodiment of the present invention.
FIG. 3 is a schematic structural diagram of an electronic device implementing a method for detecting an image model according to a preferred embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
FIG. 1 is a flowchart illustrating an image model detection method according to an embodiment of the present invention. The order of the steps in the flowchart may be changed, and some steps may be omitted.
And S11, the electronic equipment acquires an original image sample.
If the original image sample is relatively common, it can be obtained from the network through web crawling; if it is not easily obtained through public channels, it can be obtained from the user-end device, that is, the original image sample is provided by the user rather than being available locally. The original image sample is a clean sample to which no perturbation has been added.
As an optional implementation manner, before step S11, the method further includes:
acquiring a training sample from user equipment needing model detection;
extracting sample characteristics of the training sample;
and inputting the sample characteristics into an open source model frame for training to obtain a trained mainstream image classification model.
In this alternative embodiment, the open-source model framework is based on a mainstream picture classification framework, such as a model framework disclosed in an open-source community, for example ResNet or Inception-V3. During training, training samples, such as face image samples and non-face image samples, need to be obtained in advance from the user-end device; sample features of the training samples, such as face features, are then extracted and input into the open-source model framework for training to obtain a picture classification result. Finally, the parameters of the open-source model framework are continuously updated according to the picture classification result until convergence, yielding the trained mainstream image classification model. The trained mainstream image classification model is subsequently attacked.
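Purely as an illustrative stand-in for the "extract sample features, then train in an open-source framework" step (a real implementation would fine-tune ResNet or Inception-V3 in a deep learning framework; this sketch trains only a minimal softmax classifier on pre-extracted features with plain NumPy, and all names are hypothetical):

```python
import numpy as np

def train_classifier(features, labels, n_classes, lr=0.1, epochs=200):
    """Gradient-descent training of a softmax classifier on sample features.
    features: (N, D) array of features extracted from the training images.
    labels:   (N,) array of integer class ids."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(features.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(features)       # cross-entropy gradient
        W -= lr * features.T @ grad                   # update until convergence
        b -= lr * grad.sum(axis=0)
    return W, b
```

The loop mirrors the patent's description of continuously updating the framework's parameters according to the classification result until convergence.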
A model whose specific structure and parameters are known (such as the trained mainstream image classification model) may be called a white-box model, and a model whose specific structure and parameters are unknown may be called a black-box model. In general, picture-perturbation attack methods are divided into white-box attacks and black-box attacks.
The method can be used for a white-box attack, that is, an attack on a known model, while improving the robustness and transferability of the white-box attack, so that the result of the white-box attack can also successfully attack a model whose specific structure and parameters are unknown.
And S12, inputting the original image sample into the trained mainstream image classification model by the electronic equipment.
After the mainstream image classification model is trained, it also needs to be attacked, for example by a white-box attack or a black-box attack.
As an optional implementation manner, after step S11, the method further includes:
and carrying out picture enhancement processing on the original image sample.
The inputting of the original image sample into the trained mainstream image classification model comprises:
and inputting the processed original image sample into a trained mainstream image classification model.
In order to enable the subsequently obtained adversarial image to better attack the black-box model, the perturbation capability of the image is enhanced so as to simulate a real attack situation, and the original image sample therefore needs to undergo image enhancement processing. Specifically, before inference with the trained mainstream image classification model, the original image sample can first be randomly resized and then randomly padded to a size of 331x331, and this padded size becomes the input size of the mainstream image classification model in place of its usual input size, such as 224x224.
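Under one reading of the enhancement step above, the random-resize-then-random-pad transform could be sketched as follows (a simplified NumPy version using nearest-neighbour resizing; function names and the choice to only enlarge the image are assumptions, not from the patent):

```python
import numpy as np

def random_resize_and_pad(image, pad_size=331):
    """Randomly rescale an image (nearest-neighbour for simplicity), then pad
    it with zeros to pad_size x pad_size at a random offset."""
    h, w = image.shape[:2]
    rng = np.random.default_rng()
    new_h = rng.integers(h, pad_size + 1)  # random size between original and pad_size
    new_w = rng.integers(w, pad_size + 1)
    rows = np.arange(new_h) * h // new_h   # nearest-neighbour index maps
    cols = np.arange(new_w) * w // new_w
    resized = image[rows][:, cols]
    top = rng.integers(0, pad_size - new_h + 1)   # random pad offset
    left = rng.integers(0, pad_size - new_w + 1)
    padded = np.zeros((pad_size, pad_size) + image.shape[2:], dtype=image.dtype)
    padded[top:top + new_h, left:left + new_w] = resized
    return padded
```

Applying this randomized transform before each inference pass makes the resulting perturbation less tied to one fixed input geometry, which is the stated goal of simulating a real attack.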
S13, the electronic device performs an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image.
A white-box attack is used against the white-box model, and the white-box attack adopts a momentum-based gradient iterative algorithm.
Specifically, the performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image includes:
calculating a perturbation quantity by using the mainstream image classification model and the momentum-based gradient iterative algorithm;
performing convolution smoothing processing on the perturbation quantity;
and adding the processed perturbation quantity to the original image sample to obtain an adversarial image.
Wherein the formula of the momentum-based gradient iterative algorithm is as follows:

g_{t+1} = μ·g_t + ∇_x J_θ(x'_t, y) / ‖∇_x J_θ(x'_t, y)‖₁

x'_{t+1} = x'_t + ε·clip_[-10,10](g_{t+1})

where g is the perturbation quantity and g_t is the perturbation quantity at the t-th iteration; μ is the momentum coefficient, which controls how much of the accumulated noise carries over between iterations; J_θ(x'_t, y) denotes inputting x'_t and y into the model function J_θ(x, y) and computing the cross entropy, i.e. the cross-entropy loss between the output of the penultimate fully connected layer of the mainstream image classification model and the class label of the original image sample; the gradient ∇_x J_θ(x'_t, y) is divided by its L1 norm, so that each pixel contributes only the direction of its gradient change to the perturbation update; x'_{t+1} is the original image sample plus the perturbation accumulated over t+1 iterations; ε is the perturbation coefficient, which controls the difference between the perturbed image and the original image sample; and clip_[-10,10](g_{t+1}) clips g_{t+1} to the range [-10, 10].
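The momentum-based iterative update can be sketched directly from these two formulas (a minimal NumPy reading of the update rule; `grad_fn`, which returns the white-box model's gradient ∂J/∂x at the current adversarial input, is a hypothetical callback, not part of the patent):

```python
import numpy as np

def momentum_iterative_attack(x, grad_fn, eps=0.03, mu=1.0, n_iter=100):
    """Accumulate an L1-normalised gradient with momentum mu, then step the
    adversarial image by eps times the clipped accumulated perturbation."""
    g = np.zeros_like(x, dtype=np.float64)   # g_0 = 0
    x_adv = x.astype(np.float64).copy()
    for _ in range(n_iter):
        grad = grad_fn(x_adv)
        # g_{t+1} = mu * g_t + grad / ||grad||_1
        g = mu * g + grad / max(np.abs(grad).sum(), 1e-12)
        # x'_{t+1} = x'_t + eps * clip_[-10,10](g_{t+1})
        x_adv = x_adv + eps * np.clip(g, -10, 10)
    return x_adv
```

With a positive gradient everywhere, each iteration pushes every pixel upward, so the returned image strictly dominates the input; in a real attack `grad_fn` would backpropagate through the trained mainstream image classification model.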
In order to enable the subsequently obtained adversarial image to better attack a black-box model and to enhance the perturbation capability of the image so as to simulate a real attack situation, the perturbation quantity needs to undergo convolution smoothing after it is calculated. Specifically, g_t is convolved with a randomly generated 4x4 Gaussian convolution kernel without changing the dimensions of g_t, so that the smoothed perturbation transfers better.
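A dimension-preserving Gaussian smoothing of the perturbation could look like this (a NumPy sketch with a fixed rather than randomly generated kernel; the "same"-size zero padding is an implementation choice, since the patent only requires that the dimensions of g_t stay unchanged):

```python
import numpy as np

def gaussian_kernel(size=4, sigma=1.0):
    """Normalised 2-D Gaussian kernel of the given size."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def smooth_perturbation(g, kernel):
    """'Same'-size 2-D convolution so the perturbation keeps its dimensions."""
    kh, kw = kernel.shape
    pad_top, pad_left = (kh - 1) // 2, (kw - 1) // 2
    padded = np.pad(g, ((pad_top, kh - 1 - pad_top), (pad_left, kw - 1 - pad_left)))
    out = np.zeros_like(g, dtype=np.float64)
    for i in range(kh):          # sum shifted, kernel-weighted copies
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + g.shape[0], j:j + g.shape[1]]
    return out
```

Because the kernel is normalised, smoothing a constant perturbation leaves its interior values unchanged while spreading sharp per-pixel spikes over their neighbourhood, which is what makes the perturbation more transferable.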
The number of iterations can be set in advance according to multiple tests, and is usually between 100 and 200, so that the perturbation capability of the perturbation quantity is several times stronger than that of a conventional attack method.
S14, the electronic device acquires a first recognition result obtained after the image model to be detected recognizes the original image sample, and acquires a second recognition result obtained after the image model to be detected recognizes the adversarial image.
The image model to be detected is different from the trained mainstream image classification model; a user may have modified a mainstream framework model to obtain the image model to be detected.
The first recognition result obtained after the image model to be detected recognizes the original image sample and the second recognition result obtained after it recognizes the adversarial image can be acquired locally, or they can be acquired from the user-end device on which the image model to be detected recognizes the original image sample and the adversarial image.
Optionally, before step S14, the method further includes:
acquiring, from a user-end device, the image model to be detected on which model detection needs to be performed;
installing the image model to be detected;
and respectively inputting the original image sample and the adversarial image into the image model to be detected to obtain a first recognition result for the original image sample and a second recognition result for the adversarial image.
In this optional embodiment, the image model to be detected on which model detection needs to be performed is acquired from the user-end device and installed on the electronic device, and the original image sample and the adversarial image are then directly input into the image model to be detected on the electronic device, obtaining a first recognition result for the original image sample and a second recognition result for the adversarial image. The whole recognition process is carried out on the electronic device without any processing by the user-end device, which saves the resource consumption of the user-end device and saves the user's time.
Optionally, after the performing an adversarial attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image, the method further includes:
sending a recognition request carrying the original image sample and the adversarial image to a user-end device, wherein the image model to be detected is installed on the user-end device, and the image model to be detected on the user-end device recognizes the original image sample to obtain a first recognition result and recognizes the adversarial image to obtain a second recognition result.
In this optional embodiment, the model to be detected on the user side does not need to be acquired, that is, it does not need to be installed on the electronic device; the user side only needs to provide an Application Programming Interface (API). The electronic device can send the recognition request carrying the original image sample and the adversarial image to the user-end device through the API. After receiving the recognition request, the user-end device can use the image model to be detected to recognize the original image sample and the adversarial image, obtaining a first recognition result for the original image sample and a second recognition result for the adversarial image, and finally return both results to the electronic device through the API. The electronic device can then make a judgment according to the first recognition result and the second recognition result.
The original image sample can be from a public channel, such as a network, or from a user end device.
In this alternative embodiment, details such as the specific model used by the user-end device and any innovative technology the user applies to the model do not need to be actively obtained, so the client's model technology can be kept secret while the security problems of the client's model are still detected, providing improvement guidance for the client's model and security protection for the user.
The user end device is an electronic device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction, and the hardware thereof includes but is not limited to a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like, such as a personal computer, a tablet computer, a personal digital assistant, and the like.
The electronic device is an electronic device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction, and the hardware thereof includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a programmable gate array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like, such as a personal computer, a tablet computer, a personal digital assistant, and the like.
And S15, the electronic equipment judges whether the first recognition result is consistent with the second recognition result, if so, the step S16 is executed, and if not, the process is ended.
For example, if the original image sample and the adversarial sample are a face picture, the first recognition result obtained after the image model to be detected recognizes the original image sample is that the original image sample contains a face; as for the second recognition result obtained after the image model to be detected recognizes the adversarial image, because the adversarial image is an image to which a perturbation has been added, the adversarial image may be recognized as containing a face, or as not containing a face.
S16, the electronic equipment determines that the image model to be detected successfully identifies the confrontation image.
If the first recognition result is consistent with the second recognition result, the image model to be detected has successfully and correctly recognized the adversarial image.
As an optional implementation, the method further comprises:
if the first recognition result is inconsistent with the second recognition result, determining that the image model to be detected is misjudged;
counting the number of misjudgments of the image model to be detected;
calculating the accuracy of the image model to be detected according to the number and the total number of the original image samples;
and determining the safety level of the image model to be detected according to the accuracy.
If the first recognition result is inconsistent with the second recognition result, the image model to be detected has misjudged the adversarial image, which shows that the perturbation of the original image sample successfully interfered with the recognition result of the image model to be detected. It can further be predicted that the image model to be detected cannot defend against adversarial images, is easily attacked successfully, and has poor security.
If there are too many misjudgments, the security of the image model to be detected has a serious problem. Therefore, the number of misjudgments of the image model to be detected needs to be counted, and the accuracy of the image model to be detected is calculated according to this number and the total number of original image samples, where the total number of original image samples is the same as the total number of adversarial images.
Different application scenarios impose different standards on the image model to be detected, and the security level can be set according to the application scenario. Different security levels represent the degree of security of the image model to be detected.
For example, taking the results over 1000 or more tested pictures together: if the accuracy of the image model to be detected drops by 10% on the perturbed pictures, the image model to be detected has a slight security problem; if the accuracy drops by 20%, a moderate security problem; and if the accuracy drops by 30% or more, a serious security problem.
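Using exactly the example thresholds given above (the grading labels and the decision that the drop is measured as clean accuracy minus adversarial accuracy are illustrative readings of the example):

```python
def security_level(clean_accuracy, adversarial_accuracy):
    """Grade the model to be detected by how far its accuracy drops on
    perturbed pictures: >=30% severe, >=20% moderate, >=10% slight."""
    drop = clean_accuracy - adversarial_accuracy
    if drop >= 0.30:
        return "severe"
    if drop >= 0.20:
        return "moderate"
    if drop >= 0.10:
        return "slight"
    return "no significant problem"
```

A deployment would pick thresholds per application scenario, as the preceding paragraph notes, rather than hard-coding these example values.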
In the method flow described in FIG. 1, an original image sample can be obtained and input into a trained mainstream image classification model; an adversarial attack is performed on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain an adversarial image; a first recognition result obtained after the image model to be detected recognizes the original image sample and a second recognition result obtained after it recognizes the adversarial image are then acquired, and whether the two are consistent is judged; if they are consistent, it is determined that the image model to be detected successfully recognizes the adversarial image. In this way, an adversarial attack can be performed on the original image sample through the trained mainstream image classification model to generate an adversarial image, with which the image model to be detected is tested. The security performance of the image model to be detected can thus be evaluated, making it convenient to improve the image model to be detected according to the detection result and to strengthen its anti-interference capability.
The above description is only a specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and it will be apparent to those skilled in the art that modifications may be made without departing from the inventive concept of the present invention, and these modifications are within the scope of the present invention.
FIG. 2 is a functional block diagram of an image model detection apparatus according to a preferred embodiment of the present invention.
In some embodiments, the image model detection apparatus operates in an electronic device. The image model detection means may comprise a plurality of functional modules consisting of program code segments. Program codes of respective program segments in the image model detection apparatus may be stored in a memory and executed by at least one processor to perform some or all of the steps of the image model detection method described in fig. 1.
In this embodiment, the image model detection apparatus may be divided into a plurality of functional modules according to the functions performed by the image model detection apparatus. The functional module may include: the device comprises a first acquisition module 201, an input module 202, a generation module 203, a second acquisition module 204, a judgment module 205 and a determination module 206. The module referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in memory. The functions of the respective modules will be described in detail in the following embodiments.
A first obtaining module 201, configured to obtain an original image sample.
If the original image sample is relatively common, it can be obtained from the network through a web crawling technique; if it is not easily obtained through public channels, it can be obtained from the user's terminal device, i.e., provided by the user rather than held locally. The original image sample is a clean sample to which no perturbation has been added.
An input module 202, configured to input the original image sample into a trained mainstream image classification model.
After the mainstream image classification model is trained, it also needs to be attacked, for example by a white-box attack or a black-box attack.
And the generating module 203 is configured to perform counterattack on the original image sample by using the mainstream image classification model and a gradient iterative algorithm based on momentum to obtain a counterimage.
A white-box attack is applied to the white-box model, and the white-box attack uses a momentum-based gradient iterative algorithm.
Specifically, the using the mainstream image classification model and a gradient iterative algorithm based on momentum to perform counterattack on the original image sample to obtain a counterattack image includes:
calculating the disturbance quantity by using the mainstream image classification model and based on a gradient iterative algorithm of the momentum;
performing convolution smoothing processing on the disturbance quantity;
and adding the processed disturbance quantity to the original image to obtain a confrontation image.
Wherein the formula of the momentum-based gradient iterative algorithm is as follows:

g_{t+1} = μ·g_t + ∇_x J_θ(x′_t, y) / ‖∇_x J_θ(x′_t, y)‖₁

x′_{t+1} = x′_t + ∈·clip_{[-10,10]}(g_{t+1})

wherein g is the disturbance quantity and g_t is the disturbance quantity at the t-th iteration; μ is the momentum coefficient, used to control how the noise changes; J_θ(x′_t, y) is the cross-entropy loss l obtained by inputting x′_t and y into the model function J_θ(x, y), that is, the cross-entropy loss between the output of the second-to-last fully connected layer of the mainstream image classification model and the category of the original image sample; the gradient ∇_x J_θ(x′_t, y) is divided by its L1 norm so that the update to the disturbance quantity has a comparable magnitude at each iteration; x′_{t+1} is the original image sample with the disturbance of the t-th iteration added; ∈ is the disturbance coefficient, used to control the difference between the disturbed image and the original image sample; and clip_{[-10,10]}(g_{t+1}) cuts g_{t+1} to the range [-10, 10].
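The momentum-based gradient iteration can be sketched in a few lines. This is an illustrative rendering under assumptions, not the patented implementation: `grad_fn` is an assumed callable standing in for the gradient of the mainstream model's cross-entropy loss J_θ with respect to the input image, and `eps`, `mu`, and `iters` correspond to ∈, μ, and the iteration count.

```python
import numpy as np

def momentum_attack(x, y, grad_fn, eps=0.03, mu=1.0, iters=100):
    """Momentum-based gradient iterative attack.

    grad_fn(x_adv, y) must return the gradient of the cross-entropy
    loss w.r.t. the image; it stands in for the trained mainstream
    image classification model, which is not shown here.
    """
    g = np.zeros_like(x, dtype=np.float64)  # accumulated disturbance g_t
    x_adv = x.astype(np.float64).copy()     # x'_t, starts at the clean sample
    for _ in range(iters):
        grad = grad_fn(x_adv, y)
        # L1-normalise the gradient, then accumulate it with momentum mu
        g = mu * g + grad / (np.sum(np.abs(grad)) + 1e-12)
        # clip g to [-10, 10] and step by the disturbance coefficient eps
        x_adv = x_adv + eps * np.clip(g, -10, 10)
    return x_adv
```

The clip to [-10, 10] follows the formula above; many published momentum attacks instead take sign(g) at this step, so this detail is specific to the text being described.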
In order for the subsequently obtained countermeasure image to better attack a black-box model, and to strengthen the disturbance so as to simulate a real attack situation, the disturbance quantity needs to be convolution-smoothed after it is calculated. Specifically, a randomly generated 4x4 Gaussian convolution kernel is used to convolve g_t without changing its dimensions, so that the disturbance becomes smoother and transfers better.
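The smoothing step can be sketched as follows. The kernel construction and the shape-preserving ("same") convolution are assumptions, since the text only specifies a randomly generated 4x4 Gaussian kernel that leaves the dimensions unchanged; here the kernel's spread is fixed for reproducibility.

```python
import numpy as np

def gaussian_kernel(size=4, sigma=1.0):
    """Build a size x size Gaussian kernel whose values sum to 1.

    The text speaks of a 'randomly generated' kernel; treating sigma
    as the randomly chosen part is an assumption.
    """
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def smooth_perturbation(g, kernel):
    """Convolve the disturbance g with the kernel while keeping g's
    shape, so the disturbance is smoothed without changing dimensions."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # pad so the output has the same shape as g ('same' convolution)
    padded = np.pad(g, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)), mode="edge")
    out = np.empty_like(g, dtype=np.float64)
    H, W = g.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out
```

Because the kernel sums to 1, a uniform disturbance passes through unchanged, while sharp single-pixel spikes are spread over a 4x4 neighbourhood.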
The iteration number can be tested and set in advance based on a number of trials, usually between 100 and 200, so that the disturbance quantity is several times stronger than that of conventional attack methods.
The second obtaining module 204 is configured to obtain a first recognition result obtained after the original image sample is recognized by the to-be-detected image model, and obtain a second recognition result obtained after the countermeasure image is recognized by the to-be-detected image model.
Wherein the image model to be detected is different from the trained mainstream image classification model; a user may, for example, have modified a mainstream framework model to obtain the image model to be detected.
The first recognition result obtained after the image model to be detected recognizes the original image sample, and the second recognition result obtained after it recognizes the countermeasure image, may both be acquired locally; alternatively, both results may be acquired on the user's terminal device.
The determining module 205 is configured to determine whether the first recognition result is consistent with the second recognition result.
For example, suppose the original image sample and the countermeasure sample are a face picture. The first recognition result, obtained after the image model to be detected recognizes the original image sample, is that the original image sample contains a face. Because the countermeasure image is an image to which a disturbance has been added, the second recognition result, obtained after the image model to be detected recognizes the countermeasure image, may be either that the countermeasure image contains a face or that it does not.
A determining module 206, configured to determine that the image model to be detected successfully identifies the countermeasure image if the first identification result is consistent with the second identification result.
If the first recognition result is consistent with the second recognition result, the image model to be detected successfully and correctly recognizes the confrontation image.
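The judging and determining steps amount to comparing the two recognition results over a set of samples and counting misjudgments; a minimal sketch, where `model_predict` is an assumed stand-in for the image model to be detected:

```python
def evaluate_model(model_predict, originals, adversarials):
    """Compare first and second recognition results over a test set.

    model_predict is a hypothetical callable returning a label for one
    image; a mismatch between the two results counts as a misjudgment.
    """
    misjudged = 0
    for x, x_adv in zip(originals, adversarials):
        first = model_predict(x)       # first recognition result
        second = model_predict(x_adv)  # second recognition result
        if first != second:            # inconsistent -> misjudgment
            misjudged += 1
    total = len(originals)
    accuracy = (total - misjudged) / total
    return misjudged, accuracy
```

The resulting accuracy, compared against the model's accuracy on clean samples, is what feeds the security-level grading described earlier.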
In the image model detection apparatus described in fig. 2, an original image sample may be obtained and input into a trained mainstream image classification model. Using the mainstream image classification model and a momentum-based gradient iterative algorithm, a counterattack is performed on the original image sample to obtain a countermeasure image. Further, a first recognition result obtained after the image model to be detected recognizes the original image sample may be acquired, along with a second recognition result obtained after the image model to be detected recognizes the countermeasure image, and whether the two results are consistent is determined; if they are consistent, it is determined that the image model to be detected has successfully recognized the countermeasure image. In this way, the original image sample can be attacked through the trained mainstream image classification model to generate a countermeasure image, and the image model to be detected can be tested with that countermeasure image. The security performance of the image model to be detected can thus be detected, the model can be improved according to the detection result, and its anti-interference capability enhanced.
FIG. 3 is a schematic structural diagram of an electronic device implementing a method for detecting an image model according to a preferred embodiment of the invention. The electronic device 3 comprises a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
Those skilled in the art will appreciate that the schematic diagram shown in fig. 3 is merely an example of the electronic device 3 and does not constitute a limitation of it; the device may include more or fewer components than those shown, combine certain components, or arrange components differently. For example, the electronic device 3 may further include an input/output device, a network access device, and the like.
The at least one Processor 32 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The processor 32 may be a microprocessor or the processor 32 may be any conventional processor or the like, and the processor 32 is a control center of the electronic device 3 and connects various parts of the whole electronic device 3 by various interfaces and lines.
The memory 31 may be used to store the computer program 33 and/or the module/unit, and the processor 32 may implement various functions of the electronic device 3 by running or executing the computer program and/or the module/unit stored in the memory 31 and calling data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data) created according to the use of the electronic device 3, and the like. Further, the memory 31 may include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid state storage device.
With reference to fig. 1, the memory 31 of the electronic device 3 stores a plurality of instructions to implement an image model detection method, and the processor 32 can execute the plurality of instructions to implement:
obtaining an original image sample;
inputting the original image sample into a trained mainstream image classification model;
using the mainstream image classification model and a gradient iterative algorithm based on momentum to carry out counterattack on the original image sample to obtain a counterimage;
acquiring a first identification result obtained after the original image sample is identified by the image model to be detected, and acquiring a second identification result obtained after the countermeasure image is identified by the image model to be detected;
judging whether the first recognition result is consistent with the second recognition result;
and if the first recognition result is consistent with the second recognition result, determining that the image model to be detected successfully recognizes the confrontation image.
Specifically, the processor 32 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the electronic device 3 depicted in fig. 3, an original image sample may be obtained and input into a trained mainstream image classification model. Using the mainstream image classification model and a momentum-based gradient iterative algorithm, a counterattack is performed on the original image sample to obtain a countermeasure image. Further, a first recognition result obtained after the image model to be detected recognizes the original image sample may be acquired, along with a second recognition result obtained after the image model to be detected recognizes the countermeasure image, and whether the two results are consistent is determined; if they are consistent, it is determined that the image model to be detected has successfully recognized the countermeasure image. In this way, the original image sample can be attacked through the trained mainstream image classification model to generate a countermeasure image, and the image model to be detected can be tested with that countermeasure image. The security performance of the image model to be detected can thus be detected, the model can be improved according to the detection result, and its anti-interference capability enhanced.
The integrated modules/units of the electronic device 3 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, and Read-Only Memory (ROM).
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An image model detection method, characterized in that the method comprises:
obtaining an original image sample;
inputting the original image sample into a trained mainstream image classification model;
using the mainstream image classification model and a gradient iterative algorithm based on momentum to carry out counterattack on the original image sample to obtain a counterimage;
acquiring a first identification result obtained after the original image sample is identified by the image model to be detected, and acquiring a second identification result obtained after the countermeasure image is identified by the image model to be detected;
judging whether the first recognition result is consistent with the second recognition result;
and if the first recognition result is consistent with the second recognition result, determining that the image model to be detected successfully recognizes the confrontation image.
2. The method of claim 1, wherein the performing a challenge attack on the original image sample by using the mainstream image classification model and a momentum-based gradient iterative algorithm to obtain a challenge image comprises:
calculating the disturbance quantity by using the mainstream image classification model and based on a gradient iterative algorithm of the momentum;
performing convolution smoothing processing on the disturbance quantity;
and adding the processed disturbance quantity to the original image to obtain a confrontation image.
3. The method of claim 1, wherein before obtaining a first recognition result obtained after the image model to be detected recognizes the original image sample and obtaining a second recognition result obtained after the image model to be detected recognizes the countermeasure image, the method further comprises:
acquiring an image model to be detected, which needs to be subjected to model detection, from user equipment;
installing the image model to be detected;
and respectively inputting the original image sample and the confrontation image into the image model to be detected.
4. The method of claim 1, wherein after the using the mainstream image classification model and the momentum-based gradient iterative algorithm to perform a counterattack on the original image sample to obtain a counterimage, the method further comprises:
the identification request carrying the original image sample and the confrontation image is sent to user equipment, wherein the user equipment is provided with an image model to be detected, the image model to be detected on the user equipment identifies the original image sample to obtain a first identification result, and identifies the confrontation image to obtain a second identification result.
5. The method according to any one of claims 1 to 4, further comprising:
if the first recognition result is inconsistent with the second recognition result, determining that the image model to be detected is misjudged;
counting the number of misjudgments of the image model to be detected;
calculating the accuracy of the image model to be detected according to the number and the total number of the original image samples;
and determining the safety level of the image model to be detected according to the accuracy.
6. The method of any of claims 1 to 4, wherein prior to said obtaining the original image sample, the method further comprises:
acquiring a training sample from user equipment needing model detection;
extracting sample characteristics of the training sample;
and inputting the sample characteristics into an open source model frame for training to obtain a trained mainstream image classification model.
7. The method of any of claims 1 to 4, wherein after the obtaining of the original image sample, the method further comprises:
carrying out picture enhancement processing on the original image sample;
and inputting the original image sample subjected to the image enhancement processing into a trained mainstream image classification model.
8. An image model detection apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring an original image sample;
the input module is used for inputting the original image sample into a trained mainstream image classification model;
the generation module is used for carrying out counterattack on the original image sample by using the mainstream image classification model and a gradient iterative algorithm based on momentum to obtain a counterattack image;
the second acquisition module is used for acquiring a first identification result obtained after the original image sample is identified by the image model to be detected and acquiring a second identification result obtained after the confrontation image is identified by the image model to be detected;
the judging module is used for judging whether the first identification result is consistent with the second identification result;
and the determining module is used for determining that the image model to be detected successfully identifies the confrontation image if the first identification result is consistent with the second identification result.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the image model detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements an image model detection method as claimed in any one of claims 1 to 7.
CN201910901599.9A 2019-09-23 2019-09-23 Image model detection method and device, electronic equipment and storage medium Pending CN110851835A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910901599.9A CN110851835A (en) 2019-09-23 2019-09-23 Image model detection method and device, electronic equipment and storage medium
PCT/CN2019/118027 WO2021056746A1 (en) 2019-09-23 2019-11-13 Image model testing method and apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910901599.9A CN110851835A (en) 2019-09-23 2019-09-23 Image model detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110851835A true CN110851835A (en) 2020-02-28

Family

ID=69596011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910901599.9A Pending CN110851835A (en) 2019-09-23 2019-09-23 Image model detection method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110851835A (en)
WO (1) WO2021056746A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476228A (en) * 2020-04-07 2020-07-31 海南阿凡题科技有限公司 White-box confrontation sample generation method for scene character recognition model
CN111723865A (en) * 2020-06-19 2020-09-29 北京瑞莱智慧科技有限公司 Method, apparatus and medium for evaluating performance of image recognition model and attack method
CN111866004A (en) * 2020-07-27 2020-10-30 中国工商银行股份有限公司 Security assessment method, apparatus, computer system, and medium
CN112488172A (en) * 2020-11-25 2021-03-12 北京有竹居网络技术有限公司 Method, device, readable medium and electronic equipment for resisting attack
CN112507811A (en) * 2020-11-23 2021-03-16 广州大学 Method and system for detecting face recognition system to resist masquerading attack
CN112560039A (en) * 2020-12-25 2021-03-26 河南交通职业技术学院 Computer safety protection method
CN112613543A (en) * 2020-12-15 2021-04-06 重庆紫光华山智安科技有限公司 Enhanced policy verification method and device, electronic equipment and storage medium
CN112907552A (en) * 2021-03-09 2021-06-04 百度在线网络技术(北京)有限公司 Robustness detection method, device and program product for image processing model
CN113111833A (en) * 2021-04-23 2021-07-13 中国科学院深圳先进技术研究院 Safety detection method and device of artificial intelligence system and terminal equipment
CN113378118A (en) * 2020-03-10 2021-09-10 百度在线网络技术(北京)有限公司 Method, apparatus, electronic device, and computer storage medium for processing image data
CN113807400A (en) * 2021-08-17 2021-12-17 西安理工大学 Hyperspectral image classification method, system and equipment based on anti-attack
CN114510715A (en) * 2022-01-14 2022-05-17 中国科学院软件研究所 Model functional safety testing method and device, storage medium and equipment
CN114724014A (en) * 2022-06-06 2022-07-08 杭州海康威视数字技术股份有限公司 Anti-sample attack detection method and device based on deep learning and electronic equipment
WO2022222143A1 (en) * 2021-04-23 2022-10-27 中国科学院深圳先进技术研究院 Security test method and apparatus for artificial intelligence system, and terminal device
CN115439377A (en) * 2022-11-08 2022-12-06 电子科技大学 Method for enhancing resistance to image sample migration attack

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656813B (en) * 2021-07-30 2023-05-23 深圳清华大学研究院 Image processing method, system, equipment and storage medium based on attack resistance

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image significance detection method based on antagonism network
CN107025284A (en) * 2017-04-06 2017-08-08 中南大学 The recognition methods of network comment text emotion tendency and convolutional neural networks model
CN108257116A (en) * 2017-12-30 2018-07-06 清华大学 A kind of method for generating confrontation image
CN109492582A (en) * 2018-11-09 2019-03-19 杭州安恒信息技术股份有限公司 A kind of image recognition attack method based on algorithm confrontation sexual assault

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10319076B2 (en) * 2016-06-16 2019-06-11 Facebook, Inc. Producing higher-quality samples of natural images
US10636141B2 (en) * 2017-02-09 2020-04-28 Siemens Healthcare Gmbh Adversarial and dual inverse deep learning networks for medical image analysis
CN108537271B (en) * 2018-04-04 2021-02-05 重庆大学 Method for defending against sample attack based on convolution denoising self-encoder
CN108615048B (en) * 2018-04-04 2020-06-23 浙江工业大学 Defense method for image classifier adversity attack based on disturbance evolution
CN109165671A (en) * 2018-07-13 2019-01-08 上海交通大学 Confrontation sample testing method based on sample to decision boundary distance
CN110245598B (en) * 2019-06-06 2020-10-09 北京瑞莱智慧科技有限公司 Countermeasure sample generation method, apparatus, medium, and computing device
CN110222831B (en) * 2019-06-13 2022-05-17 百度在线网络技术(北京)有限公司 Robustness evaluation method and device of deep learning model and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image significance detection method based on antagonism network
WO2018028255A1 (en) * 2016-08-11 2018-02-15 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial network
CN107025284A (en) * 2017-04-06 2017-08-08 中南大学 The recognition methods of network comment text emotion tendency and convolutional neural networks model
CN108257116A (en) * 2017-12-30 2018-07-06 清华大学 A kind of method for generating confrontation image
CN109492582A (en) * 2018-11-09 2019-03-19 杭州安恒信息技术股份有限公司 A kind of image recognition attack method based on algorithm confrontation sexual assault

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378118A (en) * 2020-03-10 2021-09-10 百度在线网络技术(北京)有限公司 Method, apparatus, electronic device, and computer storage medium for processing image data
CN113378118B (en) * 2020-03-10 2023-08-22 百度在线网络技术(北京)有限公司 Method, apparatus, electronic device and computer storage medium for processing image data
CN111476228A (en) * 2020-04-07 2020-07-31 海南阿凡题科技有限公司 White-box confrontation sample generation method for scene character recognition model
CN111723865A (en) * 2020-06-19 2020-09-29 北京瑞莱智慧科技有限公司 Method, apparatus and medium for evaluating performance of image recognition model and attack method
CN111866004A (en) * 2020-07-27 2020-10-30 中国工商银行股份有限公司 Security assessment method, apparatus, computer system, and medium
CN112507811A (en) * 2020-11-23 2021-03-16 广州大学 Method and system for detecting face recognition system to resist masquerading attack
CN112488172A (en) * 2020-11-25 2021-03-12 北京有竹居网络技术有限公司 Method, device, readable medium and electronic equipment for resisting attack
CN112613543A (en) * 2020-12-15 2021-04-06 重庆紫光华山智安科技有限公司 Enhanced policy verification method and device, electronic equipment and storage medium
CN112560039A (en) * 2020-12-25 2021-03-26 河南交通职业技术学院 Computer safety protection method
CN112560039B (en) * 2020-12-25 2023-04-18 河南交通职业技术学院 Computer safety protection method
CN112907552A (en) * 2021-03-09 2021-06-04 百度在线网络技术(北京)有限公司 Robustness detection method, device and program product for image processing model
CN112907552B (en) * 2021-03-09 2024-03-01 百度在线网络技术(北京)有限公司 Robustness detection method, device and program product for image processing model
CN113111833A (en) * 2021-04-23 2021-07-13 中国科学院深圳先进技术研究院 Safety detection method and device of artificial intelligence system and terminal equipment
WO2022222143A1 (en) * 2021-04-23 2022-10-27 中国科学院深圳先进技术研究院 Security test method and apparatus for artificial intelligence system, and terminal device
CN113111833B (en) * 2021-04-23 2022-11-25 中国科学院深圳先进技术研究院 Safety detection method and device of artificial intelligence system and terminal equipment
CN113807400A (en) * 2021-08-17 2021-12-17 西安理工大学 Hyperspectral image classification method, system and equipment based on anti-attack
CN113807400B (en) * 2021-08-17 2024-03-29 西安理工大学 Hyperspectral image classification method, hyperspectral image classification system and hyperspectral image classification equipment based on attack resistance
CN114510715A (en) * 2022-01-14 2022-05-17 中国科学院软件研究所 Model functional safety testing method and device, storage medium and equipment
CN114724014B (en) * 2022-06-06 2023-06-30 杭州海康威视数字技术股份有限公司 Deep learning-based adversarial sample attack detection method and device and electronic equipment
CN114724014A (en) * 2022-06-06 2022-07-08 杭州海康威视数字技术股份有限公司 Deep learning-based adversarial sample attack detection method and device and electronic equipment
CN115439377A (en) * 2022-11-08 2022-12-06 电子科技大学 Method for enhancing resistance to transferable adversarial attacks on image samples

Also Published As

Publication number Publication date
WO2021056746A1 (en) 2021-04-01

Similar Documents

Publication Publication Date Title
CN110851835A (en) Image model detection method and device, electronic equipment and storage medium
CN109948408B (en) Activity test method and apparatus
CN111723865B (en) Method, apparatus and medium for evaluating performance of image recognition models and attack methods
CN108416343B (en) Face image recognition method and device
CN109815797B (en) Living body detection method and apparatus
CN112200081A (en) Abnormal behavior identification method and device, electronic equipment and storage medium
CN111160110A (en) Method and device for identifying anchor based on face features and voice print features
CN112668453B (en) Video identification method and related equipment
CN105405130A (en) Cluster-based license image highlight detection method and device
CN111507320A (en) Detection method, device, equipment and storage medium for kitchen violation behaviors
CN111241873A (en) Image reproduction detection method, training method of model thereof, payment method and payment device
TWI803243B (en) Method for expanding images, computer device and storage medium
Fang et al. Backdoor attacks on the DNN interpretation system
CN116311214A (en) License plate recognition method and device
Chen et al. Image splicing forgery detection using simplified generalized noise model
CN110688878B (en) Living body identification detection method, device, medium, and electronic device
CN112699811A (en) Living body detection method, apparatus, device, storage medium, and program product
CN115223022B (en) Image processing method, device, storage medium and equipment
CN111163332A (en) Video pornography detection method, terminal and medium
CN114898137A (en) Black-box adversarial sample attack method, device, equipment and medium for face recognition
CN114021136A (en) Back door attack defense system for artificial intelligence model
CN109409325B (en) Identification method and electronic equipment
CN114463799A (en) Living body detection method and device and computer readable storage medium
CN111563276A (en) Webpage tampering detection method, detection system and related equipment
CN113221820B (en) Object identification method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40019488
Country of ref document: HK
SE01 Entry into force of request for substantive examination