WO2021164334A1 - Training method and device for an adversarial attack model, method and device for generating adversarial images, electronic device, and storage medium - Google Patents
Training method and device for an adversarial attack model, method and device for generating adversarial images, electronic device, and storage medium
- Publication number
- WO2021164334A1 PCT/CN2020/128009 CN2020128009W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- attack
- image
- training
- adversarial
- model
- Prior art date
Links
- 238000012549 training Methods 0.000 title claims abstract description 210
- 238000000034 method Methods 0.000 title claims abstract description 136
- 230000006870 function Effects 0.000 claims description 53
- 230000009466 transformation Effects 0.000 claims description 49
- 230000015654 memory Effects 0.000 claims description 22
- 238000007639 printing Methods 0.000 claims description 11
- 238000004590 computer program Methods 0.000 claims description 10
- 238000010008 shearing Methods 0.000 claims description 3
- 238000013519 translation Methods 0.000 claims description 3
- 238000004519 manufacturing process Methods 0.000 claims 2
- 238000013528 artificial neural network Methods 0.000 description 32
- 238000010801 machine learning Methods 0.000 description 27
- 238000010586 diagram Methods 0.000 description 16
- 238000005516 engineering process Methods 0.000 description 14
- 238000013473 artificial intelligence Methods 0.000 description 12
- 230000000694 effects Effects 0.000 description 10
- 238000012545 processing Methods 0.000 description 10
- 238000006243 chemical reaction Methods 0.000 description 9
- 238000002474 experimental method Methods 0.000 description 9
- 230000000306 recurrent effect Effects 0.000 description 6
- 230000008569 process Effects 0.000 description 5
- 238000000844 transformation Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 4
- 238000013527 convolutional neural network Methods 0.000 description 3
- 230000007123 defense Effects 0.000 description 3
- 230000007787 long-term memory Effects 0.000 description 3
- 239000011159 matrix material Substances 0.000 description 3
- 230000006403 short-term memory Effects 0.000 description 3
- 230000009471 action Effects 0.000 description 2
- 230000003042 antagonistic effect Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 238000003491 array Methods 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000003058 natural language processing Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012706 support-vector machine Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/566—Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/778—Active pattern-learning, e.g. online learning of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1433—Vulnerability analysis
Definitions
- This application relates to the field of artificial intelligence technology, and in particular to a method and device for training an adversarial attack model, a method and device for generating an adversarial image, an electronic device, and a storage medium.
- Artificial Intelligence (AI) is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
- Artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in ways similar to human intelligence.
- Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, giving machines the capabilities of perception, reasoning, and decision-making.
- Artificial intelligence technology is being applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, drones, robots, smart healthcare, and smart customer service.
- Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
- Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics.
- Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
- Machine learning specializes in the study of how computers simulate or implement human learning behaviors to acquire new knowledge or skills, and reorganize the existing knowledge structure to continuously improve its own performance.
- Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and its applications cover all fields of artificial intelligence.
- Various forms of machine learning models have profoundly changed many areas of artificial intelligence.
- Machine learning models such as Deep Neural Networks (DNNs) are now used for many machine vision tasks.
- Although deep neural networks perform well, they are extremely vulnerable to adversarial attacks. An adversarial attack applies small, artificially computed perturbations to the input of a deep neural network, causing the network to produce wrong output, that is, deceiving the network. Because deep neural networks are vulnerable to adversarial-sample attacks, their defense capabilities must be improved to reduce the possibility of adversarial samples deceiving them.
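The kind of small perturbation described above can be sketched with a fast-gradient-sign-style step. This is a generic illustration only (the application does not prescribe FGSM), and all names in it are hypothetical:

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.03):
    # Fast-gradient-sign-style step: move each pixel slightly in the
    # direction that increases the model's loss, then clip to [0, 1].
    # (Illustrative only; not the method claimed in this application.)
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

# A uniform "gradient" moves every pixel by exactly +eps.
x = np.full((2, 2), 0.5)
x_adv = fgsm_perturb(x, np.ones((2, 2)), eps=0.03)
```

A perturbation of this magnitude is barely visible to a person, yet in practice can be enough to flip a classifier's prediction.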
- A method for training an adversarial attack model, executed by an electronic device.
- The adversarial attack model includes a generator network.
- The training method includes: using the generator network to generate an adversarial attack image based on a training digital image; performing an adversarial attack on a target model based on the adversarial attack image and obtaining an adversarial attack result; obtaining a physical image corresponding to the training digital image; and training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- A method for generating an adversarial image, executed by an electronic device, including: training an adversarial attack model that includes a generator network to obtain a trained adversarial attack model; and using the trained adversarial attack model to generate an adversarial image based on an input digital image. When training the adversarial attack model, the generator network is used to generate an adversarial attack image based on a training digital image; an adversarial attack is performed on a target model based on the adversarial attack image, and an adversarial attack result is obtained; a physical image corresponding to the training digital image is acquired; and the generator network is trained based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- A training device for an adversarial attack model that includes a generator network.
- The training device includes: a generation module for using the generator network to generate an adversarial attack image based on a training digital image; an attack module for performing an adversarial attack on a target model based on the adversarial attack image and obtaining an adversarial attack result; an acquisition module for obtaining a physical image corresponding to the training digital image; and a training module for training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- A device for generating an adversarial image, including: a first training module for training an adversarial attack model that includes a generator network to obtain a trained adversarial attack model; and a generation module for using the trained adversarial attack model to generate adversarial images based on an input digital image. When training the adversarial attack model, the generator network is used to generate an adversarial attack image based on a training digital image; an adversarial attack is performed on a target model based on the adversarial attack image, and an adversarial attack result is obtained; a physical image corresponding to the training digital image is obtained; and the generator network is trained based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- An electronic device including: a processor; and a memory on which computer-readable instructions are stored, where the computer-readable instructions, when executed by the processor, implement the method described above.
- A computer-readable storage medium having one or more computer programs stored thereon, where the one or more computer programs, when executed by a processor, implement the training method of the adversarial attack model described above.
- the embodiments of the present application also provide a computer program product or computer program.
- the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
- The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the method described above.
- FIG. 1 shows a block diagram of an example system to which training of the adversarial attack model according to an embodiment of the present application can be applied;
- FIG. 2A shows a block diagram of an adversarial attack model according to some embodiments of the present application;
- FIG. 2B shows a block diagram of an adversarial attack model according to some embodiments of the present application;
- FIG. 3A shows a block diagram of an adversarial attack model according to some embodiments of the present application;
- FIG. 3B shows a block diagram of an adversarial attack model according to some embodiments of the present application;
- FIG. 4 shows a method for training an adversarial attack model according to some embodiments of the present application, where the adversarial attack model includes a generator network and an arbiter network;
- FIG. 5 shows a method for training an adversarial attack model according to some embodiments of the present application, where the adversarial attack model includes a generator network, an arbiter network, and a geometric transformation module;
- FIG. 6 shows a method for generating an adversarial attack image according to an embodiment of the present application;
- FIG. 7A shows a block diagram of an apparatus for training an adversarial attack model according to some embodiments of the present application;
- FIG. 7B shows a block diagram of an apparatus for generating an adversarial image according to some embodiments of the present application;
- FIGS. 8A to 8C respectively show an original digital image and examples of adversarial images generated by the EOT method, the RP2 method, the D2P method, and the method of the present application;
- FIG. 9 shows a schematic diagram of the distribution of users' answers in experiments using the EOT method, the RP2 method, the D2P method, and the method of the present application;
- FIG. 10 shows a block diagram of an electronic device according to an embodiment of the present application.
- Adversarial attacks are generally divided into two types according to the domain in which they act: digital adversarial attacks and physical adversarial attacks.
- A digital adversarial attack directly inputs digital adversarial samples, such as digital images in the digital world (also called the digital domain or digital space), into a deep neural network to attack it.
- A physical adversarial attack attacks deep neural networks through physical adversarial samples in the physical world (also called the physical domain or physical space).
- The difficulty of physical adversarial attacks is that adversarial samples that are effective in the digital domain (for example, adversarial images) usually lose their attack effect due to image distortion in the conversion from the digital domain to the physical domain.
- The conversion from the digital domain to the physical domain has high uncertainty and is difficult to model accurately.
- The embodiments of the present application provide an adversarial attack model, a training method for the adversarial attack model, generation of adversarial samples (for example, adversarial images) through the adversarial attack model, and a method of using the adversarial samples to train the target model.
- FIG. 1 shows a block diagram of an example system 10 to which the training of an adversarial attack model according to an embodiment of the present application can be applied.
- the system 10 may include a user equipment 110, a server 120 and a training device 130.
- the user equipment 110, the server 120, and the training device 130 may be communicatively coupled to each other through the network 140.
- The user equipment 110 may be any type of electronic device, such as a personal computer (for example, a laptop or desktop computer), a mobile device (for example, a smartphone or tablet computer), a game console, a wearable device, or any other type of electronic device.
- the user equipment 110 may include one or more processors 111 and a memory 112.
- The one or more processors 111 may be any suitable processing device (for example, a processor core, microprocessor, ASIC, FPGA, controller, microcontroller, etc.), and may be a single processor or multiple operably connected processors.
- the memory 112 may include one or more non-transitory computer-readable storage media, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.
- the memory 112 may store data and instructions executed by the processor 111 to make the user equipment 110 perform operations.
- The user equipment 110 may store or include one or more adversarial attack models.
- The user equipment 110 may also store or otherwise include one or more target models.
- The target model refers to the model to be attacked.
- The target model may be, or may otherwise include, various machine learning models, such as neural networks (for example, deep neural networks) or other types of machine learning models (including non-linear models and/or linear models).
- Neural networks may include feedforward neural networks, recurrent neural networks (e.g., long short-term memory recurrent neural networks), convolutional neural networks, or other forms of neural networks.
- One or more adversarial attack models may be received from the server 120 through the network 140, stored in the memory 112 of the user equipment 110, and then used by the one or more processors 111 or implemented in other ways.
- The server 120 may include one or more adversarial attack models.
- The server 120 communicates with the user equipment 110 according to a client-server relationship.
- The adversarial attack model can be implemented by the server 120 as part of a web service. Therefore, one or more adversarial attack models may be stored and implemented at the user equipment 110, and/or one or more adversarial attack models may be stored and implemented at the server 120.
- the server 120 includes one or more processors 121 and a memory 122.
- The one or more processors 121 may be any suitable processing device (for example, a processor core, microprocessor, ASIC, FPGA, controller, microcontroller, etc.), and may be a single processor or multiple operably connected processors.
- the memory 122 may include one or more non-transitory computer-readable storage media, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.
- the memory 122 may store data and instructions executed by the processor 121 to cause the server 120 to perform operations.
- The server 120 may also store or otherwise include one or more target models.
- The target model refers to the model to be attacked.
- The target model may be, or may otherwise include, various machine learning models, such as neural networks (for example, deep neural networks) or other types of machine learning models (including non-linear models and/or linear models).
- Neural networks may include feedforward neural networks, recurrent neural networks (for example, long short-term memory recurrent neural networks), convolutional neural networks, or other forms of neural networks.
- The user equipment 110 and/or the server 120 may interact with the training device 130, communicatively coupled through the network 140, to train the adversarial attack model and/or the target model.
- the training device 130 may be separate from the server 120 or may be a part of the server 120.
- the training device 130 includes one or more processors 131 and a memory 132.
- The one or more processors 131 may be any suitable processing device (for example, a processor core, microprocessor, ASIC, FPGA, controller, microcontroller, etc.), and may be a single processor or multiple operably connected processors.
- the memory 132 may include one or more non-transitory computer-readable storage media, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.
- the memory 132 may store data and instructions executed by the processor 131 to cause the training device 130 to perform operations.
- the training device 130 may include a machine learning engine 133.
- The machine learning engine 133 may use various training or learning techniques to train the adversarial attack model and/or the target model stored at the user equipment 110 and/or the server 120.
- The machine learning engine 133 may apply various techniques (for example, weight decay, dropout, etc.) to improve the generalization ability of the model being trained.
- the machine learning engine 133 may include one or more machine learning platforms, frameworks, and/or libraries, such as TensorFlow, Caffe/Caffe2, Theano, Torch/PyTorch, MXnet, CNTK, etc.
- The machine learning engine 133 can implement the training of the adversarial attack model and/or the target model.
- Figure 1 shows an example system that can be used to implement the present application.
- the user equipment 110 may include a machine learning engine and a training data set.
- the adversarial attack model and/or the target model may be trained and used locally on the user equipment 110, or the adversarial sample may be generated through the trained adversarial attack model.
- FIG. 2A shows an example of an adversarial attack model 20 according to some embodiments of the present application.
- FIG. 2B shows an example of the adversarial attack model 20 with a specific digital image sample.
- The adversarial attack model 20 may include a generator network 201 and an arbiter network 202.
- The adversarial attack model 20 is trained using training samples.
- The training sample may be a digital image sample (referred to as a training digital image).
- the generator network 201 and the arbiter network 202 may include various types of machine learning models.
- Machine learning models can include linear models and non-linear models.
- machine learning models may include regression models, support vector machines, decision tree-based models, Bayesian models, and/or neural networks (eg, deep neural networks).
- The neural network may include a feedforward neural network, a recurrent neural network (for example, a long short-term memory recurrent neural network), a convolutional neural network, or other forms of neural networks.
- The generator network and the arbiter network are referred to as "networks", but they are not necessarily limited to neural networks and can also include other forms of machine learning models.
- the generator network 201 and the arbiter network 202 form a Generative Adversarial Network (GAN).
- The generator network 201 may generate an adversarial attack image based on the training digital image, and the generated adversarial attack image may be output to the arbiter network 202 and the target model 21.
- The target model 21 refers to the model to be attacked.
- The arbiter network 202 may generate a discrimination result based on the physical image and the adversarial attack image generated by the generator network 201.
- The physical image can be obtained by performing the conversion from the digital domain to the physical domain on the training digital image.
- FIG. 2B shows an example form of training digital image to physical image conversion.
- Performing the conversion from the digital domain to the physical domain on the training digital image may include one of the following: printing and then scanning the training digital image to obtain the physical image; or printing and then photographing the training digital image to obtain the physical image.
- For example, the training digital image can be printed by a printer and the printed image scanned by a scanner to obtain the physical image. Alternatively, the training digital image may be printed by a printer and the printed image photographed by a camera to obtain the physical image.
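As noted above, the print-and-recapture chain is hard to model exactly, which is why the application learns it from paired digital/physical images rather than modeling it analytically. Purely as intuition, a crude hand-written stand-in (an assumed combination of blur, brightness shift, and noise; not the application's method) might look like:

```python
import numpy as np

def simulate_print_scan(img, rng=None):
    # Very crude stand-in for print-then-scan: a 3x3 box blur, a mild
    # global brightness/contrast change, and sensor-like Gaussian noise.
    # The real conversion is far more complex; this is illustration only.
    rng = rng or np.random.default_rng(0)
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    blurred = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    shifted = 0.9 * blurred + 0.05
    noisy = shifted + rng.normal(0.0, 0.01, img.shape)
    return np.clip(noisy, 0.0, 1.0)

physical = simulate_print_scan(np.full((8, 8), 0.5))
```

Any such hand-built model misses real printer and camera effects, which is precisely the motivation for learning the conversion jointly with the generator.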
- the training digital image can be mapped to the physical domain at a ratio of 1:1.
- The first objective function, used to train the generator network 201, can be expressed as the adversarial attack loss L_attack(G) = E_x[ J(M(G(x)), y) ], where M(·) denotes the target model, J(·,·) a classification loss, and y the label the attack aims for.
- The adversarial attack image generated by the generator network 201 also needs to be close enough to the noise-free physical image, so that the arbiter network 202 can be deceived.
- The generator network must deceive the arbiter network 202, in line with the requirements of a GAN. Therefore, the second objective function, used to train the arbiter network 202, can be expressed as the decision loss L_D(G, D) = E_{x_p}[log D(x_p)] + E_x[log(1 - D(G(x)))].
- G( ⁇ ) represents the generator network
- D( ⁇ ) represents the arbiter network
- x represents the training digital image input to the generator network
- x_p represents the physical image input to the arbiter network.
- This function indicates that the decision loss is maximized when updating D and minimized when updating G.
- The decision loss can be obtained by referring to the GAN model, but this application is not limited thereto, and various decision losses can be used.
- The adversarial attack model 20 can be trained based on the above-mentioned adversarial attack loss and decision loss to obtain the variables of the generator network 201 and the arbiter network 202.
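Numerically, the decision loss behaves as a standard GAN objective. A minimal sketch, assuming the standard log-loss form (which the text points to via the GAN model), with illustrative numbers:

```python
import numpy as np

def decision_loss(d_real, d_fake):
    # Standard GAN decision loss: E[log D(x_p)] + E[log(1 - D(G(x)))].
    # The arbiter D maximizes it; the generator G minimizes it.
    eps = 1e-12  # numerical guard against log(0)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# A confident, correct arbiter (real -> ~1, fake -> ~0) keeps the loss
# near its maximum of 0; a fooled arbiter (fake -> ~1) drives it far negative.
good_D = decision_loss(np.array([0.99]), np.array([0.01]))
fooled_D = decision_loss(np.array([0.99]), np.array([0.99]))
```

This illustrates why D updates push the loss up while G updates push it down: a generator whose images the arbiter accepts as physical drives the loss strongly negative.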
- The image quality of adversarial attack images generated by the trained adversarial attack model can be significantly improved, so that the adversarial images can be used for effective attacks or for effective training of target models.
- The image generated by the generator network during the training of the adversarial attack model is referred to as an "adversarial attack image", and the image generated by the trained adversarial attack model is referred to as an "adversarial image".
- The influence of noise on the physical image can be limited through the arbiter network.
- The adversarial attack model 20 can be jointly optimized over the conversion process from digital images to physical images and the generation process of adversarial attack images.
- The adversarial attack model 20 can be used in a universal physical attack.
- the training digital image may include a plurality of different digital images obtained by subjecting the original image to different random crops.
- Corresponding multiple physical images can be obtained by performing the conversion from the digital domain to the physical domain on the multiple different digital images.
- the multiple digital images and multiple physical images form multiple sets of digital images and physical images.
- Each of the multiple sets of digital and physical images is used as input for training the adversarial attack model 20, where in each set the digital image is used as the training digital image and the physical image is used as the physical image corresponding to that digital image.
- The trained adversarial attack model 20 can then be used to attack other, different input images.
- The adversarial attack model can thus learn a more widely applicable noise-resistant model.
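The multiple training digital images described above can be produced, for example, by random cropping of one original image. A minimal sketch (the function name and crop parameters are illustrative, not from the application):

```python
import numpy as np

def random_crops(image, crop_h, crop_w, n, seed=0):
    # Cut n random crop_h x crop_w patches from one original image,
    # yielding multiple distinct training digital images. Each crop would
    # then be printed and re-digitized to form its paired physical image.
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    crops = []
    for _ in range(n):
        top = rng.integers(0, h - crop_h + 1)
        left = rng.integers(0, w - crop_w + 1)
        crops.append(image[top:top + crop_h, left:left + crop_w])
    return crops

crops = random_crops(np.arange(100.0).reshape(10, 10), 4, 4, n=5)
```

Training on many such digital/physical pairs, rather than a single image, is what lets the model generalize to inputs it has not seen.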
- FIG. 3A shows an example of an adversarial attack model 30 according to some embodiments of the present application.
- FIG. 3B shows an example of the adversarial attack model 30 with a specific digital image sample.
- The adversarial attack model 30 may include a generator network 301, an arbiter network 302, and a geometric transformation module 303.
- The generator network 301 may generate an adversarial attack image based on the training digital image, and the generated adversarial attack image may be output to the arbiter network 302 and the geometric transformation module 303.
- The target model 31 refers to the model to be attacked.
- FIG. 3B shows an example form of geometric transformation of the adversarial attack image.
- The geometric transformation module 303 may be configured to perform a geometric transformation on the adversarial attack image generated by the generator network 301.
- The geometric transformation may include an affine transformation.
- The geometric transformation may include at least one of translation, scaling, flipping, rotation, and shearing. In this way, the target model 31 can be attacked using the geometrically transformed adversarial attack image.
- the confrontational attack image generated by the generator network 301 needs to deceive the target model 31.
- an EOT (expectation over transformation) method can be used to conduct adversarial attacks.
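As a hedged illustration (not the patented implementation), the EOT idea of averaging the attack loss over a distribution of transformations can be sketched as follows; the transform pool and the loss function here are toy stand-ins for the real geometric transformations and target-model loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy pool of geometric transforms T; a real attack would sample random
# affine transforms (scale, rotation, etc.) instead of simple flips.
TRANSFORMS = [
    lambda img: img,             # identity
    lambda img: np.fliplr(img),  # horizontal flip
    lambda img: np.flipud(img),  # vertical flip
]

def eot_loss(attack_loss, adv_image, n_samples=16):
    """Expectation over transformation: estimate E_{t~T}[ loss(t(adv_image)) ]
    by averaging the attack loss over randomly sampled transforms."""
    picks = rng.integers(0, len(TRANSFORMS), size=n_samples)
    return float(np.mean([attack_loss(TRANSFORMS[i](adv_image)) for i in picks]))
```

Minimizing this expectation (rather than the loss of a single, untransformed image) is what makes the resulting attack robust to the transformations in the pool.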
- the first objective function used to train the generator network 301 can be expressed as: L_1 = E_{t∼T}[ℓ_f(t(G(x)), y)], where T is the distribution of geometric transformations, t is a sampled transformation, f(·) is the target model, G(·) is the generator network, x is the training digital image, y is the target label, and ℓ_f is the adversarial attack loss against the target model.
- the adversarial attack image generated by the generator network 301 also needs to be close enough to the noise-free physical image, so that the discriminator network 302 can be deceived.
- the second objective function used to train the discriminator network 302 can be expressed as: L_2 = log D(x_p) + log(1 − D(G(x))), where G(·) represents the generator network, D(·) represents the discriminator network, x represents the training digital image input to the generator network 301, and x_p represents the physical image input to the discriminator network.
- the function indicates that the decision loss needs to be maximized when updating D, and minimized when updating G.
- the final objective function can be obtained as: min_G max_D L_2 + λ·L_1, where λ is a weight coefficient (called the attack weight).
- attack weight may be a pre-defined hyperparameter.
- the attack weight can range from 5 to 20.
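A minimal sketch of combining the two losses with the attack weight, assuming (as described above) that the final objective is the decision loss plus λ times the attack loss; the function name and default value are illustrative:

```python
def total_objective(decision_loss, attack_loss, attack_weight=10.0):
    """Combine the discriminator (decision) loss and the adversarial attack
    loss into the final objective: decision_loss + λ * attack_loss.
    The text states λ is a predefined hyperparameter, typically in [5, 20]."""
    assert 5.0 <= attack_weight <= 20.0, "attack-weight range suggested in the text"
    return decision_loss + attack_weight * attack_loss
```

The generator is trained to minimize this quantity while the discriminator maximizes the decision-loss term, giving the min-max structure of the final objective.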
- the adversarial attack model 30 including the generator network 301 and the discriminator network 302 can be trained based on the above objective function to obtain the parameters of the generator network 301 and the discriminator network 302.
- the image quality of the adversarial attack images generated by the trained adversarial attack model can be significantly improved, so that the adversarial images can be used for effective attacks or for effective training of target models.
- the influence of noise on the physical image can be limited through the discriminator network, and the conversion process from digital image to physical image and the generation process of the adversarial attack image can be jointly optimized.
- the attack effect can be stabilized under geometric transformations, thereby improving the robustness of the adversarial attack.
- the adversarial attack model 30 can be used in a universal physical attack.
- the training digital image may include a plurality of different digital images obtained by applying different random crops to the original image.
- corresponding multiple physical images can be obtained by performing the physical-domain conversion on the multiple different digital images.
- the multiple digital images and multiple physical images form multiple sets of digital images and physical images.
- each of the multiple sets of digital images and physical images is used as the input of the adversarial attack model 30 for training, wherein the digital image in each set is used as the training digital image, and the physical image is used as the physical image corresponding to that training digital image.
- the trained adversarial attack model 30 can then be used to attack other, different input images.
- the adversarial attack model can thus learn a more widely applicable adversarial noise model.
- some examples of adversarial attack models according to some embodiments of the present application are described above in conjunction with FIGS. 2A and 2B and FIGS. 3A and 3B.
- a method for training an adversarial attack model according to some embodiments of the present application will be described in conjunction with FIG. 4 and FIG. 5.
- FIG. 4 shows a method 40 for training an adversarial attack model according to some embodiments of the present application, where the adversarial attack model includes a generator network and a discriminator network.
- this method can be used to train the adversarial attack model 20 shown in FIG. 2A or FIG. 2B.
- in step S41, a generator network is used to generate an adversarial attack image based on the training digital image.
- the adversarial attack image is generated according to the machine learning model in the generator network.
- in step S43, the adversarial attack image is used to conduct an adversarial attack on the target model, and an adversarial attack result is obtained.
- the adversarial attack result may be the recognition result or the classification result output by the target model.
- step S45 the physical image corresponding to the training digital image is obtained.
- obtaining the physical image corresponding to the training digital image may include one of the following: printing and scanning the training digital image (print-scan) to acquire the physical image; or printing and photographing the training digital image (print-shoot) to acquire the physical image.
- step S45 may include directly receiving or reading the physical image corresponding to the training digital image, wherein the physical image is determined using the example method described above.
- the physical image corresponding to the training digital image can be determined in advance.
- step S45 is shown after steps S41 and S43, the present application is not limited to this.
- step S45 may be executed before step S41 or S43, or executed in parallel.
- in step S47, the generator network is trained based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- the adversarial attack model further includes a discriminator network.
- step S47 may include: obtaining a target label corresponding to the training digital image; determining the adversarial attack loss based on the target label and the adversarial attack result, and training the generator network based on the adversarial attack loss; using the discriminator network to perform image discrimination based on the adversarial attack image and the physical image to determine the discrimination loss; and jointly training the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss.
- the joint training of the generator network and the discriminator network includes: constructing a target loss using the adversarial attack loss and the discrimination loss; and jointly training the generator network and the discriminator network based on the target loss.
- constructing the target loss using the adversarial attack loss and the discrimination loss includes: constructing a first objective function according to the adversarial attack loss; constructing a second objective function according to the discrimination loss; and determining a final objective function according to the first objective function and the second objective function.
- performing joint training on the generator network and the discriminator network includes: training the generator network and the discriminator network in combination based on the final objective function.
- the adversarial attack loss may be determined as ℓ_f(G(x), y), where ℓ_f(G(x), y) represents the adversarial attack loss of an adversarial attack on the target model, f(·) represents the target model, G(·) represents the generator network, x represents the input training digital image, and y represents the target label set relative to the label of the training digital image.
- the discrimination loss may be determined as L_D = log D(x_p) + log(1 − D(G(x))), where L_D represents the discrimination loss of the discriminator network, G(·) represents the generator network, D(·) represents the discriminator network, x represents the training digital image input to the generator network, and x_p represents the physical image input to the discriminator network.
- the first objective function can be determined as L_1 = ℓ_f(G(x), y).
- the second objective function can be determined as L_2 = log D(x_p) + log(1 − D(G(x))).
- the final objective function may be determined based on the first objective function and the second objective function.
- the final objective function can be determined as: min_G max_D L_2 + λ·L_1, where λ is the predefined attack weight.
- the joint training of the generator network and the discriminator network may include: training the generator network and the discriminator network based on the first objective function and the second objective function.
- the joint training of the generator network and the discriminator network may also include training the two networks in parallel, wherein the generator network is trained based on the first objective function and the second objective function, and the discriminator network is trained based on the second objective function.
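The alternating scheme just described can be sketched with scalar stand-ins for the two networks; this is an illustrative toy (quadratic losses, numeric gradients), not the actual training code:

```python
def train_step(g_param, d_param, lr=0.1):
    """One alternating update: G descends on (attack loss + decision loss),
    D ascends on the decision loss, matching the min-max objective above."""
    eps = 1e-5

    def attack_loss(g):       # stands in for ℓ_f(G(x), y)
        return (g - 2.0) ** 2

    def decision_loss(g, d):  # stands in for the GAN decision loss
        return (d - g) ** 2

    # Update G to minimize attack_loss + decision_loss (central-difference gradient).
    g_obj = lambda g: attack_loss(g) + decision_loss(g, d_param)
    g_grad = (g_obj(g_param + eps) - g_obj(g_param - eps)) / (2 * eps)
    g_param -= lr * g_grad

    # Update D to maximize the decision loss (gradient ascent).
    d_grad = (decision_loss(g_param, d_param + eps)
              - decision_loss(g_param, d_param - eps)) / (2 * eps)
    d_param += lr * d_grad
    return g_param, d_param
```

In a real implementation the scalar parameters would be the weight tensors of the generator and discriminator networks, and the gradients would come from backpropagation rather than finite differences.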
- the method for training an adversarial attack model described with reference to FIG. 4 may be implemented in at least one of the user equipment 110, the server 120, the training device 130, and the machine learning engine 133 in FIG. 1, for example.
- FIG. 5 shows a method 50 for training an adversarial attack model according to some embodiments of the present application, where the adversarial attack model includes a generator network, a discriminator network, and a geometric transformation module.
- this method can be used to train the adversarial attack model 30 shown in FIG. 3A or FIG. 3B.
- in step S51, a generator network is used to generate an adversarial attack image based on the training digital image.
- the training digital image is input to the generator network to generate the adversarial attack image.
- in step S53, a geometric transformation is performed on the adversarial attack image to obtain a geometrically transformed adversarial attack image.
- in step S53, the geometric transformation is performed on the adversarial attack image generated by the generator network through the geometric transformation module.
- the geometric transformation may be an affine transformation.
- the affine transformation may include at least one of translation, scaling, flipping, rotation, and shearing.
- the geometrically transformed adversarial attack image can be used to conduct an adversarial attack on the target model.
- An example of geometric transformation is described below.
- for a point p(p_x, p_y) on the adversarial attack image, its homogeneous-coordinate form is p(p_x, p_y, 1). With the geometric transformation represented by the homogeneous transformation matrix A, the transformed coordinates (p_x′, p_y′) satisfy: (p_x′, p_y′, 1)^T = A·(p_x, p_y, 1)^T, where A = [[a_1, a_2, a_3], [a_4, a_5, a_6], [0, 0, 1]].
- a_1 to a_6 are the parameters of the geometric transformation, which reflect transformations such as rotation and scaling of the adversarial attack image.
- the parameters of the geometric transformation can be predefined values.
- the geometric transformation parameters can be set according to different transformation requirements.
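The homogeneous-coordinate transformation described above can be sketched as follows (the 3x3 matrix layout, with a_1..a_6 in the top two rows, is the usual affine convention; the example matrix is illustrative):

```python
import numpy as np

def apply_affine(p, A):
    """Transform the point p = (p_x, p_y) by the homogeneous 3x3 affine
    matrix A: (p_x', p_y', 1)^T = A @ (p_x, p_y, 1)^T."""
    ph = np.array([p[0], p[1], 1.0])   # homogeneous form of p
    out = A @ ph
    return out[0], out[1]

# Example: pure 2x scaling (a_1 = a_5 = 2, the rest as in the identity layout).
A_scale = np.array([[2.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0],
                    [0.0, 0.0, 1.0]])
```

Applying the same matrix to every pixel coordinate of the adversarial attack image yields the geometrically transformed image used for the attack.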
- in step S55, the geometrically transformed adversarial attack image is used to conduct an adversarial attack on the target model to obtain an adversarial attack result.
- the adversarial attack result may be the recognition result or the classification result output by the target model.
- step S57 the physical image corresponding to the training digital image is obtained.
- obtaining the physical image corresponding to the training digital image may include one of the following: printing and scanning the training digital image to obtain the physical image; or printing and shooting the training digital image to obtain the physical image.
- step S57 may include directly receiving or reading a physical image corresponding to the training digital image, wherein the physical image is determined using the example method described above.
- the physical image corresponding to the training digital image can be determined in advance.
- step S57 is shown after steps S51, S53, and S55, the present application is not limited to this.
- step S57 may be executed before one of steps S51, S53, and S55, or executed in parallel.
- in step S59, the generator network and the discriminator network are trained based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- the method for training an adversarial attack model described with reference to FIG. 5 may be implemented in at least one of the user equipment 110, the server 120, and the machine learning engine 133 in FIG. 1, for example.
- the adversarial attack model and its training method according to the embodiments of the present application are described above.
- the method of generating an adversarial image is described below.
- FIG. 6 shows a method for generating an adversarial image according to an embodiment of the present application.
- for clarity, the image generated by the generator network during the training of the adversarial attack model is referred to as the "adversarial attack image", and the image generated by the trained adversarial attack model is referred to as the "adversarial image".
- in step S61, an adversarial attack model including a generator network is trained to obtain a trained adversarial attack model.
- in step S63, the trained adversarial attack model is used to generate an adversarial image based on the input digital image.
- the input digital image can be the same or different from the training digital image.
- the adversarial attack model may be the adversarial attack model 20 described with reference to FIG. 2A or FIG. 2B.
- step S61 may include training the adversarial attack model by the method described with reference to FIG. 4 to obtain a trained adversarial attack model.
- the adversarial attack model may be the adversarial attack model 30 described with reference to FIG. 3A or FIG. 3B.
- step S61 may include training the adversarial attack model by the method described with reference to FIG. 5 to obtain a trained adversarial attack model.
- step S63 may include: using the generator network to generate an adversarial image based on the input digital image.
- step S63 may include: using the generator network to generate a first adversarial image based on the input digital image; and performing a geometric transformation on the first adversarial image to obtain a geometrically transformed second adversarial image, and using the second adversarial image as the adversarial image.
- the generated adversarial image can be used to conduct an adversarial attack on the target model to deceive the target model.
- the generated adversarial image can be used to train the target model to defend against adversarial attacks using the adversarial image.
- the target model can be attacked to determine the stability of the target model.
- the generated adversarial images can also be used to train the target model to improve the target model's defense against such adversarial attacks.
- each block in the flowchart or block diagram may represent a module, segment, or code portion including at least one executable instruction for realizing a specified logical function.
- the functions mentioned in the blocks may occur out of the order indicated in the drawings. For example, depending on the functions involved, two blocks shown in succession may actually be executed substantially simultaneously, or the blocks may sometimes be executed in the reverse order.
- each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
- FIG. 7A shows a block diagram of a training device 70 for an adversarial attack model according to an embodiment of the present application.
- the adversarial attack model includes a generator network.
- the training device 70 can be used to train the various adversarial attack models described above.
- the training device 70 of the adversarial attack model may include:
- the generating module 701, configured to use the generator network to generate an adversarial attack image based on the training digital image;
- the attack module 702, configured to perform an adversarial attack on the target model based on the adversarial attack image, and obtain an adversarial attack result;
- the obtaining module 703, configured to obtain the physical image corresponding to the training digital image; and
- the training module 704, configured to train the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- the attack module 702 is configured to perform a geometric transformation on the adversarial attack image to obtain a geometrically transformed adversarial attack image, and to use the geometrically transformed adversarial attack image to conduct an adversarial attack on the target model to obtain an adversarial attack result.
- the adversarial attack model further includes a discriminator network, where the training module 704 is configured to: obtain a target label corresponding to the training digital image; determine the adversarial attack loss based on the target label and the adversarial attack result, and train the generator network based on the adversarial attack loss; use the discriminator network to perform image discrimination based on the adversarial attack image and the physical image to determine the discrimination loss; and jointly train the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss.
- the training device 70 of the adversarial attack model can be implemented as at least one of the user device 110, the server 120, the training device 130, and the machine learning engine 133 in FIG. 1.
- FIG. 7B shows a device 71 for generating an adversarial image according to an embodiment of the present application.
- the generating device 71 includes:
- the first training module 711, used to train the adversarial attack model including the generator network to obtain the trained adversarial attack model; and
- the generating module 712, used to generate an adversarial image based on the input digital image by using the trained adversarial attack model.
- when the adversarial attack model is trained, the generator network is used to generate an adversarial attack image based on the training digital image; an adversarial attack is conducted on the target model based on the adversarial attack image, and an adversarial attack result is obtained; the physical image corresponding to the training digital image is obtained; and the generator network is trained based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- the device 71 further includes:
- the second training module 713, used to train the target model using the adversarial image to defend against adversarial attacks using the adversarial image.
- the adversarial image generation device 71 may be implemented in at least one of the user equipment 110, the server 120, the training device 130, and the machine learning engine 133 in FIG. 1.
- the following describes experiments based on the adversarial attack model and its training method according to some embodiments of the present application, to illustrate some effects of the adversarial attack model.
- the adversarial attack model described with reference to FIG. 3A or FIG. 3B is used, and the training method described in FIG. 5 is used to train the model.
- although the adversarial attack model of FIG. 3A or FIG. 3B and the training method of FIG. 5 are used for the experiments, the same or similar effects can also be obtained using other embodiments of the present application.
- the target model is a VGG-16 model pre-trained on ImageNet.
- the data set used in this experiment is 100 digital images of different categories randomly selected on ImageNet.
- each digital image is attacked with two different target labels. The two target labels are determined as the image's original label +100 and original label -100, respectively (for example, an image with label 14 is used for two attacks, with target labels 114 and 914).
- a total of 200 attacks on the target model are carried out.
- this experiment uses the adversarial attack model described with reference to FIG. 3 for training and generates adversarial images (also known as adversarial samples) for adversarial attacks.
- the generator network of the adversarial attack model includes 3 convolutional layers, 6 residual blocks, and 2 deconvolutional layers, and the discriminator network includes 5 convolutional layers.
- the scale of the geometric transformation module in the adversarial attack model ranges from 0.7 to 1.3, and the rotation angle ranges from -30° to 30°.
- the geometric transformation matrix A′ after adding random noise can be expressed as A′ = A + ΔA, where ΔA is a random perturbation applied to the transformation parameters a_1 to a_6.
- Gaussian random noise, for example with an intensity of 0.1, is added to the adversarial attack image generated by the generator network before the geometric transformation, to improve the stability of the adversarial attack model against color changes.
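A hedged sketch of this noise-injection step, assuming images scaled to [0, 1]; the function name is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(adv_image, intensity=0.1):
    """Add Gaussian random noise (std = intensity, for images in [0, 1]) to
    the generated adversarial image before the geometric transform, to make
    the attack more stable against color changes. Clipping keeps pixel
    values in the valid range."""
    noisy = adv_image + rng.normal(0.0, intensity, size=adv_image.shape)
    return np.clip(noisy, 0.0, 1.0)
```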
- training the adversarial attack model mainly includes: for each original digital image, printing the original digital image and scanning the printed result to obtain the corresponding physical image, and normalizing the physical image to a pixel size of 288*288; randomly cropping the original digital image and the physical image to generate 50 sets of digital images and physical images, where each set of digital and physical images has a pixel size of 256*256 and uses the same crop window; and using these 50 sets of digital images and physical images for training.
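The paired random-crop preprocessing described above can be sketched as follows; the helper name is illustrative, and the key point is that each digital/physical pair shares the same crop window:

```python
import numpy as np

rng = np.random.default_rng(0)

def paired_random_crops(digital, physical, crop=256, n_pairs=50):
    """Generate aligned random crops of a digital image and its physical
    counterpart (both assumed resized to the same size, e.g. 288x288),
    using the SAME crop window for each pair as the text requires."""
    h, w = digital.shape[:2]
    pairs = []
    for _ in range(n_pairs):
        top = rng.integers(0, h - crop + 1)
        left = rng.integers(0, w - crop + 1)
        pairs.append((digital[top:top + crop, left:left + crop],
                      physical[top:top + crop, left:left + crop]))
    return pairs
```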
- each time, a set of digital images and physical images is input into the generator network and the discriminator network of the adversarial attack model.
- the images generated by the generator network are transformed by the geometric transformation module and then used to attack the target model. Training completes after the set number of epochs. After training, the original digital image is input to the generator network, and the output of the generator network is the final adversarial image used for the attack.
- the EOT method, the RP2 (Robust Physical Perturbations) method, and the D2P (digital domain to physical domain mapping) method are compared with the method of the present application.
- attack success rate (ASR) is used as the evaluation metric.
- Table 1 shows the attack success rates and corresponding confidences of the various methods in the digital domain and the physical domain.
- the PGD (Projected Gradient Descent) method, which is a digital-domain attack, is used as a reference.
- the other three physical-domain attack methods (the EOT, RP2, and D2P methods) are also optimized using the PGD method.
- the three physical-domain attack methods (EOT, RP2, and D2P) limit the noise intensity to 30 (for RGB images with intensity values ranging from 0 to 255).
- a digital-domain attack refers to conducting adversarial attacks with the generated adversarial samples.
- a physical-domain attack refers to conducting adversarial attacks with the images obtained by printing the adversarial samples and scanning the printed results. It can be seen that the attack success rate and confidence of the method of the present application in both the digital domain and the physical domain are significantly higher than those of the other methods.
- Table 2 shows the stability of adversarial samples generated by different methods to geometric transformations in the physical domain.
- the attack effect is measured by printing and scanning the adversarial samples and then applying scale, rotation, and affine transformations.
- the results show that the adversarial samples generated by the method of the present application have the most stable attack effect; their attack success rate (66.0%) is 11.2 percentage points higher than the highest rate among the other methods (54.8%). It is worth noting that the average attack success rate of the adversarial samples generated by the method of the present application after the various geometric transformations in Table 2 is higher than the success rate of the adversarial samples without any transformation.
- the methods of acquiring a physical image include print-scanning a digital image or print-photographing a digital image.
- there are obvious differences between images obtained by scanning and by photographing; for example, photographing is more susceptible to complex external conditions such as lighting and lens distortion.
- in a further experiment, the method of acquiring physical images was changed from print-scan to print-shoot.
- the attack success rate of the method of the present application is still more than 10% higher than that of the other comparison methods.
- FIGS. 8A-8C respectively show examples of original digital images and some adversarial samples generated using the EOT method, the RP2 method, the D2P method, and the method of the present application.
- each user selects the image with the least distortion and the most natural appearance.
- a total of 106 users participated in the test. Since users were not required to make a choice in each question, a total of 10237 answers were received. The distribution of the final answers is shown in Table 5 and Figure 9.
- Fig. 10 shows a block diagram of an electronic device according to an embodiment of the present application.
- the electronic device 100 may include one or more processors 1001 and a memory 1002.
- the memory 1002 may be used to store one or more computer programs.
- the processor 1001 may include various processing circuits, such as but not limited to one or more of a dedicated processor, a central processing unit, an application processor, or a communication processor.
- the processor 1001 may control at least one other component of the electronic device 100, and/or perform communication-related operations or data processing.
- the memory 1002 may include volatile and/or non-volatile memory.
- when the one or more computer programs are executed by the one or more processors 1001, the one or more processors 1001 are caused to implement the methods of the present application as described above.
- the electronic device 100 may be implemented as at least one of the user equipment 110, the server 120, the training device 130, and the machine learning engine 133 in FIG. 1.
- the electronic device 100 in the embodiments of the present application may include devices such as smart phones, tablet personal computers (PCs), servers, mobile phones, video phones, e-book readers, desktop PCs, laptop computers, netbook computers, personal digital assistants (PDAs), portable multimedia players (PMPs), MP3 players, mobile medical devices, cameras, or wearable devices (such as head-mounted devices (HMDs), electronic clothes, electronic bracelets, electronic necklaces, electronic accessories, electronic tattoos, or smart watches).
- module may include a unit configured in hardware, software, or firmware, and/or any combination thereof, and may be used interchangeably with other terms (for example, logic, logic block, component, or circuit).
- a module may be a single integral component or the smallest unit or component that performs one or more functions.
- a module may be implemented mechanically or electronically, and may include, but is not limited to, known or to-be-developed dedicated processors, CPUs, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), or programmable logic devices that perform certain operations.
- at least a part of a device (e.g., a module or its functions) or a method (e.g., an operation or step) according to various embodiments of the present application may be implemented as instructions stored in a computer-readable storage medium (e.g., the memory 112, the memory 114, the memory 122, the memory 132, or the memory 1002) in the form of a program module. When an instruction is executed by a processor (for example, the processor 111, the processor 121, the processor 131, or the processor 1001), the instruction may enable the processor to perform a corresponding function.
- the computer-readable medium may include, for example, a hard disk, a floppy disk, a magnetic medium, an optical recording medium, a DVD, and a magneto-optical medium.
- the instruction may include code created by the compiler or code executable by the interpreter.
- the module or programming module according to various embodiments of the present application may include at least one or more of the above-mentioned components, some of them may be omitted, or other additional components may also be included.
- the operations performed by the modules, programming modules, or other components according to various embodiments of the present application may be performed sequentially, in parallel, repeatedly, or heuristically; at least some operations may be performed in a different order or omitted, or other operations may be added.
Claims (18)
- A training method for an adversarial attack model, executed by an electronic device, the adversarial attack model comprising a generator network, the training method comprising: generating an adversarial attack image based on a training digital image by using the generator network; performing an adversarial attack on a target model based on the adversarial attack image, and obtaining an adversarial attack result; obtaining a physical image corresponding to the training digital image; and training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- The method according to claim 1, wherein the performing an adversarial attack on a target model based on the adversarial attack image and obtaining an adversarial attack result comprises: performing a geometric transformation on the adversarial attack image to obtain a geometrically transformed adversarial attack image; and performing the adversarial attack on the target model by using the geometrically transformed adversarial attack image to obtain the adversarial attack result.
- The training method according to claim 1 or 2, wherein the adversarial attack model further comprises a discriminator network, and the training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image comprises: obtaining a target label corresponding to the training digital image; determining an adversarial attack loss based on the target label and the adversarial attack result, and training the generator network based on the adversarial attack loss; performing image discrimination based on the adversarial attack image and the physical image by using the discriminator network, to determine a discrimination loss; and jointly training the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss.
- The training method according to claim 3, wherein the jointly training the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss comprises: constructing a target loss by using the adversarial attack loss and the discrimination loss; and jointly training the generator network and the discriminator network based on the target loss.
- The training method according to claim 4, wherein the constructing a target loss by using the adversarial attack loss and the discrimination loss comprises: constructing a first objective function according to the adversarial attack loss; constructing a second objective function according to the discrimination loss; and determining a final objective function according to the first objective function and the second objective function; and the jointly training the generator network and the discriminator network based on the target loss comprises: training the generator network and the discriminator network in combination based on the final objective function.
- The training method according to claim 3, wherein the jointly training the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss comprises: constructing a first objective function according to the adversarial attack loss; constructing a second objective function according to the discrimination loss; training the generator network based on the first objective function and the second objective function; and training the discriminator network based on the second objective function.
- The training method according to claim 2, wherein the geometric transformation comprises at least one of translation, scaling, flipping, rotation, and shearing.
- The training method according to any one of claims 1, 2, and 7, wherein the obtaining a physical image corresponding to the training digital image comprises: printing and then scanning the training digital image to obtain the physical image.
- The training method according to any one of claims 1, 2, and 7, wherein the obtaining a physical image corresponding to the training digital image comprises: printing and then photographing the training digital image to obtain the physical image.
- A method for generating an adversarial image, executed by an electronic device, the method comprising: training an adversarial attack model comprising a generator network, to obtain a trained adversarial attack model; and generating an adversarial image based on an input digital image by using the trained adversarial attack model, wherein the training of the adversarial attack model comprises: generating an adversarial attack image based on a training digital image by using the generator network; performing an adversarial attack on a target model based on the adversarial attack image, and obtaining an adversarial attack result; obtaining a physical image corresponding to the training digital image; and training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- The generating method according to claim 10, further comprising: training the target model by using the adversarial image, to defend against adversarial attacks performed with the adversarial image.
- A training apparatus for an adversarial attack model, the adversarial attack model comprising a generator network, the training apparatus comprising: a generation module, configured to generate an adversarial attack image based on a training digital image by using the generator network; an attack module, configured to perform an adversarial attack on a target model based on the adversarial attack image and obtain an adversarial attack result; an obtaining module, configured to obtain a physical image corresponding to the training digital image; and a training module, configured to train the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- The training apparatus according to claim 12, wherein the attack module is configured to: perform a geometric transformation on the adversarial attack image to obtain a geometrically transformed adversarial attack image; and perform the adversarial attack on the target model by using the geometrically transformed adversarial attack image to obtain the adversarial attack result.
- The training apparatus according to claim 12 or 13, wherein the adversarial attack model further comprises a discriminator network, and the training module is configured to: obtain a target label corresponding to the training digital image; determine an adversarial attack loss based on the target label and the adversarial attack result, and train the generator network based on the adversarial attack loss; perform image discrimination based on the adversarial attack image and the physical image by using the discriminator network, to determine a discrimination loss; and jointly train the generator network and the discriminator network based on the adversarial attack loss and the discrimination loss.
- An apparatus for generating an adversarial image, comprising: a first training module, configured to train an adversarial attack model comprising a generator network, to obtain a trained adversarial attack model; and a generation module, configured to generate an adversarial image based on an input digital image by using the trained adversarial attack model, wherein the training of the adversarial attack model comprises: generating an adversarial attack image based on a training digital image by using the generator network; performing an adversarial attack on a target model based on the adversarial attack image, and obtaining an adversarial attack result; obtaining a physical image corresponding to the training digital image; and training the generator network based on the training digital image, the adversarial attack image, the adversarial attack result, and the physical image.
- The generating apparatus according to claim 15, further comprising: a second training module, configured to train the target model by using the adversarial image, to defend against adversarial attacks performed with the adversarial image.
- An electronic device, comprising: a processor; and a memory storing computer-readable instructions that, when executed by the processor, implement the training method for an adversarial attack model according to any one of claims 1 to 9, or the method for generating an adversarial image according to claim 10 or 11.
- A computer-readable storage medium storing one or more computer programs that, when executed by a processor, implement the training method for an adversarial attack model according to any one of claims 1 to 9, or the method for generating an adversarial image according to claim 10 or 11.
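The loop described in claims 1 to 9 — generate an adversarial image with the generator, attack a target model through a geometric transformation, and tie the result to a physical (printed and re-captured) image — can be sketched in miniature. The following NumPy toy is illustrative only: the linear `target_model`, the additive `generator`, and the squared-error fidelity term standing in for the discriminator network and its discrimination loss are all assumptions, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed toy "target model": a linear two-class classifier over flattened 8x8 images.
W = rng.standard_normal((64, 2))

def target_model(x):
    """Return class logits for an 8x8 image."""
    return x.ravel() @ W

def generator(x, w):
    """Generator G: training digital image -> adversarial attack image."""
    return np.clip(x + w, 0.0, 1.0)

def geometric_transform(x):
    """One transform from claim 7 (here: a horizontal flip)."""
    return x[:, ::-1].copy()

# Training digital image and a simulated "physical" (print-then-scan) copy.
x_digital = rng.random((8, 8))
x_physical = np.clip(x_digital + 0.05 * rng.standard_normal((8, 8)), 0.0, 1.0)

target_label = 1       # the label the attack should induce
w = np.zeros((8, 8))   # generator parameters (here: the perturbation itself)

for _ in range(300):
    x_adv = generator(x_digital, w)
    # Adversarial-attack loss (margin form): drive the target-label logit
    # above the other logit. For a linear model its gradient w.r.t. the
    # transformed image is constant; flip it back to map it onto x_adv.
    grad_attack = (W[:, 1 - target_label] - W[:, target_label]).reshape(8, 8)[:, ::-1]
    # Stand-in for the discrimination loss: keep x_adv close to the
    # physical image (gradient of a squared-error fidelity term).
    grad_fidelity = 2.0 * (x_adv - x_physical)
    # Gradient step on the generator parameters (clip treated as identity).
    w -= 0.01 * (grad_attack + 0.1 * grad_fidelity)

x_adv = generator(x_digital, w)
pred = int(np.argmax(target_model(geometric_transform(x_adv))))
```

In the patented scheme both the generator and the discriminator are neural networks trained jointly on a combined objective; the sketch replaces both with closed-form gradients so the whole loop fits in a few lines.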
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/690,797 US20220198790A1 (en) | 2020-02-21 | 2022-03-09 | Training method and apparatus of adversarial attack model, generating method and apparatus of adversarial image, electronic device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010107342.9 | 2020-02-21 | ||
CN202010107342.9A CN111340214B (zh) | 2020-02-21 | 2020-02-21 | Training method and apparatus of adversarial attack model |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/690,797 Continuation US20220198790A1 (en) | 2020-02-21 | 2022-03-09 | Training method and apparatus of adversarial attack model, generating method and apparatus of adversarial image, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021164334A1 (zh) | 2021-08-26 |
Family
ID=71181672
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/128009 WO2021164334A1 (zh) | 2020-11-11 | Training method and apparatus of adversarial attack model, generating method and apparatus of adversarial image, electronic device, and storage medium |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220198790A1 (zh) |
CN (1) | CN111340214B (zh) |
WO (1) | WO2021164334A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115909020A (zh) * | 2022-09-30 | 2023-04-04 | 北京瑞莱智慧科技有限公司 | Model robustness detection method, related apparatus, and storage medium |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114981836A (zh) * | 2020-01-23 | 2022-08-30 | 三星电子株式会社 | Electronic device and control method of electronic device |
CN111340214B (zh) * | 2020-02-21 | 2021-06-08 | 腾讯科技(深圳)有限公司 | Training method and apparatus of adversarial attack model |
AU2020437435B2 (en) * | 2020-03-26 | 2023-07-20 | Shenzhen Institutes Of Advanced Technology | Adversarial image generation method, apparatus, device, and readable storage medium |
CN112035834A (zh) * | 2020-08-28 | 2020-12-04 | 北京推想科技有限公司 | Adversarial training method and apparatus, and application method and apparatus of neural network model |
CN112488172B (zh) * | 2020-11-25 | 2022-06-21 | 北京有竹居网络技术有限公司 | Adversarial attack method and apparatus, readable medium, and electronic device |
CN113177497B (zh) * | 2021-05-10 | 2024-04-12 | 百度在线网络技术(北京)有限公司 | Training method of visual model, and vehicle recognition method and apparatus |
US20230185912A1 (en) * | 2021-12-13 | 2023-06-15 | International Business Machines Corporation | Defending deep generative models against adversarial attacks |
CN114067184B (zh) * | 2022-01-17 | 2022-04-15 | 武汉大学 | Adversarial example detection method and system based on noise pattern classification |
CN115348115B (zh) * | 2022-10-19 | 2022-12-20 | 广州优刻谷科技有限公司 | Attack prediction model training method, attack prediction method, and system for smart homes |
CN115439719B (zh) * | 2022-10-27 | 2023-03-28 | 泉州装备制造研究所 | Deep learning model defense method and model against adversarial attacks |
CN115631085B (zh) * | 2022-12-19 | 2023-04-11 | 浙江君同智能科技有限责任公司 | Active defense method and apparatus for image protection |
CN116702634B (zh) * | 2023-08-08 | 2023-11-21 | 南京理工大学 | Full-coverage stealthy targeted adversarial attack method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351935A1 (en) * | 2016-06-01 | 2017-12-07 | Mitsubishi Electric Research Laboratories, Inc | Method and System for Generating Multimodal Digital Images |
CN110210573A (zh) * | 2019-06-11 | 2019-09-06 | 腾讯科技(深圳)有限公司 | Method, apparatus, terminal, and storage medium for generating adversarial images |
CN110443203A (zh) * | 2019-08-07 | 2019-11-12 | 中新国际联合研究院 | Adversarial example generation method for face spoofing detection systems based on generative adversarial networks |
CN110728629A (zh) * | 2019-09-03 | 2020-01-24 | 天津大学 | Image set augmentation method for adversarial attacks |
CN111340214A (zh) * | 2020-02-21 | 2020-06-26 | 腾讯科技(深圳)有限公司 | Training method and apparatus of adversarial attack model |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11341368B2 (en) * | 2017-04-07 | 2022-05-24 | Intel Corporation | Methods and systems for advanced and augmented training of deep neural networks using synthetic data and innovative generative networks |
CN108510061B (zh) * | 2018-03-19 | 2022-03-29 | 华南理工大学 | Method for synthesizing frontal faces from multiple surveillance videos based on conditional generative adversarial networks |
CN109447263B (zh) * | 2018-11-07 | 2021-07-30 | 任元 | Aerospace anomaly event detection method based on generative adversarial networks |
CN109801221A (zh) * | 2019-01-18 | 2019-05-24 | 腾讯科技(深圳)有限公司 | Training method of generative adversarial network, image processing method, apparatus, and storage medium |
CN110163093B (zh) * | 2019-04-15 | 2021-03-05 | 浙江工业大学 | Adversarial defense method for road sign recognition based on genetic algorithm |
2020
- 2020-02-21 CN CN202010107342.9A patent/CN111340214B/zh active Active
- 2020-11-11 WO PCT/CN2020/128009 patent/WO2021164334A1/zh active Application Filing
2022
- 2022-03-09 US US17/690,797 patent/US20220198790A1/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115909020A (zh) * | 2022-09-30 | 2023-04-04 | 北京瑞莱智慧科技有限公司 | Model robustness detection method, related apparatus, and storage medium |
CN115909020B (zh) * | 2022-09-30 | 2024-01-09 | 北京瑞莱智慧科技有限公司 | Model robustness detection method, related apparatus, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
US20220198790A1 (en) | 2022-06-23 |
CN111340214A (zh) | 2020-06-26 |
CN111340214B (zh) | 2021-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021164334A1 (zh) | Training method and apparatus of adversarial attack model, generating method and apparatus of adversarial image, electronic device, and storage medium | |
Baldassarre et al. | Deep koalarization: Image colorization using cnns and inception-resnet-v2 | |
Qian et al. | Recurrent color constancy | |
WO2020103700A1 (zh) | Micro-expression-based image recognition method and apparatus, and related device | |
WO2019227479A1 (zh) | Method and apparatus for generating face rotation images | |
Gudi et al. | Efficiency in real-time webcam gaze tracking | |
Pang et al. | Dpe: Disentanglement of pose and expression for general video portrait editing | |
CN110968734A (zh) | Pedestrian re-identification method and apparatus based on deep metric learning | |
CN113435264A (zh) | Face recognition adversarial attack method and apparatus based on searching for black-box substitute models | |
Li et al. | Efficient and low-cost deep-learning based gaze estimator for surgical robot control | |
Wang et al. | Domain shift preservation for zero-shot domain adaptation | |
Wang et al. | Improved knowledge distillation for training fast low resolution face recognition model | |
Leksut et al. | Learning visual variation for object recognition | |
Li et al. | Manipllm: Embodied multimodal large language model for object-centric robotic manipulation | |
Skočaj | Robust subspace approaches to visual learning and recognition | |
Salvalaio et al. | Self-adaptive appearance-based eye-tracking with online transfer learning | |
Jin et al. | FedCrack: Federated Transfer Learning With Unsupervised Representation for Crack Detection | |
Lahiri et al. | Improved techniques for GAN based facial inpainting | |
Pareek et al. | Human boosting | |
Xue et al. | Robust landmark‐free head pose estimation by learning to crop and background augmentation | |
Sato et al. | Affine template matching by differential evolution with adaptive two‐part search | |
Zeng et al. | Combined training strategy for low‐resolution face recognition with limited application‐specific data | |
Chen et al. | Conditional adaptation deep networks for unsupervised cross domain image classifcation | |
Ge et al. | Dynamic saliency-driven associative memories based on network potential field | |
Ibrahim et al. | Evaluating the Impact of Emotions and Awareness on User Experience in Virtual Learning Environments for Sustainable Development Education |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20919919 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20919919 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 200223) |
|