WO2023019970A1 - Attack detection method and device - Google Patents

Attack detection method and device

Info

Publication number
WO2023019970A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
attack detection
physical
data set
samples
Prior art date
Application number
PCT/CN2022/085391
Other languages
English (en)
French (fr)
Inventor
唐文
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP22857289.7A (published as EP4375860A1)
Publication of WO2023019970A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55: Detecting local intrusion or implementing counter-measures
    • G06F 21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Definitions

  • the present application relates to the field of artificial intelligence (AI), in particular to an attack detection method and device.
  • deep neural networks (DNN) are widely used in computer vision (CV), speech recognition, natural language processing (NLP), and other fields.
  • the attacker uses digital adversarial attacks or physical adversarial attacks to attack the application model in order to steal the parameter configuration or data of the application model.
  • in digital adversarial attacks, the attacker can control bit-level data to attack the application model; in physical adversarial attacks, the attacker constructs an adversarial example based on the real physical world to attack the application model.
  • the application model uses a model-related static defense method to detect physical adversarial attacks.
  • the static defense method is a model retraining method or defensive distillation.
  • the static defense method relies on the reconstruction of the application model, which leads to a decrease in the accuracy of the application model in processing samples. Therefore, how to detect physical adversarial attacks has become an urgent problem to be solved.
  • the present application provides an attack detection method and device, which solve the problem that the static defense method reconstructs the application model and thereby reduces the accuracy with which the application model processes samples.
  • the embodiment of the present application provides an attack detection method, which can be applied to a terminal device, or to a server that can support the terminal device in implementing the method (for example, the server includes a chip system). The method includes: First, the attack detection model obtains an inference request, which carries a data set to be processed of the application model; the data set to be processed includes one or more samples. Second, the attack detection model detects whether there are physical adversarial samples in the data set to be processed. Finally, if there are physical adversarial samples in the data set to be processed, the attack detection model performs protection processing on the application model.
  • This embodiment uses an attack detection model different from the application model to detect whether there are physical adversarial samples in the inference request. Since the application model does not need to resist physical adversarial attacks itself, it is not necessary to perform model retraining or defensive distillation on the application model, avoiding the reduction in the accuracy of the application model caused by reconstructing it.
  • the attack detection model can also use the detected physical adversarial samples for training to optimize the attack detection capability of the attack detection model and improve the accuracy of physical adversarial attack detection.
  • the attack detection model is determined according to a training data set, and the training data set includes multiple physical adversarial samples and multiple standard samples for the application model.
  • the processing capability of the computing device dedicated to model training and generation is stronger than that of the terminal.
  • when the training process of the attack detection model is executed by the computing device (such as a server), the training time of the attack detection model will be shorter and the training efficiency higher.
  • the terminal can use the attack detection model to determine whether there are physical adversarial samples in the inference request, and then perform protection processing on the application model when the inference request contains physical adversarial samples. Since the application model does not need to resist physical adversarial attacks itself, the server does not need to perform model retraining or defensive distillation on the application model, avoiding the reduction in accuracy caused by reconstructing the application model.
  • the attack detection model detects whether there are physical adversarial samples in the data set to be processed, including: for each sample included in the data set to be processed, the attack detection model outputs the security information of the sample; the security information is used to indicate the confidence that the sample contains a physical adversarial perturbation. If the confidence of the sample reaches the first threshold, the attack detection model identifies the sample as a physical adversarial sample against the application model. For example, if only one sample is included in the data set to be processed and that sample is identified as a physical adversarial sample, the attack detection model may treat the inference request as an attack request against the application model.
  • the security information of the sample is obtained by a feature detection module included in the attack detection model.
  • since the attack detection model can perform physical adversarial attack detection on the samples carried by the inference request, it takes over from the application model the task of resisting physical adversarial attacks; therefore, the application model does not need to be reconstructed to resist physical adversarial attacks, which avoids the reduction in accuracy caused by the reconstruction process of the application model in common technologies.
  • the attack detection model detects whether there are physical adversarial samples in the data set to be processed, further including: the attack detection model outputs the detection result of the data set to be processed according to the security information of the multiple samples included in the data set to be processed.
  • in this way, the attack detection model can determine the detection result of the physical adversarial attack detection for the inference request according to the confidence that each of the multiple samples contains a physical adversarial perturbation; this prevents the attack detection model from deciding that the inference request is an attack request merely because a small number of physical adversarial samples appear accidentally in the data set to be processed, and improves the accuracy of physical adversarial attack detection.
  • the attack detection model outputs the detection result of the data set to be processed according to the security information of the multiple samples included in the data set to be processed, including: the attack detection model stores the physical adversarial samples in a sequence detection module contained in the attack detection model; and when the number of physical adversarial samples among the multiple samples is greater than or equal to a first number, the sequence detection module determines that the inference request is an attack request.
  • in this way, the sequence detection module identifies the inference request as an attack request only on sufficient evidence, which prevents the feature detection module from wrongly flagging the inference request as an attack request on the basis of a single or a small number of physical adversarial samples and improves the recognition accuracy of the attack detection model, as illustrated in the sketch below.
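  • The two-stage decision described above can be illustrated with a minimal Python sketch; the feature_detector callable, the threshold values, and all names below are illustrative assumptions rather than text from the patent:

      # Sketch: two-stage detection of physical adversarial samples.
      # feature_detector(sample) -> confidence in [0, 1] is assumed to exist.
      def detect_attack(samples, feature_detector,
                        first_threshold=0.4,  # per-sample confidence threshold (assumed value)
                        first_number=3):      # minimum count of adversarial samples (assumed value)
          # Feature detection module: flag samples whose confidence reaches the first threshold.
          adversarial = [s for s in samples if feature_detector(s) >= first_threshold]
          # Sequence detection module: the inference request is an attack request only
          # if enough samples are adversarial, avoiding misjudgment on isolated hits.
          is_attack_request = len(adversarial) >= first_number
          return adversarial, is_attack_request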
  • the attack detection model forwards the inference request to the application model, so that the application model can perform AI processing on the inference request.
  • before the application model receives the inference request, the attack detection model has already performed physical adversarial attack detection on it, ensuring that the inference request obtained by the application model does not carry physical adversarial samples; this greatly reduces the probability of the application model being attacked and improves the security of the application model.
  • the attack detection model performs protection processing on the application model, including: the attack detection model blocks the application model from processing the inference request. For example, the protected application model does not need to make any changes.
  • the attack detection model determines whether the input samples contained in the inference request are physical adversarial samples and feeds the result back to the corresponding access control mechanism; if a physical adversarial attack is found, the attack detection model blocks it.
  • blocking refers to setting the output information of the application model to be invalid.
  • the attack detection model blocks the application model from processing the inference request, including: the attack detection model sets the processing result output by the application model as an invalid result. Since the processing result output by the application model is set as an invalid result, the AI processing performed by the application model on physical adversarial samples produces no valid output, and the attacker cannot obtain the model parameters or other data of the application model, improving the security of the application model.
  • blocking refers to discarding the input information of the application model; for example, the attack detection model blocks the application model from processing the inference request by discarding the inference request. Since the attack detection model determines that the inference request contains physical adversarial samples and discards it, the application model cannot obtain the physical adversarial samples used by the attacker for physical adversarial attacks; that is, the application model has no input and will not process the physical adversarial samples, and the attacker cannot obtain the model parameters or other data of the application model, which improves the security of the application model.
  • since the protected application model does not need to be modified in any way yet is still defended against physical adversarial attacks by the attack detection model, the accuracy of the application model is not reduced by reconstruction.
  • the attack detection method further includes: the attack detection model records an alarm log, and the alarm log is used to indicate that the inference request includes a physical adversarial sample.
  • the alarm log can be used for subsequent protection of the protected application model. For example, within a certain period of time, a small number of physical adversarial samples in an inference request will not affect the application model; that is, a small number of physical adversarial samples in a single inference request will generally not be judged as a physical adversarial attack on the application model. The security of the application model is affected only when the application model is continuously attacked by physical adversarial samples across multiple inference requests. In this case, the attack detection model can merely record the alarm log, so that the application model is protected only when the physical adversarial attack persists; this avoids disabling the application model because of a misjudgment by the attack detection model, and improves the availability of the application model.
  • an embodiment of the present application provides an attack detection device, where the attack detection device includes various modules for executing the attack detection method in the first aspect or any possible implementation manner of the first aspect.
  • the device may be realized by a software module; it may also be hardware with corresponding functions, such as a server, a terminal, and the like.
  • the attack detection device has the function of implementing the behaviors in the method examples of any one of the above first aspects.
  • the functions can be implemented by hardware, or by executing corresponding software by hardware, for example, the attack detection device is applied to a server, or a device that supports the server to implement the above attack detection method.
  • the aforementioned hardware or software includes one or more modules corresponding to the aforementioned functions.
  • the attack detection device is applied to an attack detection model, and the attack detection device includes: a communication unit for obtaining an inference request; the inference request carries a data set to be processed of the application model, and the data set to be processed includes one or more samples.
  • a detection unit is configured to detect whether there is a physical adversarial example in the data set to be processed.
  • the protection unit is configured to perform protection processing on the application model if there is a physical adversarial example in the data set to be processed.
  • the attack detection model is determined according to a training data set, and the training data set includes multiple physical adversarial samples and multiple standard samples for the application model.
  • the detection unit is specifically configured to, for each sample included in the data set to be processed, output the security information of the sample; the security information is used to indicate the confidence that the sample contains a physical adversarial perturbation.
  • the detection unit is specifically configured to identify the sample as a physical adversarial sample for the application model if the confidence degree of the sample reaches a first threshold.
  • the security information of the sample is obtained by a feature detection module included in the attack detection model.
  • the detection unit is further configured to output a detection result of the data set to be processed according to the security information of the samples included in the data set to be processed.
  • the detection unit is specifically configured to store the physical adversarial samples in the sequence detection module included in the attack detection model.
  • the detection unit is specifically configured to determine that the inference request is an attack request if the number of physical adversarial samples in the multiple samples is greater than or equal to the first number.
  • the protection unit is specifically configured to block the application model from processing the inference request.
  • the protection unit is specifically used to set the processing result output by the application model as an invalid result.
  • the protection unit is specifically configured to discard the inference request.
  • the attack detection apparatus further includes: an alarm unit, configured to record an alarm log, and the alarm log is used to indicate that the inference request includes a physical adversarial sample.
  • embodiments of the present application provide an attack detection system, where the attack detection system includes various models for executing the attack detection method in the first aspect or any possible implementation manner of the first aspect.
  • the system can be realized by software modules; it can also be hardware with corresponding functions, such as the first device and the second device, etc., for example, the first device is a server and the second device is a terminal.
  • the attack detection system has the function of implementing the behaviors in the method examples of any one of the first aspects above.
  • the functions can be implemented by hardware, or by executing corresponding software on the hardware, for example, the attack detection system is applied to a server, or a device that supports the server to implement the above attack detection method.
  • the attack detection system includes: a first device and a second device, an attack detection model is deployed in the first device, and an application model is deployed in the second device.
  • the first device acquires an inference request from the client; the inference request carries a data set to be processed of an application model, and the data set to be processed includes one or more samples. The first device detects whether there is a physical adversarial sample in the data set to be processed; if physical adversarial samples exist in the data set to be processed, the first device performs protection processing on the application model deployed in the second device.
  • the first device and the second device refer to different processing units running on the same physical device, and in this case, the attack detection system refers to the physical device.
  • the physical device may refer to a server, a mobile phone, a tablet computer, a personal computer or other computing devices with model processing capabilities.
  • the physical device is a server
  • the first device refers to a first processor in the server
  • the second device refers to a second processor in the server that is different from the first processor
  • the physical device is a processor in the server, the first device refers to a first processing unit in the processor, and the second device refers to a second processing unit in the processor that is different from the first processing unit.
  • an embodiment of the present application provides a server; the server includes a processor and an interface circuit, and the interface circuit is used to receive signals from devices other than the server and transmit them to the processor, or to send signals from the processor to devices other than the server; the processor implements the operation steps of the method described in the first aspect or any possible implementation manner of the first aspect through a logic circuit or by executing code instructions.
  • the embodiment of the present application provides a terminal; the terminal includes a processor and an interface circuit, the interface circuit is used to receive signals from devices other than the terminal and transmit them to the processor, or to send signals from the processor to devices other than the terminal; the processor implements the operation steps of the method described in the first aspect or any possible implementation manner of the first aspect through a logic circuit or by executing code instructions.
  • embodiments of the present application provide a computer-readable storage medium in which computer programs or instructions are stored; when the computer programs or instructions are executed by a processor in the server, they implement the operation steps of the method described in the first aspect or any possible implementation manner of the first aspect.
  • the embodiments of the present application provide a computer program product; the computer program product includes instructions, and when the computer program product runs on the server or terminal, the server or terminal executes the instructions, so as to implement the attack detection method in the first aspect or any possible implementation manner of the first aspect.
  • the embodiments of the present application provide a chip; the chip includes a logic circuit and an interface circuit, the interface circuit is used to input information and/or output information, and the logic circuit is used to perform the method in the above aspects or the possible implementation manners of the aspects, to process the input information and/or generate the output information.
  • the interface circuit may obtain an inference request, and the logic circuit may implement the attack detection method described in the first aspect or possible implementation manners of the first aspect for the inference request.
  • FIG. 1 is a schematic structural diagram of an attack detection system provided by the present application
  • FIG. 2 is a schematic diagram of acquisition of an attack detection model provided by the present application
  • FIG. 3 is a first schematic flow diagram of an attack detection method provided by the present application.
  • FIG. 4 is a second schematic flow diagram of an attack detection method provided by the present application.
  • FIG. 5 is a third schematic flow diagram of an attack detection method provided by the present application.
  • FIG. 6 is a schematic structural diagram of an attack detection device provided by the present application.
  • FIG. 7 is a schematic structural diagram of a physical device provided by the present application.
  • an embodiment of the present application provides an attack detection method, which includes: first, the attack detection model obtains an inference request, and the inference request carries a data set to be processed of the application model, and the data set to be processed includes one or more samples. Second, the attack detection model detects whether there are physical adversarial examples in the dataset to be processed. Finally, if there are physical adversarial samples in the data set to be processed, the attack detection model performs protection processing on the application model.
  • This embodiment uses an attack detection model different from the application model to detect whether there are physical adversarial samples in the inference request. Since the application model does not need to resist physical adversarial attacks itself, it is not necessary to perform model retraining or defensive distillation on the application model, avoiding the reduction in the accuracy of the application model caused by reconstructing it.
  • the attack detection model can also use the detected physical adversarial samples for training to optimize the attack detection capability of the attack detection model and improve the accuracy of physical adversarial attack detection.
  • FIG. 1 is a schematic structural diagram of an attack detection system provided by the present application.
  • the attack detection system includes a data center and a plurality of terminals (terminal 111 to terminal 113 shown in Figure 1); the data center can communicate with the terminals through a network, which can be the Internet or another network.
  • the network may include one or more network devices, for example, the network devices may be routers or switches.
  • the data center includes one or more servers, such as the server 120 shown in FIG. 1 , such as an application server supporting application services.
  • the application servers can provide video services, image services, and other AI processing services based on videos or images.
  • the server 120 refers to a server cluster deployed with multiple servers; the server cluster may have a rack, and the rack can establish communication among the multiple servers through a wired connection, such as a universal serial bus (USB) or a peripheral component interconnect express (PCIe) high-speed bus.
  • the server 120 may also acquire data from a terminal, perform AI processing on the data, and send the AI processing result to a corresponding terminal.
  • the AI processing may refer to using an AI model to perform object recognition, target detection, etc. on data, and may also refer to obtaining an AI model that meets requirements based on samples collected by the terminal.
  • the data center can also include other physical devices with AI processing capabilities, such as mobile phones, tablets or other devices.
  • a terminal may also be called a terminal device, user equipment (UE), a mobile station (MS), a mobile terminal (MT), etc.
  • the terminal can be a mobile phone (terminal 111 shown in Figure 1), a facial recognition payment device with a mobile payment function (terminal 112 shown in Figure 1), a camera device with data (such as image or video) collection and processing functions (terminal 113 shown in Figure 1), etc.
  • the terminal can also be a tablet computer (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, etc.
  • the embodiment of the present application does not limit the specific technology and specific device form adopted by the terminal device.
  • the user can obtain the AI model and the like stored in the server 120 through the terminal.
  • the AI model may be a model for performing operations such as target detection, object recognition, or classification on data.
  • user 1 uses the AI model deployed in the terminal 111 to realize functions such as mobile phone face recognition and fingerprint recognition.
  • the user 2 uses the AI model deployed in the terminal 112 to realize functions such as facial recognition payment and object classification (such as product classification).
  • the AI model deployed in the terminal 113 may be used to implement functions such as object detection.
  • Fig. 1 is only a schematic diagram and should not be construed as limiting the present application.
  • the embodiments of the present application do not limit the application scenarios of the terminal and the server.
  • this application provides a training method for an attack detection model, as shown in Figure 2, which is a schematic diagram of obtaining an attack detection model provided by this application. The training method of the attack detection model may be executed by the server 120, or by any one of the terminals 111 to 113; in some possible examples, the training method may also be executed by other devices.
  • the attack detection model training method is executed by the server 120 as an example for illustration.
  • the process of acquiring the attack detection model may include the following steps S210-S230.
  • the server 120 acquires a training data set.
  • the training data set includes multiple physical adversarial samples and multiple standard samples for the application model.
  • a standard sample refers to a sample that meets the input requirements of the application model, and there is no adversarial attack in this sample. In some cases, a standard sample is called a normal sample.
  • an adversarial attack refers to adding noise that is difficult to detect to the input data, causing the application model to make wrong judgments on the input data; the added noise is called an adversarial perturbation, and a sample obtained after adding the noise is called an adversarial example.
  • Adversarial attacks include physical adversarial attacks and digital adversarial attacks.
  • Digital adversarial attack refers to the computer controlling bit-level data changes in samples, thereby generating digital adversarial samples with digital adversarial perturbations, resulting in errors in the processing of the digital adversarial samples by the application model.
  • a physical adversarial attack refers to constructing, in the real world where the deep learning model is deployed, a physical adversarial perturbation (such as an occluder in front of a face in an image, which interferes with face recognition) so as to build a physical adversarial sample that causes the application model to process that sample erroneously.
  • Physical adversarial samples refer to samples with physical adversarial perturbations constructed by attackers in the real world.
  • the physical adversarial samples can be samples in different scenarios, such as target detection, object recognition, object classification, or other possible scenarios, such as facial recognition payment, video surveillance, access control detection, etc.
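  • For clarity, the two attack classes above can be written in the standard formulation used in the adversarial machine learning literature (an addition here, not text from the patent): an adversarial example is a perturbed input

      x_{\mathrm{adv}} = x + \delta, \qquad \|\delta\|_p \le \epsilon, \qquad f(x_{\mathrm{adv}}) \ne f(x)

  where f is the application model, \delta is the adversarial perturbation, and \epsilon bounds the perturbation so that it is hard to detect. A digital adversarial attack applies \delta directly to bit-level (for example, pixel) data, whereas a physical adversarial attack realizes \delta as a real-world object (such as the occluder mentioned above) before the sample is captured.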
  • the server 120 can mark the physical adversarial samples as positive samples and the standard samples as negative samples.
  • the server 120 further marks the adversarial perturbation or attack mode in the physical adversarial samples.
  • the server 120 uses the training data set to train the first model, and judges whether the first model converges.
  • the first model is determined based on the application scenario or field to which the application model is applicable.
  • the first model and the application model are neural network models of the same type.
  • the application model is a convolutional neural network (Convolutional Neural Network, CNN) model
  • the first model may also be a CNN model.
  • the application model is a Recurrent Neural Network (RNN) model or a Transformer-based model
  • the first model may also be an RNN model or a Transformer-based model.
  • the first model and the application model are different types of neural network models.
  • the application model is a CNN model
  • the first model may be an RNN model.
  • the application model may have multiple types of networks.
  • for example, the application model is a composite neural network with both CNN and RNN parts.
  • the first model can have one or more types of neural network.
  • the server 120 may set training (hyper) parameters such as loss function and optimization method for the first model based on the application model, and train the first model with the help of the above-mentioned training data set.
  • the training method includes but is not limited to: using an existing pre-trained model as the above-mentioned first model, and performing fine-tuning training on the pre-trained model.
  • the server 120 uses the training data set to train the first model for multiple times until the first model converges.
  • the convergence of the first model may mean that the number of times the server 120 trains the first model reaches a threshold (such as 50,000 times), or the detection accuracy of the first model on physical confrontation samples and standard samples reaches a threshold (such as 99%), This application is not limited to this.
  • the server 120 uses the first model as an attack detection model.
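  • A minimal PyTorch-style sketch of S220 and S230 follows; the convergence criteria mirror the examples above (a step budget of 50,000 or a detection accuracy of 99%), while the optimizer, loss, and all identifiers are illustrative assumptions:

      import torch

      def train_attack_detection_model(first_model, loader,
                                       max_steps=50_000, acc_threshold=0.99):
          # S220: train the first model on the training data set until it converges.
          opt = torch.optim.Adam(first_model.parameters())
          loss_fn = torch.nn.BCEWithLogitsLoss()  # binary: positive vs. negative samples
          step = 0
          while step < max_steps:
              correct, total = 0, 0
              for x, y in loader:  # y: 1 = physical adversarial (positive), 0 = standard (negative)
                  opt.zero_grad()
                  logits = first_model(x).squeeze(-1)
                  loss = loss_fn(logits, y.float())
                  loss.backward()
                  opt.step()
                  step += 1
                  correct += ((logits > 0) == y.bool()).sum().item()
                  total += y.numel()
              if correct / total >= acc_threshold:  # detection-accuracy convergence criterion
                  break
          return first_model  # S230: the converged first model is used as the attack detection model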
  • the process of training the attack detection model may be executed by the server or by the terminal, which is not limited in the present application.
  • the attack detection model may also be obtained through training by other computing devices, and the other computing device sends the trained attack detection model to the server or terminal provided in the embodiment of the present application.
  • the other computing device may include at least one processor, and the processor may be an integrated circuit chip with signal processing capability.
  • the processor can be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it can also be a digital signal processor (Digital Signal Processing, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), Field Programmable Gate Array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the processor can implement the above S210-S230 and possible sub-steps thereof.
  • the multiple processors may cooperate to implement the above S210-S230 and possible sub-steps thereof, for example, the multiple processors include a first processor and a second processor, The first processor may implement the above S210 and S220, and the second processor may implement the above S230.
  • the processing capability of the computing device dedicated to model training and generation is stronger than that of the terminal.
  • when the training process of the attack detection model is executed by the computing device (such as the above-mentioned server 120), the training time of the attack detection model will be shorter and the training efficiency higher.
  • the terminal can use the attack detection model to determine whether there are physical adversarial samples in the inference request, and then perform protection processing on the application model if the inference request contains physical adversarial samples. Since the application model does not need to resist physical adversarial attacks itself, the server does not need to perform model retraining or defensive distillation on the application model, avoiding the reduction in accuracy caused by reconstructing the application model.
  • the attack detection model can also use the detected physical adversarial samples for training to optimize the attack detection capability of the attack detection model and improve the accuracy of attack detection.
  • the server can also train on physical adversarial samples based on a deep neural network to obtain a feature detection module, which is a part of the above-mentioned attack detection model; furthermore, the server combines the feature detection module with other modules to obtain the attack detection model.
  • this embodiment provides a possible specific implementation manner.
  • the server uses the latest physical adversarial attack methods disclosed in the industry to generate corresponding physical adversarial samples (videos, photos) as positive samples in different environments (indoors, outdoors, different lighting, different faces, etc.).
  • Adversarial examples can lead to false positives in AI-based object recognition systems.
  • the server uses manual or automated methods to further mark the areas where physical adversarial perturbations exist in the above physical adversarial samples (videos, photos).
  • the server acquires normal samples (videos, photos) of corresponding objects in the same or similar environments (indoors, outdoors, different lighting, etc.) as negative samples.
  • the server selects an appropriate DNN architecture, sets training (hyper)parameters such as the loss function and optimization method, and uses the positive and negative samples marked in the above steps (the training data set) in a data-driven approach to train the corresponding attack detection model.
  • the attack detection model can detect physical adversarial perturbations in videos and pictures with high accuracy, so as to distinguish physical adversarial inputs (physical adversarial samples) from normal inputs (standard samples), and gives the corresponding confidence or probability of the judgment. A sketch of the dataset construction in this embodiment follows.
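  • A small sketch of the dataset construction described in this embodiment; the field names, the region format, and the environment tags are illustrative assumptions:

      def build_training_set(adversarial_items, standard_items):
          # Positive samples: videos/photos carrying physical adversarial perturbations,
          # with the perturbed areas marked manually or automatically.
          dataset = [{
              "data": item["data"],
              "label": 1,
              "perturbation_regions": item.get("regions", []),
              "environment": item.get("environment"),  # indoor/outdoor, lighting, etc.
          } for item in adversarial_items]
          # Negative samples: normal samples of corresponding objects captured in the
          # same or similar environments.
          dataset += [{"data": item["data"], "label": 0} for item in standard_items]
          return dataset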
  • FIG. 3 is a first schematic flow diagram of an attack detection method provided by the present application.
  • the attack detection method is executed by an attack detection model D31, which is used to detect physical adversarial attacks against an application model D32.
  • for the training process of the attack detection model D31, reference may be made to the relevant content in FIG. 2 above, which will not be repeated here.
  • in a possible case, the attack detection model D31 and the application model D32 are deployed on the same physical device, such as any one of the terminals 111 to 113 shown in FIG. 1, the server 120, or other processing equipment.
  • the attack detection model D31 and the application model D32 are deployed on different physical devices.
  • for example, the attack detection model D31 is deployed on the terminal 111, and the application model D32 is deployed on the server 120 communicating with the terminal 111.
  • the management of the attack detection model D31 and the application model D32 can be implemented by the AI application gateway D30, which may be a physical machine for managing multiple AI models, or a virtual machine (VM) or a container; the VM or container refers to a software process that implements the functions of the AI application gateway D30.
  • when the AI application gateway D30 is a VM or a container, it can be deployed in the aforementioned terminal or server, which is not limited in this application.
  • the above AI application gateway D30, attack detection model D31 and application model D32 are deployed on one physical device as an example for illustration.
  • the attack detection method provided in this embodiment includes the following steps.
  • the attack detection model D31 acquires an inference request.
  • the inference request may be generated by the terminal where the attack detection model D31 is deployed, or may be sent by another device communicating with the terminal.
  • the reasoning request carries a data set to be processed of the application model D32, and the data set to be processed includes one or more samples.
  • the above-mentioned samples refer to the data to be processed by the application model D32, such as pictures, videos, audio, or text.
  • the sample may be a picture or a video that needs to be detected.
  • the sample may be a picture or video that needs to be recognized.
  • the sample may be audio or text that needs to be analyzed.
  • the inference request also carries an indication of the type of AI processing required, such as the above-mentioned object recognition or target detection.
  • the inference request acquired by the attack detection model D31 may be forwarded by the AI application gateway D30.
  • the attack detection model D31 and the application model D32 are deployed in parallel, and the AI application gateway D30 forwards the inference request to the application model D32.
  • the attack detection model D31 and the application model D32 are deployed serially; after the attack detection model D31 detects that the inference request does not carry physical adversarial samples, it forwards the inference request to the application model D32.
  • in this way, the attack detection model D31 has already performed physical adversarial attack detection on the inference request, ensuring that the inference request obtained by the application model D32 does not carry physical adversarial samples, which greatly reduces the probability of the application model D32 being attacked and improves the security of the application model D32.
  • the attack detection model D31 detects whether there is a physical adversarial example in the data set to be processed.
  • the above S320 includes the following content: for each sample included in the data set to be processed, the attack detection model D31 outputs the security information of the sample, and the security information is used to indicate the confidence that the sample contains a physical adversarial perturbation; if the confidence of the sample reaches the first threshold, the attack detection model D31 identifies the sample as a physical adversarial sample against the application model D32.
  • the confidence level reaching the first threshold means that the confidence level is greater than or equal to the first threshold, and the first threshold can be set according to different application scenarios. For example, in the scenario of facial recognition payment, the first threshold is 40%; as another example, in the scenario of access control detection, the first threshold is 60%.
  • Table 1 provides a possible situation regarding the confidence indicated by the above security information:

    Table 1
    Sample      Confidence of containing a physical adversarial perturbation
    sample 1    12%
    sample 2    13%
    sample 3    65%
    sample 4    5%

  • in this case, the attack detection model D31 identifies sample 3 as a physical adversarial sample.
  • Table 1 is only an example provided by this embodiment; the data set to be processed may include only one sample, such as sample 3 shown in Table 1. When sample 3 is identified as a physical adversarial sample by the attack detection model D31, the attack detection model D31 may treat the inference request as an attack request against the application model D32, as the sketch below illustrates.
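  • Applying the first threshold to the confidences in Table 1 can be sketched as follows; the 40% value reuses the facial-recognition-payment example above and is an assumption, not a value tied to Table 1 by the patent:

      confidences = {"sample 1": 0.12, "sample 2": 0.13, "sample 3": 0.65, "sample 4": 0.05}
      first_threshold = 0.40  # assumed, per the facial-recognition-payment example
      adversarial = [name for name, c in confidences.items() if c >= first_threshold]
      print(adversarial)  # ['sample 3']: only sample 3 is identified as adversarial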
  • since the attack detection model can perform physical adversarial attack detection on the samples carried by the inference request, it takes over from the application model the task of resisting physical adversarial attacks; therefore, the application model does not need to be reconstructed to resist physical adversarial attacks, which avoids the reduction in accuracy caused by the reconstruction process of the application model in common technologies.
  • the above S320 may further include: the attack detection model D31 outputs the detection result of the data set to be processed according to the security information of the multiple samples included in the data set to be processed.
  • in this case, the attack detection model D31 determines that the data set to be processed is provided by an attacker, and treats the inference request as an attack request.
  • in this way, the attack detection model can determine the detection result of the physical adversarial attack detection for the inference request according to the confidence that each of the multiple samples contains a physical adversarial perturbation; this prevents the attack detection model from deciding that the inference request is an attack request merely because a small number of physical adversarial samples appear accidentally in the data set to be processed, and improves the accuracy of physical adversarial attack detection.
  • the attack detection model D31 includes a feature detection module D311 and a sequence detection module D312.
  • the feature detection module D311 is used to identify and detect the features included in a sample. In some possible examples, the feature detection module D311 can also identify and mark the attack mode of the physical adversarial perturbation included in the sample.
  • the sequence detection module D312 is used to cache one or more physical adversarial samples.
  • the above S320 may include the following steps S321-S323.
  • the feature detection module D311 outputs the security information of the sample.
  • the feature detection module D311 identifies the sample as a physical adversarial sample for the application model D32.
  • the feature detection module D311 stores the physical adversarial example in the sequence detection module D312 included in the attack detection model D31.
  • the sequence detection module includes a vector sequence formed from the detection results corresponding to successive inference requests of a certain user or group of users. As shown in FIG. 4, multiple sequences (sequence 1 and sequence 2) may be stored in the sequence detection module D312, and each sequence may be generated by a different user.
  • for example, an inference request may include a video, each frame image in the video is a sample in the inference request, and each sample may show the face of the same or a different user. During a physical adversarial attack, the attacker can therefore place multiple physical adversarial samples generated from different users in one inference request.
  • for example, sequence 1 stored by the sequence detection module D312 holds the physical adversarial samples generated by user 1 (sample 1 and sample 3), and sequence 2 holds the physical adversarial samples generated by user 2 (sample 2).
  • the sequence detection module D312 can use methods such as statistical analysis or threshold judgment to analyze the above detection-result sequence, and judge whether the input of that sequence/user contains physical adversarial perturbations or whether a physical adversarial attack is in progress. Please continue to refer to FIG. 4; the above S320 may further include the following step S323.
  • the sequence detection module D312 determines that the inference request is an attack request.
  • in this way, the sequence detection module identifies the inference request as an attack request only on sufficient evidence, which prevents the feature detection module from wrongly flagging the inference request as an attack request on the basis of a single or a small number of physical adversarial samples, and improves the recognition accuracy of the attack detection model; a per-user sketch follows.
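  • A per-user sketch of the sequence detection module D312; the class name, the first_number value, and the per-user keying are illustrative assumptions:

      from collections import defaultdict

      class SequenceDetector:
          def __init__(self, first_number=2):  # assumed value of the first number
              self.sequences = defaultdict(list)  # user id -> cached physical adversarial samples
              self.first_number = first_number

          def store(self, user, sample):
              # S322: cache each detected physical adversarial sample in the user's sequence.
              self.sequences[user].append(sample)

          def is_attack_request(self, user):
              # S323: threshold judgment over the user's detection-result sequence.
              return len(self.sequences[user]) >= self.first_number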
  • the attack detection model D31 forwards the inference request to the application model D32, so that the application model D32 performs AI processing on the inference request.
  • before the application model receives the inference request, the attack detection model has already performed physical adversarial attack detection on it, ensuring that the inference request obtained by the application model does not carry physical adversarial samples; this greatly reduces the probability of the application model being attacked and improves the security of the application model.
  • the attack detection method provided in this embodiment further includes the following step S330.
  • the attack detection model D31 performs protection processing on the application model D32.
  • the attack detection model D31 performs protection processing on the application model D32, including many possible implementation manners, and this embodiment provides the following two possible situations.
  • the attack detection model D31 blocks the application model D32 from processing the inference request.
  • specifically, the attack detection model determines whether the input samples contained in the inference request are physical adversarial samples and feeds the result back to the corresponding access control mechanism; if a physical adversarial attack is found, it is blocked.
  • the attack detection model D31 sets the processing result output by the application model D32 as an invalid result. Since the processing result output by the application model D32 is set as an invalid result, the AI processing performed by the application model D32 on physical adversarial samples produces no valid output, and the attacker cannot use the physical adversarial samples contained in the inference request to obtain model parameters or other data, increasing the security of the application model D32.
  • the attack detection model D31 discards the above inference request. Since the attack detection model D31 determines that the inference request contains physical adversarial samples and discards the inference request, the application model D32 cannot obtain the physical adversarial samples used by the attacker for physical adversarial attacks; that is, the application model D32 has no input and will not process the physical adversarial samples, and the attacker cannot obtain the model parameters or other data of the application model D32, which improves the security of the application model D32. A sketch of both options follows.
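  • Both protection options can be sketched as follows; the mode names and the return structure are illustrative assumptions:

      def protect_application_model(application_model, inference_request, mode="invalidate"):
          if mode == "invalidate":
              # Option 1: the application model may run, but its processing result is
              # set to an invalid result, so the attacker gets no valid output.
              _ = application_model(inference_request)
              return {"status": "invalid", "result": None}
          # Option 2: discard the inference request; the application model receives
          # no input at all and never processes the physical adversarial samples.
          return None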
  • since the protected application model does not need to be modified in any way yet is still defended against physical adversarial attacks by the attack detection model, the accuracy of the application model is not reduced by reconstruction.
  • the attack detection model D31 records an alarm log, and the alarm log is used to indicate that the inference request includes a physical adversarial sample.
  • the attack detection model D31 can also send the alarm log to the AI application gateway D30, so that the AI application gateway D30 records the alarm log for subsequent protection of the protected application model. For example, if within a certain period of time the AI application gateway D30 records multiple inference requests sent by the same client, and those inference requests all contain physical adversarial samples, the AI application gateway D30 can identify the client as an attacker and block other subsequent requests sent by the client, so as to protect the application model D32, as sketched below.
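  • A sketch of the gateway-side alarm handling described above; the alarm threshold and all identifiers are illustrative assumptions:

      from collections import Counter

      class GatewayGuard:
          def __init__(self, max_alarms_per_client=3):  # assumed blocking threshold
              self.alarms = Counter()
              self.blocked = set()
              self.max_alarms = max_alarms_per_client

          def record_alarm(self, client_id):
              # Record an alarm log: the client's inference request carried adversarial samples.
              self.alarms[client_id] += 1
              if self.alarms[client_id] >= self.max_alarms:
                  self.blocked.add(client_id)  # identify the client as an attacker

          def allow(self, client_id):
              # Refuse subsequent requests from clients identified as attackers.
              return client_id not in self.blocked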
  • due to the particularity of the application model D32, within a certain period of time a small number of physical adversarial samples in an inference request will not affect the application model D32; that is, a small number of physical adversarial samples in a single inference request will generally not be judged as a physical adversarial attack on the application model D32. The security of the application model D32 is affected only when the application model D32 is continuously attacked by physical adversarial samples in multiple inference requests.
  • situations in which the application model D32 is continuously attacked by physical adversarial samples in multiple inference requests may include, but are not limited to: model extraction, membership inference, and model inversion.
  • model extraction refers to the behavior in which, when the server deploys the model parameters of the application model in the cloud and provides them to users, an attacker uses different samples to repeatedly poll the application model for its model parameters.
  • membership inference refers to the behavior in which an attacker uses samples (including physical adversarial samples and standard samples) to repeatedly poll the application model in order to determine whether a certain sample or a certain group is in the training sample set of the application model.
  • model inversion refers to the behavior in which an attacker uses samples (including physical adversarial samples and standard samples) to repeatedly poll the application model and, based on the polling results, constructs an equivalent model of the application model or reversely constructs its training sample set.
  • the attack detection model records the inference requests that carry physical adversarial samples, so that the attack detection model achieves security situation awareness of physical adversarial attacks.
  • the security situation refers to the security status of the application model within a certain period of time; in this application, it refers to whether the application model is under physical adversarial attack by an attacker.
  • the attack detection model thwarts the above-mentioned behaviors such as model extraction, membership inference, and model inversion, avoiding leakage of the model parameters and training sample set of the application model, which improves the security of the application model.
  • the AI application gateway can identify an event in which a small number of physical adversarial samples appear in one inference request as an attacker's tentative attack, or identify an event in which a large number of physical adversarial samples appear in multiple inference requests as the attacker's physical adversarial attack flow.
  • after the AI application gateway and the attack detection model identify the above-mentioned tentative attack or physical adversarial attack flow, they can mark the client sending the inference request as an attacker and refuse to receive other requests from that client, so as to protect the above-mentioned application model and improve its security.
  • this embodiment uses an attack detection model different from the application model to detect whether there are physical adversarial samples in the inference request. Since the application model does not need to resist physical adversarial attacks itself, there is no need to perform model retraining or defensive distillation on the application model, which avoids reducing the accuracy of the application model by reconstructing it. In addition, in some possible situations, the attack detection model can also use the detected physical adversarial samples for training, to optimize its attack detection capability and improve the accuracy of physical adversarial attack detection.
  • since the model manufacturer of the application model does not need to consider the threat of physical adversarial attacks faced by the application model, it only needs to deploy its application model on a computing power platform that provides the attack detection model in order to obtain perception (detection) of, and defense against, physical adversarial attacks for the application model.
  • this application also provides a possible specific implementation, as shown in Figure 5, which is the third schematic flow diagram of an attack detection method provided in this application; the attack detection model D31 and the application model D32 are deployed in parallel, and the application model D32 is also connected to the database D33.
  • the database D33 stores a plurality of comparison features, and the comparison features are used to support the AI processing function of the application model D32.
  • the AI processing process may be, but not limited to: object recognition, target detection, object classification, and the like.
  • the comparison features may be biometric features, which can be, but are not limited to, face, fingerprint, body, and iris information.
  • the attack detection method provided in this embodiment includes the following steps.
  • the AI application gateway D30 obtains the reasoning request.
  • the AI application gateway D30 sends an inference request to the attack detection model D31.
  • the AI application gateway D30 sends an inference request to the application model D32.
  • when the AI application gateway receives an inference request (biometric input information, such as a video or photo) from a customer (terminal or client), at the same time as sending the inference request to the application model it also forwards the inference request, or a sampling of it (for example, one frame captured from the video every second, to control the computing power needed for attack detection, as sketched below), to the attack detection model.
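The sampling step can be illustrated with a minimal sketch. The function below keeps roughly one frame per second of video; the frame list, the fps value, and the function name are illustrative assumptions rather than details taken from this application.

```python
# Minimal sketch: gateway-side sampling of a video carried in an inference
# request, so attack detection does not have to analyze every frame.

def sample_frames(frames, fps):
    """Keep about one frame per second of video (every fps-th frame)."""
    step = max(1, int(round(fps)))
    return frames[::step]

if __name__ == "__main__":
    video = [f"frame_{i}" for i in range(120)]   # 4 s of synthetic 30 fps video
    print(sample_frames(video, fps=30))          # ['frame_0', 'frame_30', 'frame_60', 'frame_90']
```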
  • the application model D32 can process the four samples carried in the inference request using the multiple comparison features (feature 1 to feature 3) stored in the database D33, determining that the inference request contains two samples with feature 1 and one sample with feature 2.
  • while the application model D32 performs AI processing on the inference request, the attack detection model D31 can also carry out the physical adversarial attack detection process, as shown in S530 below.
  • S530: the attack detection model D31 performs physical adversarial attack detection on the multiple samples contained in the inference request.
  • the attack detection model D31 uses its pre-trained feature detection module D311 to run inference analysis on the photos (or photo sequences) forwarded by the AI application gateway D30, judging whether they contain physical adversarial perturbations and giving the confidence or probability of the judgment.
  • for the specific implementation and beneficial effects of S530, reference may be made to the content of S320 above, which is not repeated here.
  • when the feature detection module judges with high confidence that the inference request contains a physical adversarial perturbation/attack, the attack detection model D31 can generate and output alarm information, as shown in S541 in FIG. 5.
  • S541: the attack detection model D31 sends alarm information to the AI application gateway D30.
  • the alarm information indicates that the inference request contains a physical adversarial perturbation/attack.
  • in some cases, the alarm information is recorded in the form of a log (see the sketch below).
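As a minimal sketch of how such alarm information might be recorded as a log, the snippet below emits one structured log entry per detected physical adversarial perturbation; the field names (request_id, client, confidence) are illustrative assumptions, not names taken from this application.

```python
# Minimal sketch: alarm information for a flagged inference request, recorded
# as a structured log entry the AI application gateway can act on.
import json
import logging
import time

logging.basicConfig(level=logging.WARNING, format="%(message)s")
logger = logging.getLogger("attack_detection")

def emit_alarm(request_id, client, confidence):
    """Record that an inference request was judged to contain a physical
    adversarial perturbation."""
    entry = {
        "time": time.time(),
        "event": "physical_adversarial_attack",
        "request_id": request_id,
        "client": client,
        "confidence": confidence,
    }
    logger.warning(json.dumps(entry))
    return entry

if __name__ == "__main__":
    emit_alarm(request_id="req-0001", client="terminal-111", confidence=0.65)
```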
  • S542: the AI application gateway D30 performs protection processing on the application model D32 according to the alarm information.
  • for example, the AI application gateway D30 and the attack detection model D31 can intercept the biometric recognition result of the application model D32, preventing attackers from using physical adversarial samples to bypass biometric authentication and increasing the security of the application model D32.
  • this embodiment can provide a "model-independent" physical adversarial attack detection mechanism: it detects whether inference requests contain physical adversarial samples in biometric recognition scenarios, provides security situation awareness while the application model is running, and raises alarms against physical adversarial attacks.
  • by way of example: first, the server generates (and labels) various physical adversarial samples for biometric recognition and uses a data-driven method to train a corresponding DNN-based attack detection model.
  • second, the server can deploy the attack detection engine that integrates the attack detection model into the corresponding AI application environment in a manner independent of the (protected AI) model, so that the attack detection engine and the application model receive the application's inference requests (input samples) simultaneously.
  • finally, during the application model's AI processing, the attack detection model can concurrently judge whether the inference request contains physical adversarial samples (perturbations), realizing detection of physical adversarial attacks against biometric recognition (see the sketch below).
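This parallel dispatch can be illustrated with a minimal sketch: both the application model and the attack detection model receive the same input samples, and the recognition result is invalidated when the detector flags the request. Both model functions are toy stand-ins for real DNN inference, and the function names are illustrative assumptions.

```python
# Minimal sketch: the attack detection engine and the protected application
# model both receive the same inference request; detection runs alongside
# normal AI processing, and a flagged request yields no valid output.
from concurrent.futures import ThreadPoolExecutor

def application_model(samples):
    return [f"recognized:{s}" for s in samples]     # stand-in for biometric recognition

def attack_detection_model(samples):
    return any("adv" in s for s in samples)         # stand-in for perturbation detection

def handle_inference_request(samples):
    with ThreadPoolExecutor(max_workers=2) as pool:
        result = pool.submit(application_model, samples)
        is_attack = pool.submit(attack_detection_model, samples)
        if is_attack.result():
            return None                              # invalidate the recognition result
        return result.result()

if __name__ == "__main__":
    print(handle_inference_request(["face_1", "face_2"]))     # normal request
    print(handle_inference_request(["face_1", "adv_patch"]))  # blocked request -> None
```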
  • it can be understood that, to implement the functions in the above embodiments, the server and the terminal device include hardware structures and/or software modules corresponding to each function.
  • those skilled in the art will readily appreciate that, with reference to the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
  • FIG. 6 and FIG. 7 are schematic structural diagrams of a possible attack detection apparatus and a possible physical device provided by embodiments of this application.
  • the attack detection apparatus and the physical device can be used to implement the functions of the server 120 and of any terminal in the above method embodiments, and can therefore also achieve the beneficial effects of those method embodiments.
  • in embodiments of this application, the attack detection apparatus 600 may be the server 120 shown in FIG. 1, a module (such as a chip) applied to the server 120, or any terminal shown in FIG. 1.
  • as shown in FIG. 6, the attack detection apparatus 600 includes a communication unit 610, a detection unit 620, a protection unit 630, an alarm unit 640, a storage unit 650, and a training unit 660.
  • the attack detection apparatus 600 is configured to implement the functions of the method embodiments shown in FIG. 2 to FIG. 5.
  • when the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 2: the communication unit 610 is used to perform S210, and the training unit 660 is used to perform S220 and S230.
  • when the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 3: the communication unit 610 is used to perform S310, the detection unit 620 is used to perform S320, and the protection unit 630 is used to perform S330.
  • when the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 4: the communication unit 610 is used to perform S310, the detection unit 620 is used to perform S321 to S323, and the protection unit 630 is used to perform S330.
  • when the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 5: the communication unit 610 is used to perform S510, S521, and S522; the detection unit 620 is used to perform S530; and the alarm unit 640 is used to perform S541 and S542.
  • in addition, the storage unit 650 may be used to store the above inference requests, the physical adversarial samples identified by the attack detection model D31, and the like.
  • FIG. 7 is a schematic structural diagram of a physical device provided in this application; the physical device 700 includes a processor 710 and an interface circuit 720.
  • the physical device 700 may be the aforementioned server, terminal, or another computing device.
  • the processor 710 and the interface circuit 720 are coupled to each other. It can be understood that the interface circuit 720 may be a transceiver or an input/output interface.
  • optionally, the physical device 700 may further include a memory 730 for storing the instructions executed by the processor 710, the input data the processor 710 needs to run the instructions, or the data generated after the processor 710 runs the instructions.
  • when the physical device 700 is used to implement the methods shown in FIG. 2 to FIG. 5, the processor 710 and the interface circuit 720 perform the functions of the communication unit 610, the detection unit 620, the protection unit 630, the alarm unit 640, the storage unit 650, and the training unit 660.
  • the processor 710, the interface circuit 720, and the memory 730 can also cooperate to implement the operation steps of the attack detection method.
  • the physical device 700 may also perform the functions of the attack detection apparatus 600 shown in FIG. 6, which are not detailed here.
  • the embodiments of this application do not limit the specific connection medium among the interface circuit 720, the processor 710, and the memory 730.
  • in FIG. 7, the interface circuit 720, the processor 710, and the memory 730 are connected through a bus 740, which is represented by a thick line; the connection modes between other components are only schematically illustrated and are not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on; for ease of representation, only one thick line is used in FIG. 7, but this does not mean there is only one bus or one type of bus.
  • the memory 730 can be used to store software programs and modules, such as the program instructions/modules corresponding to the attack detection method provided in the embodiments of this application.
  • the processor 710 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 730.
  • the interface circuit 720 can be used for signaling or data communication with other devices; in this application, the physical device 700 may have multiple interface circuits 720.
  • it can be understood that the processor in the embodiments of this application may be a CPU, a neural processing unit (NPU), or a graphics processing unit (GPU), and may also be another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a general-purpose processor may be a microprocessor or any conventional processor.
  • the method steps in the embodiments of this application may be implemented by hardware, or by a processor executing software instructions.
  • software instructions may be composed of corresponding software modules, and the software modules may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • an exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • of course, the storage medium may also be a component of the processor.
  • the processor and the storage medium may be located in an ASIC.
  • in addition, the ASIC may be located in a network device or a terminal device.
  • of course, the processor and the storage medium may also exist in the network device or the terminal device as discrete components.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product comprises one or more computer programs or instructions; when the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are executed in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus.
  • the computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium, for example a floppy disk, hard disk, or magnetic tape; an optical medium, for example a digital video disc (DVD); or a semiconductor medium, for example a solid state drive (SSD).
  • "at least one" means one or more, and "multiple" means two or more.
  • "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B may be singular or plural.
  • the character "/" generally indicates that the contextual objects are in an "or" relationship; in the formulas of this application, the character "/" indicates that the contextual objects are in a "division" relationship.
  • the singular forms "a", "an", and "the" do not mean "one or only one" but "one or more" unless the context clearly dictates otherwise; for example, "a device" means one or more such devices.
  • "at least one of ..." means one of, or any combination of, the subsequent associated objects; for example, "at least one of A, B and C" includes A, B, C, AB, AC, BC, or ABC.


Abstract

Disclosed are an attack detection method and apparatus, relating to the field of AI, which solve the problem that static defense methods reconstruct an application model and thereby reduce the accuracy with which the application model processes samples. The attack detection method includes: first, an attack detection model obtains an inference request, the inference request carrying a to-be-processed data set of an application model, the to-be-processed data set including one or more samples; second, the attack detection model detects whether physical adversarial samples exist in the to-be-processed data set; finally, if physical adversarial samples exist in the to-be-processed data set, the attack detection model performs protection processing on the application model. This embodiment uses an attack detection model different from the application model to detect whether the inference request contains physical adversarial samples; because the application model does not need to resist physical adversarial attacks, no model retraining or defensive distillation of the application model is required, avoiding the accuracy degradation caused by reconstructing the application model.

Description

Attack detection method and apparatus
This application claims priority to Chinese patent application No. 202110959827.5, entitled "Attack detection method, apparatus and system", filed with the China National Intellectual Property Administration on August 20, 2021, and priority to Chinese patent application No. 202111064335.6, entitled "Attack detection method and apparatus", filed with the China National Intellectual Property Administration on September 10, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence (AI), and in particular to an attack detection method and apparatus.
Background
Deep neural networks (DNN) are widely used in fields such as computer vision (CV), speech recognition, and natural language processing (NLP). During the use of a DNN-based application model, an attacker launches digital adversarial attacks or physical adversarial attacks against the application model in order to steal its parameter configuration or data. In a digital adversarial attack, the attacker can manipulate bit-level data to attack the application model; in a physical adversarial attack, the attacker constructs physical adversarial examples in the real physical world to attack the application model.
Taking the detection of physical adversarial attacks as an example, the application model uses a model-dependent static defense method, such as model retraining or defensive distillation, to detect physical adversarial attacks. Static defense methods rely on reconstructing the application model, which reduces the accuracy with which the application model processes samples. Therefore, how to detect physical adversarial attacks has become an urgent problem to be solved.
Summary
This application provides an attack detection method and apparatus, solving the problem that static defense methods reconstruct the application model and thereby reduce the accuracy with which the application model processes samples.
To achieve the above objective, this application adopts the following technical solutions.
According to a first aspect, embodiments of this application provide an attack detection method. The method may be applied to a terminal device, or to a server that can support a terminal device in implementing the method (for example, the server includes a chip system). The method includes: first, an attack detection model obtains an inference request, the inference request carrying a to-be-processed data set of an application model, the to-be-processed data set including one or more samples; second, the attack detection model detects whether physical adversarial samples exist in the to-be-processed data set; finally, if physical adversarial samples exist in the to-be-processed data set, the attack detection model performs protection processing on the application model.
This embodiment uses an attack detection model different from the application model to detect whether an inference request contains physical adversarial samples. Because the application model does not need to resist physical adversarial attacks itself, no model retraining or defensive distillation of the application model is required, which avoids the accuracy degradation caused by reconstructing the application model.
In addition, in some possible cases, the attack detection model can also be trained on the physical adversarial samples it detects, to optimize its attack detection capability and improve the accuracy of physical adversarial attack detection.
In an optional implementation, the attack detection model is determined based on a training data set that includes multiple physical adversarial samples and multiple standard samples for the application model. Generally, a computing device dedicated to model training and generation has stronger processing capability than a terminal; compared with training the attack detection model on a terminal, if the training process is performed by a computing device (such as a server), the training time is shorter and the training efficiency is higher. The terminal can use the attack detection model to determine whether an inference request contains physical adversarial samples and, if it does, perform protection processing on the application model. Because the application model does not need to resist physical adversarial attacks, the server does not need to perform model retraining or defensive distillation on the application model, avoiding the accuracy degradation caused by reconstructing it.
In another optional implementation, the attack detection model detecting whether physical adversarial samples exist in the to-be-processed data set includes: for each sample in the to-be-processed data set, the attack detection model outputs security information of the sample, the security information indicating the confidence that the sample contains a physical adversarial perturbation; and, if the confidence of a sample reaches a first threshold, the attack detection model identifies the sample as a physical adversarial sample targeting the application model. For example, if the to-be-processed data set includes only one sample and that sample is identified by the attack detection model as a physical adversarial sample, the attack detection model can treat the inference request as an attack request against the application model.
In another optional implementation, the security information of a sample is obtained by a feature detection module included in the attack detection model. In this embodiment, because the attack detection model can perform physical adversarial attack detection on the samples carried in the inference request, it takes over from the application model the task of resisting physical adversarial attacks, so that the attack detection model implements the physical adversarial attack detection function in place of the application model; therefore, the application model does not need to be reconstructed to resist physical adversarial attacks, avoiding the accuracy degradation caused by the reconstruction process in conventional techniques.
In another optional implementation, the attack detection model detecting whether physical adversarial samples exist in the to-be-processed data set further includes: the attack detection model outputs a detection result for the to-be-processed data set based on the security information of the multiple samples it includes. In this embodiment, if the to-be-processed data set includes multiple samples, the attack detection model can determine the detection result of the physical adversarial attack detection for the inference request based on the confidence that each of the samples contains a physical adversarial perturbation. This prevents the attack detection model from, by chance, wrongly judging that the to-be-processed data set contains a small number of physical adversarial samples and hence wrongly determining the inference request to be an attack request, improving the accuracy of physical adversarial attack detection.
In another optional implementation, the attack detection model outputting a detection result for the to-be-processed data set based on the security information of the multiple samples includes: the attack detection model stores the physical adversarial samples in a sequence detection module included in the attack detection model; and, when the number of physical adversarial samples among the multiple samples is greater than or equal to a first quantity, the sequence detection module determines the inference request to be an attack request. In this way, the sequence detection module identifies the inference request as an attack request only when it contains a certain number of physical adversarial samples, preventing the feature detection module from wrongly identifying the inference request as an attack request when it has wrongly identified a single or small number of physical adversarial samples, improving the identification accuracy of the attack detection model.
Optionally, when the application model and the attack detection model are deployed in series, if the inference request carries no physical adversarial samples, the attack detection model forwards the inference request to the application model so that the application model can perform AI processing on it. Because the attack detection model has already performed physical adversarial attack detection on the inference request before the application model receives it, the inference request obtained by the application model is guaranteed to carry no physical adversarial samples, which greatly reduces the probability of the application model being attacked and improves its security.
In another optional implementation, the attack detection model performing protection processing on the application model includes: the attack detection model blocking the application model from processing the inference request. For example, the protected application model needs no modification: the attack detection model determines whether the input samples contained in the inference request are physical adversarial samples, feeds back the result to the corresponding access control mechanism, and blocks the physical adversarial attack.
In a possible example, "blocking" means setting the output information of the application model to invalid; for example, the attack detection model blocking the application model from processing the inference request includes: the attack detection model setting the processing result output by the application model to an invalid result. Because the processing result output by the application model is set to an invalid result, the application model's AI processing of the physical adversarial samples produces no valid output, and the attacker cannot use the physical adversarial samples in the inference request to obtain the application model's parameters or other data, improving the security of the application model.
In another possible example, "blocking" means discarding the input information of the application model; for example, the attack detection model blocking the application model from processing the inference request includes: the attack detection model discarding the inference request. Because the attack detection model D31 determines that the inference request contains physical adversarial samples and discards it, the application model cannot obtain the physical adversarial samples used by the attacker for the physical adversarial attack; that is, the application model has no input and will not process the physical adversarial samples, so the attacker cannot obtain the application model's parameters or other data, improving the security of the application model.
In this way, because the protected application model needs no modification yet still receives the attack detection model's defense against physical adversarial attacks, the accuracy of the application model is not reduced by reconstruction, improving the accuracy of the application model.
In another optional implementation, the attack detection method further includes: the attack detection model records an alarm log, the alarm log indicating that the inference request includes physical adversarial samples. The alarm log can be used for subsequent protection of the protected application model. For example, within a certain period of time, if an inference request contains only a small number of physical adversarial samples, the application model is not affected; that is, a small number of physical adversarial samples in one inference request is generally not judged to be a physical adversarial attack on the application model, and the security of the application model is affected only when it is continuously attacked by physical adversarial samples in multiple inference requests. Thus the attack detection model can merely record the alarm log, so that the application model is protected only when the physical adversarial attack persists, preventing the application model from being disabled due to a misjudgment by the attack detection model and improving the availability of the application model.
According to a second aspect, embodiments of this application provide an attack detection apparatus, which includes modules for performing the attack detection method in the first aspect or any possible implementation of the first aspect. The apparatus may be implemented as software modules, or as hardware with the corresponding functions, such as a server or a terminal.
For beneficial effects, refer to the description of any aspect of the first aspect; details are not repeated here. The attack detection apparatus has the function of implementing the behavior in the method examples of any aspect of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software; for example, the attack detection apparatus is applied to a server, or is an apparatus that supports a server in implementing the above attack detection method.
The above hardware or software includes one or more modules corresponding to the above function. In a possible design, the attack detection apparatus is applied to an attack detection model and includes: a communication unit configured to obtain an inference request, the inference request carrying a to-be-processed data set of an application model, the to-be-processed data set including one or more samples; a detection unit configured to detect whether physical adversarial samples exist in the to-be-processed data set; and a protection unit configured to perform protection processing on the application model if physical adversarial samples exist in the to-be-processed data set.
In an optional implementation, the attack detection model is determined based on a training data set that includes multiple physical adversarial samples and multiple standard samples for the application model.
In another optional implementation, the detection unit is specifically configured to output, for each sample in the to-be-processed data set, security information of the sample, the security information indicating the confidence that the sample contains a physical adversarial perturbation; and the detection unit is specifically configured to identify the sample as a physical adversarial sample targeting the application model if the confidence of the sample reaches a first threshold.
In another optional implementation, the security information of a sample is obtained by a feature detection module included in the attack detection model.
In another optional implementation, the detection unit is further configured to output a detection result for the to-be-processed data set based on the security information of the multiple samples it includes.
In another optional implementation, the detection unit is specifically configured to store the physical adversarial samples in a sequence detection module included in the attack detection model, and to determine the inference request to be an attack request if the number of physical adversarial samples among the multiple samples is greater than or equal to a first quantity.
In another optional implementation, the protection unit is specifically configured to block the application model from processing the inference request; for example, to set the processing result output by the application model to an invalid result, or to discard the inference request.
In another optional implementation, the attack detection apparatus further includes an alarm unit configured to record an alarm log, the alarm log indicating that the inference request includes physical adversarial samples.
According to a third aspect, embodiments of this application provide an attack detection system, which includes the models for performing the attack detection method in the first aspect or any possible implementation of the first aspect. The system may be implemented as software modules, or as hardware with the corresponding functions, such as a first device and a second device, where for example the first device is a server and the second device is a terminal.
For beneficial effects, refer to the description of any aspect of the first aspect; details are not repeated here. The attack detection system has the function of implementing the behavior in the method examples of any aspect of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software; for example, the attack detection system is applied to a server, or is an apparatus that supports a server in implementing the above attack detection method.
The above hardware or software includes one or more modules corresponding to the above function. In a possible design, the attack detection system includes a first device and a second device, with an attack detection model deployed in the first device and an application model deployed in the second device. The first device obtains an inference request from a client, the inference request carrying a to-be-processed data set of the application model, the to-be-processed data set including one or more samples; the first device detects whether physical adversarial samples exist in the to-be-processed data set; if physical adversarial samples exist in the to-be-processed data set, the first device performs protection processing on the application model deployed in the second device.
It is worth noting that, in some possible examples, the first device and the second device are different processing units running in the same physical device, in which case the attack detection system refers to that physical device; the physical device may be a server, a mobile phone, a tablet computer, a personal computer, or another computing device with model processing capability.
For example, the physical device is a server, the first device is a first processor in the server, and the second device is a second processor in the server different from the first processor.
As another example, the physical device is a processor in a server, the first device is a first processing unit in the processor, and the second device is a second processing unit in the processor different from the first processing unit.
According to a fourth aspect, embodiments of this application provide a server that includes a processor and an interface circuit. The interface circuit is configured to receive signals from devices other than the server and transmit them to the processor, or to send signals from the processor to devices other than the server. The processor implements, through logic circuits or by executing code instructions, the operation steps of the method in the first aspect or any possible implementation of the first aspect.
According to a fifth aspect, embodiments of this application provide a terminal that includes a processor and an interface circuit. The interface circuit is configured to receive signals from other devices and transmit them to the processor, or to send signals from the processor to other devices. The processor implements, through logic circuits or by executing code instructions, the operation steps of the method in the first aspect or any possible implementation of the first aspect.
According to a sixth aspect, embodiments of this application provide a computer-readable storage medium that stores a computer program or instructions which, when executed by a processor in a server, implement the operation steps of the method in the first aspect or any possible implementation of the first aspect.
According to a seventh aspect, embodiments of this application provide a computer program product comprising instructions which, when the computer program product runs on a server or a terminal, cause the server or terminal to execute the instructions to implement the attack detection method in the first aspect or any possible implementation of the first aspect.
According to an eighth aspect, embodiments of this application provide a chip that includes a logic circuit and an interface circuit. The interface circuit is configured to input and/or output information; the logic circuit is configured to perform the above aspects or their possible implementations, processing the input information and/or generating the output information.
For example, the interface circuit may obtain an inference request, and the logic circuit may apply to the inference request the attack detection method described in the first aspect or its possible implementations.
On the basis of the implementations provided in the above aspects, this application may be further combined to provide more implementations.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an attack detection system provided in this application;
FIG. 2 is a schematic diagram of obtaining an attack detection model provided in this application;
FIG. 3 is a first schematic flowchart of an attack detection method provided in this application;
FIG. 4 is a second schematic flowchart of an attack detection method provided in this application;
FIG. 5 is a third schematic flowchart of an attack detection method provided in this application;
FIG. 6 is a schematic structural diagram of an attack detection apparatus provided in this application;
FIG. 7 is a schematic structural diagram of a physical device provided in this application.
Detailed Description
To solve the problem raised in the background, embodiments of this application provide an attack detection method. The method includes: first, an attack detection model obtains an inference request, the inference request carrying a to-be-processed data set of an application model, the to-be-processed data set including one or more samples; second, the attack detection model detects whether physical adversarial samples exist in the to-be-processed data set; finally, if physical adversarial samples exist in the to-be-processed data set, the attack detection model performs protection processing on the application model. This embodiment uses an attack detection model different from the application model to detect whether the inference request contains physical adversarial samples; because the application model does not need to resist physical adversarial attacks, no model retraining or defensive distillation of the application model is required, avoiding the accuracy degradation caused by reconstructing the application model.
In addition, in some possible cases, the attack detection model can also be trained on the detected physical adversarial samples, to optimize its attack detection capability and improve the accuracy of physical adversarial attack detection.
For clarity and brevity in the descriptions of the following embodiments, a brief introduction of the related technology is given first.
FIG. 1 is a schematic structural diagram of an attack detection system provided in this application. As shown in FIG. 1, the attack detection system includes a data center and multiple terminals (terminal 111 to terminal 113 as shown in FIG. 1). The data center can communicate with the terminals through a network, which may be the Internet or another network. The network may include one or more network devices, such as routers or switches.
The data center includes one or more servers, such as the server 120 shown in FIG. 1, for example an application server supporting application services; the application server may provide video services, image services, and other video- or image-based AI processing services. In an optional case, the server 120 refers to a server cluster in which multiple servers are deployed; the server cluster may have racks that establish communication among the servers through wired connections, such as a universal serial bus (USB) or a peripheral component interconnect express (PCIe) high-speed bus.
The server 120 can also obtain data from a terminal, perform AI processing on the data, and send the AI processing result to the corresponding terminal. The AI processing may refer to using an AI model to perform object recognition, target detection, and the like on the data, or to obtaining an AI model that meets requirements based on samples collected by the terminal.
In addition, the data center may include other physical devices with AI processing functions, such as mobile phones, tablet computers, or other devices.
A terminal may also be called a terminal device, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on. A terminal may be a mobile phone (terminal 111 shown in FIG. 1), a face-scanning payment device with mobile payment functions (terminal 112 shown in FIG. 1), a camera device with data (such as image or video) collection and processing functions (terminal 113 shown in FIG. 1), a tablet computer (Pad), a computer with wireless transceiver functions, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and so on. Embodiments of this application do not limit the specific technology and specific device form adopted by the terminal device.
It is worth noting that a user can obtain, through a terminal, the AI model stored by the server 120, and so on. For example, the AI model may be a model that performs operations such as target detection, object recognition, or classification on data.
For example, user 1 uses the AI model deployed in terminal 111 to implement functions such as face recognition and fingerprint recognition on a mobile phone.
As another example, user 2 uses the AI model deployed in terminal 112 to implement functions such as face-scanning payment and object classification (such as commodity classification).
As another example, the AI model deployed in terminal 113 can be used to implement functions such as object detection.
FIG. 1 is only a schematic diagram and should not be understood as limiting this application. Embodiments of this application do not limit the application scenarios of the terminals and the server.
On the basis of the attack detection system shown in FIG. 1, this application provides a training method for an attack detection model, as shown in FIG. 2, which is a schematic diagram of obtaining an attack detection model provided in this application. The training method may be performed by the server 120, or by any one of terminal 111 to terminal 113; in some possible examples, the training method may also be performed by another device.
Here the training method is described taking its execution by the server 120 as an example. Referring to FIG. 2, the process of obtaining the attack detection model may include the following steps S210 to S230.
S210: the server 120 obtains a training data set.
The training data set includes multiple physical adversarial samples and multiple standard samples for the application model.
A standard sample is a sample that meets the input requirements of the application model and contains no adversarial attack. In some cases, standard samples are called normal samples.
An adversarial attack refers to adding imperceptible noise to input data so that the application model makes a wrong judgment on the input data; the added noise is called an adversarial perturbation, and the sample obtained after adding the noise is called an adversarial example.
Adversarial attacks include physical adversarial attacks and digital adversarial attacks. A digital adversarial attack means a computer manipulates bit-level data changes in a sample, producing a digital adversarial sample with a digital adversarial perturbation that causes the application model to process the sample incorrectly. A physical adversarial attack means that, when a deep learning model is deployed in the real world, physical adversarial perturbations produced in the real world (for example, an occluding object in front of a face in an image that interferes with face recognition) are used to construct physical adversarial samples, causing the application model to process them incorrectly.
A physical adversarial sample is a sample with a physical adversarial perturbation constructed by an attacker in the real world. For example, the physical adversarial sample may be a sample in different scenarios, such as target detection, object recognition, object classification, or other possible scenarios such as face-scanning payment, video surveillance, and access control.
It is worth noting that, because the attack detection model is trained here and then used to determine physical adversarial attacks, during training the server 120 can label the physical adversarial samples as positive samples and the standard samples as negative samples.
To improve the attack detection capability of the attack detection model, the server 120 further marks the adversarial perturbations or attack patterns in the physical adversarial samples.
S220: the server 120 trains a first model with the training data set and judges whether the first model has converged.
The first model is determined based on the application scenario or field to which the application model applies.
In a possible example, the first model and the application model are neural network models of the same type. For example, if the application model is a convolutional neural network (CNN) model, the first model may also be a CNN model; as another example, if the application model is a recurrent neural network (RNN) model or a Transformer-based model, the first model may also be an RNN model or a Transformer-based model.
In another possible example, the first model and the application model are neural network models of different types; for example, if the application model is a CNN model, the first model may be an RNN model.
It is worth noting that the above two examples are only possible implementations provided by this embodiment to illustrate the first model and the application model; in some possible cases, the application model may include multiple types of networks, for example a composite neural network with both CNN and RNN, and the first model may include one or more types of neural networks.
The server 120 can set training (hyper)parameters such as the loss function and optimization method for the first model based on the application model, and train the first model with the above training data set. Training methods include, but are not limited to, taking an existing pre-trained model as the first model and fine-tuning it.
If the first model converges, S230 is performed; if not, the server 120 trains the first model multiple times with the training data set until it converges. Convergence of the first model may mean that the number of training iterations performed by the server 120 reaches a threshold (such as 50,000), or that the detection accuracy of the first model on physical adversarial samples and standard samples reaches a threshold (such as 99%); this application does not limit this.
S230: the server 120 takes the first model as the attack detection model.
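A minimal sketch of this convergence check follows, assuming a generic train_step() and evaluation routine; the 50,000-iteration and 99%-accuracy thresholds echo the examples above, and the model object is a placeholder rather than a concrete DNN.

```python
# Minimal sketch: train the first model until either the iteration budget or
# the target detection accuracy is reached, then adopt it as the detector.
import random

MAX_STEPS = 50_000
TARGET_ACCURACY = 0.99

def train_step(model, batch):
    # Placeholder for one optimization step on the labeled training data set.
    model["accuracy"] = min(1.0, model["accuracy"] + random.uniform(0.0, 1e-3))

def evaluate(model, val_set):
    # Placeholder for accuracy on physical adversarial + standard samples.
    return model["accuracy"]

def train_until_converged(model, train_set, val_set):
    for step in range(1, MAX_STEPS + 1):
        train_step(model, train_set)
        if evaluate(model, val_set) >= TARGET_ACCURACY:
            return model, step        # converged on accuracy
    return model, MAX_STEPS           # converged on iteration budget

if __name__ == "__main__":
    detector = {"accuracy": 0.5}
    trained, steps = train_until_converged(detector, train_set=None, val_set=None)
    print(f"stopped after {steps} steps, accuracy={trained['accuracy']:.3f}")
```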
It is worth noting that, in the method provided by embodiments of this application, the training of the attack detection model may be performed by the server or by a terminal; this application does not limit this.
In some possible examples, the attack detection model may also be trained by another computing device, which sends the trained attack detection model to the server or terminal provided by embodiments of this application. For example, the other computing device may include at least one processor, which may be an integrated circuit chip with signal processing capability. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. For example, when the other computing device has only one processor, that processor can implement the above S210 to S230 and their possible sub-steps; as another example, when the other computing device includes multiple processors, the processors can cooperate to implement S210 to S230 and their possible sub-steps; for example, the multiple processors include a first processor that implements S210 and S220 and a second processor that implements S230.
Generally, a computing device dedicated to model training and generation has stronger processing capability than a terminal; compared with training the attack detection model on a terminal, if the training process is performed by a computing device (such as the server 120 above), the training time is shorter and the training efficiency is higher. The terminal can use the attack detection model to determine whether an inference request contains physical adversarial samples and, if so, perform protection processing on the application model. Because the application model does not need to resist physical adversarial attacks, the server does not need to perform model retraining or defensive distillation on the application model, avoiding the accuracy degradation caused by reconstructing it.
In addition, in some possible attack detection processes, the attack detection model can also be trained on the detected physical adversarial samples to optimize its attack detection capability and improve the accuracy of attack detection.
For the training process shown in FIG. 2, the server can also train a feature detection module on physical adversarial samples based on a deep neural network; the feature detection module is part of the above attack detection model, and the server then combines the feature detection module with other modules to obtain the attack detection model.
For the process by which the server obtains the attack detection model, this embodiment provides a possible specific implementation.
First, the server uses the latest publicly available physical adversarial attack methods in the industry to generate corresponding physical adversarial samples (videos, photos) as positive samples under different environments (indoor, outdoor, different lighting, different faces, etc.); these physical adversarial samples can cause AI-based object recognition systems to misjudge.
Second, the server uses manual or automated methods to further mark the regions containing physical adversarial perturbations in the above physical adversarial samples (videos, photos).
In addition, the server obtains normal samples (videos, photos) of the corresponding objects under the same or similar environments (indoor, outdoor, different lighting, etc.) as negative samples.
Finally, using the labeled positive and negative samples above, the server selects an appropriate DNN architecture, sets training (hyper)parameters such as the loss function and optimization method, and trains the corresponding attack detection model in a data-driven manner on the labeled samples (the training data set). The attack detection model can detect physical adversarial perturbations in videos and pictures with high accuracy, distinguishing physical adversarial inputs (physical adversarial samples) from normal inputs (standard samples) and giving the confidence or probability corresponding to its judgment.
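A minimal sketch of assembling such a labeled training data set follows: physical adversarial samples are positives (optionally with a mask marking the perturbed region) and normal samples are negatives. The file paths, field names, and mask format are illustrative assumptions.

```python
# Minimal sketch: build labeled records for detector training, with
# adversarial samples as positives and normal samples as negatives.

def build_training_set(adversarial_items, normal_items):
    """adversarial_items: list of (sample_path, perturbation_mask) pairs;
    normal_items: list of sample paths. Returns labeled records."""
    records = []
    for path, mask in adversarial_items:
        records.append({"sample": path, "label": 1, "perturbation_mask": mask})
    for path in normal_items:
        records.append({"sample": path, "label": 0, "perturbation_mask": None})
    return records

if __name__ == "__main__":
    positives = [("outdoor/face_patch_01.jpg", "outdoor/face_patch_01_mask.png")]
    negatives = ["outdoor/face_clean_01.jpg"]
    for rec in build_training_set(positives, negatives):
        print(rec)
```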
On the basis of the attack detection model shown above, implementations of the attack detection method provided by this embodiment are described in detail below with reference to the drawings.
FIG. 3 is a first schematic flowchart of an attack detection method provided in this application. The attack detection method is performed by an attack detection model D31, which is used to detect physical adversarial attacks against an application model D32. For the training process of the attack detection model D31, refer to the related content of FIG. 2 above; it is not repeated here.
In a possible example, the attack detection model D31 and the application model D32 are deployed on the same physical device, such as any one of terminal 111 to terminal 113 shown in FIG. 1, the server 120, or another processing device.
In another possible example, the attack detection model D31 and the application model D32 are deployed on different physical devices; for example, the attack detection model D31 is deployed on terminal 111 and the application model D32 is deployed on the server 120 that communicates with terminal 111.
It is worth noting that, regardless of the specific form and location of the physical devices on which the attack detection model D31 and the application model D32 are deployed, the management of the two models can be implemented by an AI application gateway D30. The AI application gateway D30 may be a physical machine used to manage multiple AI models, or may be a virtual machine (VM) or a container, where a VM or container refers to a software process that implements the functions of the AI application gateway D30. When the AI application gateway D30 is a VM or a container, it can be deployed in the above terminal or server; this application does not limit this.
Here the description takes the case where the AI application gateway D30, the attack detection model D31, and the application model D32 are deployed on one physical device as an example. As shown in FIG. 3, the attack detection method provided by this embodiment includes the following steps.
S310: the attack detection model D31 obtains an inference request.
It is worth noting that the inference request may be generated by the terminal on which the attack detection model D31 is deployed, or sent by another device that communicates with that terminal.
The inference request carries a to-be-processed data set of the application model D32, and the to-be-processed data set includes one or more samples.
The above samples refer to the data the application model D32 needs to process, such as pictures, videos, audio, or text.
For example, when the application model D32 is a target detection model, a sample may be a picture or video to be detected.
As another example, when the application model D32 is an object recognition model, a sample may be a picture or video to be recognized.
As another example, when the application model D32 is a data analysis model, a sample may be audio or text to be analyzed.
The above examples are only examples given by this embodiment to illustrate the inference request and should not be understood as limiting this application.
In a possible case, the inference request also carries an indication of the type of AI processing required, such as the object recognition or target detection mentioned above.
In addition, as shown in FIG. 3, the inference request obtained by the attack detection model D31 may be forwarded by the AI application gateway D30.
In a first possible example, the attack detection model D31 and the application model D32 are deployed in parallel, in which case the AI application gateway D30 also forwards the inference request to the application model D32.
In a second possible example, the attack detection model D31 and the application model D32 are deployed in series, in which case the attack detection model D31 forwards the inference request to the application model D32 only after detecting that it carries no physical adversarial samples.
When the attack detection model D31 and the application model D32 are deployed in series, because the attack detection model D31 has already performed physical adversarial attack detection on the inference request before the application model D32 receives it, the inference request obtained by the application model D32 is guaranteed to carry no physical adversarial samples, which greatly reduces the probability of the application model D32 being attacked and improves its security.
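A minimal sketch of this serial deployment follows: the inference request reaches the attack detection model first and is forwarded to the application model only when no physical adversarial sample is found. The two model functions are toy stand-ins for real inference, and the return shapes are illustrative assumptions.

```python
# Minimal sketch: serial deployment, where detection gates the application
# model's input so it never sees flagged requests.

def attack_detection_model(samples):
    return [s for s in samples if "adv" in s]       # stand-in: flagged samples

def application_model(samples):
    return [f"processed:{s}" for s in samples]      # stand-in for AI processing

def gateway_serial(samples):
    flagged = attack_detection_model(samples)
    if flagged:
        return {"status": "blocked", "adversarial": flagged}
    return {"status": "ok", "result": application_model(samples)}

if __name__ == "__main__":
    print(gateway_serial(["img_1", "img_2"]))       # forwarded and processed
    print(gateway_serial(["img_1", "adv_img"]))     # blocked before the model
```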
S320: the attack detection model D31 detects whether physical adversarial samples exist in the to-be-processed data set.
For the content of physical adversarial samples, refer to the related explanation of S210 above; it is not repeated here.
In an optional implementation, the above S320 includes the following: for each sample in the to-be-processed data set, the attack detection model D31 outputs security information of the sample, the security information indicating the confidence that the sample contains a physical adversarial perturbation; if the confidence of a sample reaches a first threshold, the attack detection model D31 identifies the sample as a physical adversarial sample targeting the application model D32.
The confidence reaching the first threshold means the confidence is greater than or equal to the first threshold, and the first threshold can be set for different application scenarios. For example, in a face-scanning payment scenario, the first threshold is 40%; as another example, in an access control scenario, the first threshold is 60%.
Assuming the to-be-processed data set includes sample 1 to sample 4, Table 1 below gives a possible case of the confidences indicated by the above security information.
Table 1

  Sample     Confidence (%)
  Sample 1   12
  Sample 2   13
  Sample 3   65
  Sample 4   5

That is, the confidence that sample 1 contains a physical adversarial perturbation is 12%, the confidence that sample 2 does is 13%, the confidence that sample 3 does is 65%, and the confidence that sample 4 does is 5%.
Assuming the above first threshold is 50%, only sample 3 among samples 1 to 4 reaches the first threshold, so the attack detection model D31 takes sample 3 as a physical adversarial sample.
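A minimal sketch of this per-sample decision follows, reusing the confidence values of Table 1 with the assumed 50% first threshold; any sample whose confidence of containing a physical adversarial perturbation reaches the threshold is identified as a physical adversarial sample.

```python
# Minimal sketch: flag samples whose adversarial-perturbation confidence
# reaches the first threshold.

FIRST_THRESHOLD = 0.50

def flag_adversarial(confidences, threshold=FIRST_THRESHOLD):
    """confidences: mapping of sample id -> confidence that the sample
    contains a physical adversarial perturbation."""
    return [sid for sid, c in confidences.items() if c >= threshold]

if __name__ == "__main__":
    table_1 = {"sample_1": 0.12, "sample_2": 0.13, "sample_3": 0.65, "sample_4": 0.05}
    print(flag_adversarial(table_1))   # ['sample_3']
```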
In addition, Table 1 is only an example provided by this embodiment; the to-be-processed data set may also include only one sample, such as sample 3 shown in Table 1. When sample 3 is identified by the attack detection model D31 as a physical adversarial sample, the attack detection model D31 can treat the inference request as an attack request against the application model D32.
In this embodiment, because the attack detection model can perform physical adversarial attack detection on the samples carried in the inference request, it takes over from the application model the task of resisting physical adversarial attacks, so that the attack detection model implements the physical adversarial attack detection function in place of the application model; therefore, the application model does not need to be reconstructed to resist physical adversarial attacks, avoiding the accuracy degradation caused by the reconstruction process in conventional techniques.
Optionally, if the to-be-processed data set includes multiple samples, the above S320 may further include: the attack detection model D31 outputs a detection result for the to-be-processed data set based on the security information of the multiple samples it includes.
For example, if the multiple samples in the to-be-processed data set include a first quantity (such as 10) of physical adversarial samples, the attack detection model D31 determines that the to-be-processed data set was provided by an attacker and treats the inference request as an attack request.
In this embodiment, if the to-be-processed data set includes multiple samples, the attack detection model can determine the detection result of the physical adversarial attack detection for the inference request based on the confidence that each of the samples contains a physical adversarial perturbation, preventing the attack detection model from, by chance, wrongly judging that the data set contains a small number of physical adversarial samples and hence determining the inference request to be an attack request, improving the accuracy of the attack detection model's physical adversarial attack detection.
To complete the detection of the samples carried in the inference request, on the basis of FIG. 3 this embodiment provides an optional implementation, as shown in FIG. 4, a second schematic flowchart of an attack detection method provided in this application; the attack detection model D31 includes a feature detection module D311 and a sequence detection module D312.
The feature detection module D311 is used to recognize and detect the features included in a sample. In some possible examples, the feature detection module D311 can also recognize and mark the attack patterns of the physical adversarial perturbations included in a sample.
The sequence detection module D312 is used to cache one or more physical adversarial samples.
For more functions of the feature detection module D311 and the sequence detection module D312, see the following description of FIG. 4.
As shown in FIG. 4, the above S320 may include the following steps S321 to S323.
S321: for each sample in the to-be-processed data set, the feature detection module D311 outputs security information of the sample.
If the confidence of a sample reaches the first threshold, the feature detection module D311 identifies the sample as a physical adversarial sample targeting the application model D32.
S322: the feature detection module D311 stores the physical adversarial samples in the sequence detection module D312 included in the attack detection model D31.
The sequence detection module includes vector sequences formed from the detection results corresponding to the successive inference requests of a user or group of users. As shown in FIG. 4, the sequence detection module D312 can store multiple sequences (sequence 1 and sequence 2), and each sequence may be produced by a different user.
For example, if the application model D32 is an AI model for detecting faces in a video surveillance scenario, an inference request may include a video segment in which each frame of image is a sample of the inference request, and each sample may show the face of the same or a different user. During a physical adversarial attack, the attacker can put multiple physical adversarial samples produced by different users into one inference request. As shown in FIG. 4, sequence 1 stored by the sequence detection module D312 consists of multiple physical adversarial samples produced by user 1 (sample 1 and sample 3), and sequence 2 consists of the physical adversarial samples produced by user 2 (sample 2).
The sequence detection module D312 can analyze the above detection-result sequence using statistical analysis, threshold judgment, or similar methods, to judge whether the sequence / the user's input contains physical adversarial perturbations or whether a physical adversarial attack is in progress. Continuing with FIG. 4, the above S320 may further include the following step S323.
S323: if the number of physical adversarial samples among the multiple samples is greater than or equal to the first quantity, the sequence detection module D312 determines the inference request to be an attack request.
In this embodiment, the sequence detection module identifies the inference request as an attack request only when it contains a certain number of physical adversarial samples, preventing the feature detection module from wrongly identifying the inference request as an attack request when it has wrongly identified a single or small number of physical adversarial samples, improving the identification accuracy of the attack detection model.
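A minimal sketch of the sequence detection module of S322/S323 follows: identified adversarial samples are cached per user as a sequence, and a request is judged to be an attack request once the cached count reaches the first quantity. The class layout is an illustrative assumption; the first quantity of 10 echoes the example above, and a smaller value is used in the demo.

```python
# Minimal sketch: per-user sequences of cached adversarial samples, with an
# attack-request decision once the first quantity is reached.
from collections import defaultdict

FIRST_QUANTITY = 10

class SequenceDetector:
    def __init__(self, first_quantity=FIRST_QUANTITY):
        self.first_quantity = first_quantity
        self.sequences = defaultdict(list)   # user id -> cached adversarial samples

    def add(self, user, sample):
        self.sequences[user].append(sample)

    def is_attack_request(self, user):
        return len(self.sequences[user]) >= self.first_quantity

if __name__ == "__main__":
    det = SequenceDetector(first_quantity=2)
    det.add("user_1", "adv_frame_1")
    det.add("user_1", "adv_frame_3")
    det.add("user_2", "adv_frame_2")
    print(det.is_attack_request("user_1"))   # True
    print(det.is_attack_request("user_2"))   # False
```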
Optionally, when the application model D32 and the attack detection model D31 are deployed in series, if the inference request carries no physical adversarial samples, the attack detection model D31 forwards the inference request to the application model D32 so that the application model D32 can perform AI processing on it.
Because the attack detection model has already performed physical adversarial attack detection on the inference request before the application model receives it, the inference request obtained by the application model is guaranteed to carry no physical adversarial samples, which greatly reduces the probability of the application model being attacked and improves its security.
Continuing with FIG. 3, the attack detection method provided by this embodiment further includes the following step S330.
S330: if physical adversarial samples exist in the to-be-processed data set, the attack detection model D31 performs protection processing on the application model D32.
The protection processing performed by the attack detection model D31 on the application model D32 includes multiple possible implementations; this embodiment gives the following two possible cases.
In the first case, the attack detection model D31 blocks the application model D32 from processing the inference request.
In this embodiment, the protected application model needs no modification: the attack detection model determines whether the input samples contained in the inference request are physical adversarial samples and feeds back the result to the corresponding access control mechanism, and the AI application gateway blocks the physical adversarial attack.
When the attack detection model D31 and the application model D32 are deployed in parallel, the attack detection model D31 sets the processing result output by the application model D32 to an invalid result. Because the processing result output by the application model D32 is set to an invalid result, the application model D32's AI processing of the physical adversarial samples produces no valid output, and the attacker cannot use the physical adversarial samples in the inference request to obtain the model parameters or other data of the application model D32, improving its security.
When the attack detection model D31 and the application model D32 are deployed in series, the attack detection model D31 discards the above inference request. Because the attack detection model D31 determines that the inference request contains physical adversarial samples and discards it, the application model D32 cannot obtain the physical adversarial samples used by the attacker; that is, the application model D32 has no input and will not process the physical adversarial samples, so the attacker cannot obtain the model parameters or other data of the application model D32, improving its security.
In other words, in this embodiment, because the protected application model needs no modification yet still receives the attack detection model's defense against physical adversarial attacks, the accuracy of the application model is not reduced by reconstruction, improving the accuracy of the application model.
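A minimal sketch of the two blocking behaviors follows: in a parallel deployment the application model's output is invalidated, while in a serial deployment the inference request is dropped before the application model sees it. The deployment flag and return shapes are illustrative assumptions.

```python
# Minimal sketch: the two blocking behaviors applied once a request is
# judged to contain physical adversarial samples.

def protect(deployment, app_output=None):
    if deployment == "parallel":
        # The application model already ran; its output is invalidated.
        return {"result": None, "reason": "output invalidated"}
    if deployment == "serial":
        # The request is dropped, so the application model never runs.
        return {"result": None, "reason": "request dropped"}
    raise ValueError(f"unknown deployment mode: {deployment}")

if __name__ == "__main__":
    print(protect("parallel", app_output={"identity": "user_1"}))
    print(protect("serial"))
```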
In the second case, the attack detection model D31 records an alarm log, which indicates that the inference request includes physical adversarial samples.
As shown in FIG. 3, if multiple AI models are all managed by the AI application gateway D30, the attack detection model D31 can also send the alarm log to the AI application gateway D30 for recording; the alarm log can be used for subsequent protection of the protected application model. For example, if within a certain period of time the AI application gateway D30 has recorded multiple inference requests sent by the same client and all of them contain physical adversarial samples, the AI application gateway D30 can mark that client as an attacker and block its subsequent requests, achieving the purpose of protecting the application model D32.
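A minimal sketch of this gateway-side behavior follows: alarm log entries are accumulated per client, and a client whose inference requests repeatedly contain physical adversarial samples within a time window is marked as an attacker and its later requests are refused. The window length and alarm count are illustrative assumptions.

```python
# Minimal sketch: sliding-window accumulation of alarms per client, with
# blocking once the count within the window reaches a limit.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 600
MAX_ALARMS_IN_WINDOW = 3

class GatewayGuard:
    def __init__(self):
        self.alarms = defaultdict(deque)   # client -> alarm timestamps
        self.blocked = set()

    def record_alarm(self, client, now=None):
        now = time.time() if now is None else now
        q = self.alarms[client]
        q.append(now)
        while q and now - q[0] > WINDOW_SECONDS:
            q.popleft()                    # drop alarms outside the window
        if len(q) >= MAX_ALARMS_IN_WINDOW:
            self.blocked.add(client)       # mark the client as an attacker

    def allow(self, client):
        return client not in self.blocked

if __name__ == "__main__":
    guard = GatewayGuard()
    for t in (0, 10, 20):                  # three alarms within the window
        guard.record_alarm("client_a", now=t)
    print(guard.allow("client_a"))         # False: further requests refused
    print(guard.allow("client_b"))         # True
```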
Due to the particularity of the application model D32, within a certain period of time, if an inference request contains only a small number of physical adversarial samples, the application model D32 is not affected; that is, a small number of physical adversarial samples in one inference request is generally not judged to be a physical adversarial attack on the application model D32, and the security of the application model D32 is affected only when it is continuously attacked by physical adversarial samples in multiple inference requests.
Cases in which the application model D32 is continuously attacked by physical adversarial samples in multiple inference requests include, but are not limited to: model extraction, membership inference, and model inversion.
Model extraction refers to the behavior in which, when the server places the model parameters of the application model in the cloud and the cloud provides the model to users, an attacker polls the application model's parameters with different samples.
Membership inference refers to the behavior in which an attacker repeatedly polls the application model with samples (including physical adversarial samples and standard samples) to judge whether a certain sample or group is in the application model's training sample set.
Model inversion refers to the behavior in which an attacker repeatedly polls the application model with samples (including physical adversarial samples and standard samples) and, based on the polling results, constructs an equivalent model of the application model or reversely constructs the application model's training sample set.
In this embodiment, the attack detection model records the inference requests that carry physical adversarial samples, giving the attack detection model security situation awareness of physical adversarial attacks. The security situation refers to the security state of the application model over a period of time; in this application, it refers to whether the application model is under physical adversarial attack by an attacker. In this way, the attack detection model guards against behaviors such as model extraction, membership inference, and model inversion, preventing leakage of the application model's parameters and training sample set, which improves the security of the application model.
In some possible cases, the AI application gateway can identify an event in which a single inference request contains a small number of physical adversarial samples as a tentative attack by an attacker, or identify an event in which multiple inference requests each contain a large number of physical adversarial samples as an attacker's physical adversarial attack flow. After the AI application gateway and the attack detection model identify the above tentative attack or physical adversarial attack flow, they can mark the client sending the inference requests as an attacker and refuse other requests from that client, protecting the above application model and improving its security.
Thus, this embodiment uses an attack detection model different from the application model to detect whether inference requests contain physical adversarial samples. Because the application model does not need to resist physical adversarial attacks, no model retraining or defensive distillation is required, avoiding the accuracy degradation caused by reconstructing the application model. In addition, in some possible cases, the attack detection model can also be trained on the detected physical adversarial samples to optimize its attack detection capability and improve the accuracy of physical adversarial attack detection.
In addition, because the model vendor of the application model does not need to consider the threat of physical adversarial attacks faced by the application model, it only needs to deploy its application model onto a computing platform that provides the attack detection model to obtain perception (detection) of and defense against physical adversarial attacks for the application model.
For the attack detection method provided by the above embodiments, on the basis of the AI application gateway D30, the attack detection model D31, and the application model D32, this application further provides a possible specific implementation, as shown in FIG. 5, a third schematic flowchart of an attack detection method provided in this application: the attack detection model D31 and the application model D32 are deployed in parallel, and the application model D32 is also connected to a database D33.
The database D33 stores multiple comparison features, which are used to support the AI processing function of the application model D32. The AI processing may be, but is not limited to, object recognition, target detection, object classification, and the like.
Here the description takes biometric recognition as the AI processing; biometric features may be, but are not limited to, faces, fingerprints, body shape, eye iris, and similar information. As shown in FIG. 5, the attack detection method provided by this embodiment includes the following steps.
S510: the AI application gateway D30 obtains an inference request.
S521: the AI application gateway D30 sends the inference request to the attack detection model D31.
S522: the AI application gateway D30 sends the inference request to the application model D32.
When the AI application gateway receives an inference request (biometric input information, such as a video or photo) from a customer (terminal or client), at the same time as sending the inference request to the application model it also forwards the inference request, or a sampling of it (for example, one frame captured from the video every second, to control the computing power needed for attack detection), to the attack detection model.
For the specific implementation of S510 to S522, refer to the content of S310 above; it is not repeated here.
S523: the application model D32 performs AI processing on the inference request.
As shown in FIG. 5, the application model D32 can process the four samples carried in the inference request using the multiple comparison features (feature 1 to feature 3) stored in the database D33, determining that the inference request contains two samples with feature 1 and one sample with feature 2.
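A minimal sketch of this comparison step follows: each sample carried in the inference request is matched against the comparison features stored in the database, here via cosine similarity over toy vectors. The vectors, the similarity measure, and the 0.9 match threshold are illustrative assumptions.

```python
# Minimal sketch: match request samples against database comparison features;
# a sample matches the most similar feature above a threshold, else none.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_samples(samples, database, threshold=0.9):
    """samples / database: mappings from id to feature vector."""
    result = {}
    for sid, vec in samples.items():
        best = max(database, key=lambda fid: cosine(vec, database[fid]))
        result[sid] = best if cosine(vec, database[best]) >= threshold else None
    return result

if __name__ == "__main__":
    db = {"feature_1": [1.0, 0.0], "feature_2": [0.0, 1.0], "feature_3": [1.0, 1.0]}
    req = {"s1": [0.9, 0.1], "s2": [0.95, 0.05], "s3": [0.1, 0.9], "s4": [-1.0, 0.2]}
    print(match_samples(req, db))   # two samples match feature_1, one matches feature_2
```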
While the application model D32 performs AI processing on the inference request, the attack detection model D31 can also carry out the physical adversarial attack detection process, as shown in S530 below.
S530: the attack detection model D31 performs physical adversarial attack detection on the multiple samples contained in the inference request.
For example, the attack detection model D31 uses its pre-trained feature detection module D311 to run inference analysis on the photos (or photo sequences) forwarded by the AI application gateway D30, judging whether they contain physical adversarial perturbations and giving the confidence or probability of the judgment. For the specific implementation and beneficial effects of S530, refer to the content of S320 above; it is not repeated here.
When the feature detection model judges with high confidence that the inference request contains a physical adversarial perturbation/attack, the attack detection model D31 can generate and output alarm information, as shown in S541 in FIG. 5.
S541: the attack detection model D31 sends alarm information to the AI application gateway D30.
The alarm information indicates that the inference request contains a physical adversarial perturbation/attack. In some cases, the alarm information is recorded in the form of a log.
S542: the AI application gateway D30 performs protection processing on the application model D32 according to the alarm information.
For the specific implementation of S542, refer to the content of S330 above; it is not repeated here.
For example, when the AI application gateway D30 performs protection processing on the application model D32, the AI application gateway D30 and the attack detection model D31 can intercept the biometric recognition result of the application model D32, preventing attackers from using physical adversarial samples to bypass biometric authentication and improving the security of the application model D32.
This embodiment can provide a "model-independent" physical adversarial attack detection mechanism, detecting whether inference requests contain physical adversarial samples in biometric recognition scenarios, providing security situation awareness while the application model is running, and raising alarms against physical adversarial attacks.
By way of example: first, the server generates (and labels) various physical adversarial samples for biometric recognition and trains a corresponding DNN-based attack detection model using a data-driven method. Second, the server can deploy the attack detection engine that integrates the attack detection model into the corresponding AI application environment in a manner independent of the (protected AI) model, so that the attack detection engine and the application model receive the application's inference requests (input samples) simultaneously. Finally, during the application model's AI processing, the attack detection model can concurrently judge whether the inference request contains physical adversarial samples (perturbations), realizing detection of physical adversarial attacks against biometric recognition.
It can be understood that, to implement the functions in the above embodiments, the server and the terminal device include hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, with reference to the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in the form of hardware or a combination of hardware and computer software; whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
FIG. 6 and FIG. 7 are schematic structural diagrams of a possible attack detection apparatus and a possible physical device provided by embodiments of this application. The attack detection apparatus and the physical device can be used to implement the functions of the server 120 and of any terminal in the above method embodiments, and therefore can also achieve the beneficial effects of those method embodiments. In embodiments of this application, the attack detection apparatus 600 may be the server 120 shown in FIG. 1, a module (such as a chip) applied to the server 120, or any terminal shown in FIG. 1.
As shown in FIG. 6, the attack detection apparatus 600 includes a communication unit 610, a detection unit 620, a protection unit 630, an alarm unit 640, a storage unit 650, and a training unit 660. The attack detection apparatus 600 is configured to implement the functions of the method embodiments shown in FIG. 2 to FIG. 5.
When the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 2: the communication unit 610 is used to perform S210, and the training unit 660 is used to perform S220 and S230.
When the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 3: the communication unit 610 is used to perform S310, the detection unit 620 is used to perform S320, and the protection unit 630 is used to perform S330.
When the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 4: the communication unit 610 is used to perform S310, the detection unit 620 is used to perform S321 to S323, and the protection unit 630 is used to perform S330.
When the attack detection apparatus 600 is used to implement the method embodiment shown in FIG. 5: the communication unit 610 is used to perform S510, S521, and S522; the detection unit 620 is used to perform S530; and the alarm unit 640 is used to perform S541 and S542.
In addition, the storage unit 650 may be used to store the above inference requests, the physical adversarial samples identified by the attack detection model D31, and the like.
More detailed descriptions and beneficial effects of the communication unit 610, detection unit 620, protection unit 630, alarm unit 640, storage unit 650, and training unit 660 can be obtained directly from the related descriptions in the method embodiments shown in FIG. 2 to FIG. 5 and are not repeated here.
As shown in FIG. 7, FIG. 7 is a schematic structural diagram of a physical device provided in this application; the physical device 700 includes a processor 710 and an interface circuit 720. The physical device 700 may be the aforementioned server, terminal, or another computing device.
The processor 710 and the interface circuit 720 are coupled to each other. It can be understood that the interface circuit 720 may be a transceiver or an input/output interface. Optionally, the physical device 700 may further include a memory 730 for storing the instructions executed by the processor 710, the input data the processor 710 needs to run the instructions, or the data generated after the processor 710 runs the instructions.
When the physical device 700 is used to implement the methods shown in FIG. 2 to FIG. 5, the processor 710 and the interface circuit 720 perform the functions of the communication unit 610, detection unit 620, protection unit 630, alarm unit 640, storage unit 650, and training unit 660. The processor 710, the interface circuit 720, and the memory 730 can also cooperate to implement the operation steps of the attack detection method. The physical device 700 may also perform the functions of the attack detection apparatus 600 shown in FIG. 6, which are not detailed here.
The embodiments of this application do not limit the specific connection medium among the interface circuit 720, the processor 710, and the memory 730. In FIG. 7 they are connected through a bus 740, represented by a thick line; the connection modes between other components are only schematically illustrated and are not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on; for ease of representation, only one thick line is used in FIG. 7, but this does not mean there is only one bus or one type of bus.
The memory 730 can be used to store software programs and modules, such as the program instructions/modules corresponding to the attack detection method provided in embodiments of this application; the processor 710 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 730. The interface circuit 720 can be used for signaling or data communication with other devices; in this application, the physical device 700 may have multiple interface circuits 720.
It can be understood that the processor in embodiments of this application may be a CPU, a neural processing unit (NPU), or a graphics processing unit (GPU), and may also be another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. A general-purpose processor may be a microprocessor or any conventional processor.
The method steps in embodiments of this application may be implemented by hardware, or by a processor executing software instructions. Software instructions may be composed of corresponding software modules, and the software modules may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium; of course, the storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in a network device or a terminal device; of course, the processor and the storage medium may also exist in the network device or the terminal device as discrete components.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer programs or instructions; when the computer program or instructions are loaded and executed on a computer, the processes or functions described in embodiments of this application are executed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium, for example a floppy disk, hard disk, or magnetic tape; an optical medium, for example a digital video disc (DVD); or a semiconductor medium, for example a solid state drive (SSD).
In the embodiments of this application, unless otherwise stated or logically conflicting, the terms and/or descriptions in different embodiments are consistent and may be cross-referenced, and the technical features of different embodiments may be combined according to their inherent logical relationships to form new embodiments.
In the specification, claims, and accompanying drawings of this application, the terms "first", "second", "third", and so on are used to distinguish different objects rather than to define a particular order. Words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation; any embodiment or design described as "exemplary" or "for example" in embodiments of this application should not be construed as preferable or advantageous over other embodiments or designs. Rather, words such as "exemplary" or "for example" are intended to present related concepts in a concrete manner.
In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B may be singular or plural. In the textual descriptions of this application, the character "/" generally indicates an "or" relationship between the associated objects; in the formulas of this application, it indicates a "division" relationship. In addition, the singular forms "a", "an", and "the" do not mean "one or only one" but "one or more" unless the context clearly dictates otherwise; for example, "a device" means one or more such devices. Furthermore, "at least one of ..." means one of, or any combination of, the subsequent associated objects; for example, "at least one of A, B and C" includes A, B, C, AB, AC, BC, or ABC.
It can be understood that the various numerical designations involved in embodiments of this application are merely for convenience of description and are not used to limit the scope of the embodiments. The sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic.

Claims (22)

  1. An attack detection method, characterized in that the method comprises:
    an attack detection model obtains an inference request, wherein the inference request carries a to-be-processed data set of an application model, and the to-be-processed data set comprises one or more samples;
    the attack detection model detects whether physical adversarial samples exist in the to-be-processed data set; and
    if physical adversarial samples exist in the to-be-processed data set, the attack detection model performs protection processing on the application model.
  2. The method according to claim 1, characterized in that the attack detection model is determined based on a training data set, and the training data set comprises multiple physical adversarial samples and multiple standard samples for the application model.
  3. The method according to claim 1 or 2, characterized in that the attack detection model detecting whether physical adversarial samples exist in the to-be-processed data set comprises:
    for each sample comprised in the to-be-processed data set, the attack detection model outputs security information of the sample, wherein the security information is used to indicate the confidence that the sample contains a physical adversarial perturbation; and
    if the confidence of the sample reaches a first threshold, the attack detection model identifies the sample as a physical adversarial sample targeting the application model.
  4. The method according to claim 3, characterized in that the security information of the sample is obtained by a feature detection module comprised in the attack detection model.
  5. The method according to claim 3 or 4, characterized in that
    the attack detection model detecting whether physical adversarial samples exist in the to-be-processed data set further comprises:
    the attack detection model outputs a detection result of the to-be-processed data set based on the security information of the multiple samples comprised in the to-be-processed data set.
  6. The method according to claim 5, characterized in that
    the attack detection model outputting the detection result of the to-be-processed data set based on the security information of the multiple samples comprised in the to-be-processed data set comprises:
    the attack detection model stores the physical adversarial samples in a sequence detection module comprised in the attack detection model; and
    if the number of physical adversarial samples among the multiple samples is greater than or equal to a first quantity, the sequence detection module determines that the inference request is an attack request.
  7. The method according to any one of claims 1-6, characterized in that
    the attack detection model performing protection processing on the application model comprises:
    the attack detection model blocks the application model from processing the inference request.
  8. The method according to claim 7, characterized in that
    the attack detection model blocking the application model from processing the inference request comprises:
    the attack detection model sets the processing result output by the application model to an invalid result.
  9. The method according to claim 7, characterized in that
    the attack detection model blocking the application model from processing the inference request comprises:
    the attack detection model discards the inference request.
  10. The method according to any one of claims 1-6, characterized in that the method further comprises:
    the attack detection model records an alarm log, wherein the alarm log is used to indicate that the inference request includes physical adversarial samples.
  11. An attack detection apparatus, characterized in that the attack detection apparatus is applied to an attack detection model, and the attack detection apparatus comprises:
    a communication unit, configured to obtain an inference request, wherein the inference request carries a to-be-processed data set of an application model, and the to-be-processed data set comprises one or more samples;
    a detection unit, configured to detect whether physical adversarial samples exist in the to-be-processed data set; and
    a protection unit, configured to perform protection processing on the application model if physical adversarial samples exist in the to-be-processed data set.
  12. The apparatus according to claim 11, characterized in that the attack detection model is determined based on a training data set, and the training data set comprises multiple physical adversarial samples and multiple standard samples for the application model.
  13. The apparatus according to claim 11 or 12, characterized in that the detection unit is specifically configured to output, for each sample comprised in the to-be-processed data set, security information of the sample, wherein the security information is used to indicate the confidence that the sample contains a physical adversarial perturbation; and
    the detection unit is specifically configured to identify the sample as a physical adversarial sample targeting the application model if the confidence of the sample reaches a first threshold.
  14. The apparatus according to claim 13, characterized in that the security information of the sample is obtained by a feature detection module comprised in the attack detection model.
  15. The apparatus according to claim 13 or 14, characterized in that the detection unit is further configured to output a detection result of the to-be-processed data set based on the security information of the multiple samples comprised in the to-be-processed data set.
  16. The apparatus according to claim 15, characterized in that the detection unit is specifically configured to store the physical adversarial samples in a sequence detection module comprised in the attack detection model; and
    the detection unit is specifically configured to determine that the inference request is an attack request if the number of physical adversarial samples among the multiple samples is greater than or equal to a first quantity.
  17. The apparatus according to any one of claims 11-16, characterized in that the protection unit is specifically configured to block the application model from processing the inference request.
  18. The apparatus according to claim 17, characterized in that the protection unit is specifically configured to set the processing result output by the application model to an invalid result.
  19. The apparatus according to claim 17, characterized in that the protection unit is specifically configured to discard the inference request.
  20. The apparatus according to any one of claims 11-16, characterized in that the attack detection apparatus further comprises: an alarm unit, configured to record an alarm log, wherein the alarm log is used to indicate that the inference request includes physical adversarial samples.
  21. An attack detection system, characterized by comprising: a first device and a second device, wherein an attack detection model is deployed in the first device and an application model is deployed in the second device;
    the first device obtains an inference request of a client, wherein the inference request carries a to-be-processed data set of the application model, and the to-be-processed data set comprises one or more samples;
    the first device detects whether physical adversarial samples exist in the to-be-processed data set; and
    if physical adversarial samples exist in the to-be-processed data set, the first device performs protection processing on the application model deployed in the second device.
  22. A computer program product, characterized in that the computer program product comprises instructions which, when the computer program product runs on a server or a terminal, cause the server or the terminal to execute the instructions to implement the method according to any one of claims 1-10.
PCT/CN2022/085391 2021-08-20 2022-04-06 Attack detection method and apparatus WO2023019970A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22857289.7A EP4375860A1 (en) 2021-08-20 2022-04-06 Attack detection method and apparatus

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110959827.5 2021-08-20
CN202110959827 2021-08-20
CN202111064335.6 2021-09-10
CN202111064335.6A 2021-08-20 2021-09-10 Attack detection method and apparatus

Publications (1)

Publication Number Publication Date
WO2023019970A1 (zh)

Family

ID=85230441

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085391 WO2023019970A1 (zh) 2021-08-20 2022-04-06 一种攻击检测方法及装置

Country Status (3)

Country Link
EP (1) EP4375860A1 (zh)
CN (1) CN115712893A (zh)
WO (1) WO2023019970A1 (zh)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109525607A (zh) * 2019-01-07 2019-03-26 四川虹微技术有限公司 Adversarial attack detection method and apparatus, and electronic device
CN109784411A (zh) * 2019-01-23 2019-05-21 四川虹微技术有限公司 Defense method, apparatus and system against adversarial examples, and storage medium
WO2020233564A1 (zh) * 2019-05-21 2020-11-26 华为技术有限公司 Adversarial example detection method and electronic device
CN112465019A (zh) * 2020-11-26 2021-03-09 重庆邮电大学 Perturbation-based adversarial example generation and adversarial defense method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CAI XIUXIA;DU HUIMIN: "Survey on Adversarial Example Generation and Adversarial Attack Method", JOURNAL OF XI'AN UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, vol. 26, no. 1, 31 January 2021 (2021-01-31), pages 67 - 75, XP093035822, ISSN: 2095-6533, DOI: 10.13682/j.issn.2095-6533.2021.01.011 *

Also Published As

Publication number Publication date
CN115712893A (zh) 2023-02-24
EP4375860A1 (en) 2024-05-29

Similar Documents

Publication Publication Date Title
TWI673625B (zh) 統一資源定位符(url)攻擊檢測方法、裝置以及電子設備
US11522873B2 (en) Detecting network attacks
US11171977B2 (en) Unsupervised spoofing detection from traffic data in mobile networks
US10599851B2 (en) Malicious code analysis method and system, data processing apparatus, and electronic apparatus
US11743276B2 (en) Methods, systems, articles of manufacture and apparatus for producing generic IP reputation through cross protocol analysis
WO2020236651A1 (en) Identity verification and management system
US11620384B2 (en) Independent malware detection architecture
US20110179488A1 (en) Kernal-based intrusion detection using bloom filters
CN110245714B (zh) Image recognition method and apparatus, and electronic device
WO2023207548A1 (zh) Traffic detection method and apparatus, device, and storage medium
Zhao et al. CAN bus intrusion detection based on auxiliary classifier GAN and out-of-distribution detection
CN111049783A (zh) Network attack detection method and apparatus, device, and storage medium
CN114565513A (zh) Adversarial image generation method and apparatus, electronic device, and storage medium
CN108156127B (zh) Network attack pattern determination apparatus and method, and computer-readable storage medium therefor
Li et al. Deep learning algorithms for cyber security applications: A survey
Cheng et al. STC‐IDS: Spatial–temporal correlation feature analyzing based intrusion detection system for intelligent connected vehicles
CN113537145B (zh) Method, apparatus and storage medium for quickly resolving false and missed detections in target detection
WO2023019970A1 (zh) Attack detection method and apparatus
US20220414274A1 (en) Active control of communications bus for cyber-attack mitigation
US20240171675A1 (en) System and method for recognizing undesirable calls
EP4373031A1 System and method for recognizing undesirable calls
CN115150165B (zh) Traffic identification method and apparatus
WO2024041436A1 (zh) Service request processing method and apparatus, electronic device, and storage medium
US20220210169A1 (en) Systems and methods for a.i.-based malware analysis on offline endpoints in a network
US20230344840A1 (en) Method, apparatus, system, and non-transitory computer readable medium for identifying and prioritizing network security events

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22857289

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022857289

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022857289

Country of ref document: EP

Effective date: 20240222

NENP Non-entry into the national phase

Ref country code: DE