WO2022218188A1 - Method and device for attack sample management - Google Patents

Method and device for attack sample management

Info

Publication number
WO2022218188A1
WO2022218188A1 PCT/CN2022/085278 CN2022085278W WO2022218188A1 WO 2022218188 A1 WO2022218188 A1 WO 2022218188A1 CN 2022085278 W CN2022085278 W CN 2022085278W WO 2022218188 A1 WO2022218188 A1 WO 2022218188A1
Authority
WO
WIPO (PCT)
Prior art keywords
network device
attack sample
attack
sample
detection result
Prior art date
Application number
PCT/CN2022/085278
Other languages
English (en)
French (fr)
Inventor
焦丽娟
叶浩楠
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022218188A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the present application relates to the field of artificial intelligence, and more particularly, to methods and apparatuses for attack sample management.
  • a network attack may be identified through an artificial intelligence (AI) detection model.
  • the training samples are also referred to as attack samples
  • the AI detection model trained in this related technical solution has limited ability to detect some advanced bypass attacks, especially injection attacks on corporate websites, where hackers can easily bypass traditional attack detection. Therefore, how to better detect and respond to these attacks has become a key competitive differentiator for security products and security solutions.
  • the present application provides an attack sample management method and device, which can improve the attack detection model's ability to detect attack behavior, and lay a foundation for improving the security of the network environment.
  • a method for managing attack samples, including: a first network device sends a first attack sample to a second network device; the first network device receives a first feedback message sent by the second network device, where the first feedback message includes the detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample; the first network device sends the first attack sample to a management device according to the first feedback message, where the management device is used to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets.
  • the training sample of the attack detection model is the first attack sample that the second network device identified as a normal sample
  • the attack detection model trained on the first attack sample is used by the second network device to identify attack packets.
  • in this way, the attack detection model can identify some attack samples that bypass attack detection, which improves its ability to detect or identify attack packets and lays a foundation for improving the security of the network environment.
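The exchange described above (send a sample, read the feedback, forward only the samples that were misclassified as normal) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: `FeedbackMessage`, `ManagementDevice`, `probe_and_report`, and `toy_detector` are hypothetical names, and the detector is a stand-in for the second network device.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FeedbackMessage:
    """Hypothetical feedback message returned by the second network device."""
    sample: str
    is_attack: bool  # detection result: True = attack sample, False = normal sample


@dataclass
class ManagementDevice:
    """Hypothetical management device that collects training samples."""
    training_samples: List[str] = field(default_factory=list)

    def receive(self, sample: str) -> None:
        self.training_samples.append(sample)


def probe_and_report(sample: str,
                     detect: Callable[[str], bool],
                     management: ManagementDevice) -> FeedbackMessage:
    """First network device: send a sample, read the feedback, and forward
    the sample to the management device only if it was judged normal."""
    feedback = FeedbackMessage(sample=sample, is_attack=detect(sample))
    if not feedback.is_attack:
        management.receive(sample)
    return feedback


def toy_detector(sample: str) -> bool:
    """Stand-in for the second network device (assumption): flags a literal marker."""
    return "ATTACK" in sample
```

Only samples that slip past the detector reach `ManagementDevice.training_samples`, which matches the idea that the training set is made of attack samples the deployed model failed to recognize.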
  • before the first network device sends the first attack sample to the second network device, the method further includes: the first network device sends a second attack sample to the second network device; the first network device receives a second feedback message sent by the second network device, where the second feedback message includes the detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample; the first network device adjusts the second attack sample according to the second feedback message to obtain the first attack sample.
  • the first network device includes a reinforcement learning agent (RL Agent) and a reinforcement learning environment (RL Evn); the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and a reward and punishment function; the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
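The reward and punishment loop between the RL Evn and the RL Agent can be illustrated with a small value-update sketch. The source does not specify the learning algorithm or the shape of the reward and punishment function, so everything here is an assumption: `reward_signal` returns -1 when the sample is detected and +1 when it is not, and `Agent` uses a simple bandit-style running estimate per adjustment action.

```python
import random

# Adjustment actions named after the attack methods listed in the source.
ACTIONS = ["unicode_encoding", "base64_encoding", "insert_comments",
           "garbage_padding", "offset_replacement"]


def reward_signal(detected_as_attack: bool) -> float:
    """Hypothetical reward and punishment function:
    punish a detected sample, reward an undetected one."""
    return -1.0 if detected_as_attack else 1.0


class Agent:
    """Bandit-style agent (an assumption, not the patent's algorithm)."""

    def __init__(self, actions, learning_rate=0.5, seed=0):
        self.values = {a: 0.0 for a in actions}
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def choose(self, epsilon=0.1):
        """Explore with probability epsilon, otherwise pick the best-valued action."""
        if self.rng.random() < epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, action, signal):
        """Move the action's value estimate toward the received signal."""
        self.values[action] += self.learning_rate * (signal - self.values[action])
```

In use, the environment would call `agent.update(action, reward_signal(detected))` after each feedback message, so actions that keep getting detected lose value and are chosen less often.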
  • the method further includes: the first network device obtains the second attack sample by using any one or a combination of the following attack methods: Unicode encoding, Base64 encoding, insert comments, garbage data padding, Offset replacement.
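For illustration, four of the listed methods can be expressed as plain string transformations using the Python standard library. The function names, the `/**/` comment token, and the `#` filler are assumptions chosen for this sketch; "offset replacement" is left out because the source does not define it.

```python
import base64


def unicode_encode(text: str) -> str:
    """Replace each character with a \\uXXXX escape (assumes BMP characters)."""
    return "".join(f"\\u{ord(ch):04x}" for ch in text)


def base64_encode(text: str) -> str:
    """Base64-encode the UTF-8 bytes of the text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def insert_comments(text: str, comment: str = "/**/") -> str:
    """Replace each space with a comment token (token is an assumption)."""
    return text.replace(" ", comment)


def pad_with_garbage(text: str, filler: str = "#", count: int = 8) -> str:
    """Append filler characters to the end of the text."""
    return text + filler * count


def combine(text: str, *transforms) -> str:
    """Apply several transformations in order, as in 'any one or a combination'."""
    for transform in transforms:
        text = transform(text)
    return text
```

`combine(sample, insert_comments, base64_encode)` shows how two methods can be chained to derive a new test sample from an existing one.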
  • a method for managing attack samples, including: a second network device receives a first attack sample sent by a first network device; the second network device detects the first attack sample and obtains a detection result of the first attack sample, where the detection result of the first attack sample indicates that the first attack sample is a normal sample; the second network device sends a first feedback message to the first network device, where the first feedback message includes the detection result of the first attack sample by the second network device.
  • the second network device performs feature extraction on the first attack sample to obtain a feature vector of the first attack sample
  • the second network device detects the feature vector of the first attack sample to obtain a detection result of the first attack sample.
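The two steps above (feature extraction, then detection on the feature vector) can be sketched as below. The source does not say which features or which model are used, so the keyword vocabulary, the weights, and the threshold are all hypothetical, and a fixed linear score stands in for the trained attack detection model.

```python
from typing import List

KEYWORDS = ["select", "union", "script"]  # hypothetical feature vocabulary


def extract_features(sample: str) -> List[float]:
    """Map a sample to a feature vector:
    keyword counts, special-character count, and length."""
    lowered = sample.lower()
    counts = [float(lowered.count(k)) for k in KEYWORDS]
    special = sum(1 for ch in sample if not ch.isalnum() and not ch.isspace())
    return counts + [float(special), float(len(sample))]


WEIGHTS = [1.0, 1.0, 1.0, 0.1, 0.0]  # hypothetical model parameters
THRESHOLD = 1.0                      # hypothetical decision threshold


def detect(features: List[float]) -> str:
    """Score the feature vector and return the detection result."""
    score = sum(w * x for w, x in zip(WEIGHTS, features))
    return "attack" if score >= THRESHOLD else "normal"
```

The returned string is what would be placed in the feedback message as the detection result.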
  • before the second network device receives the first attack sample sent by the first network device, the method further includes: the second network device receives a second attack sample sent by the first network device; the second network device detects the second attack sample to obtain a detection result of the second attack sample, where the detection result of the second attack sample indicates that the second attack sample is an attack sample; the second network device sends a second feedback message to the first network device, where the second feedback message includes the detection result of the second attack sample by the second network device.
  • a method for managing attack samples, including: a first network device sends a first attack sample to a second network device; the first network device receives a first feedback message sent by the second network device, where the first feedback message includes the detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample; the first network device sends the first attack sample to the management device according to the first feedback message; the management device trains an attack detection model according to the first attack sample, where the attack detection model is used by the second network device to identify attack packets; the management device deploys the attack detection model to the second network device.
  • before the first network device sends the first attack sample to the second network device, the method further includes: the first network device sends a second attack sample to the second network device; the first network device receives a second feedback message sent by the second network device, where the second feedback message includes the detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample; the first network device adjusts the second attack sample according to the second feedback message to obtain the first attack sample.
  • the first network device includes a reinforcement learning agent (RL Agent) and a reinforcement learning environment (RL Evn); the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and a reward and punishment function; the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
  • the method further includes: the first network device obtains the second attack sample by using any one or a combination of the following attack methods: Unicode encoding, Base64 encoding, insert comments, garbage data padding, Offset replacement.
  • the method further includes: the second network device receives the first attack sample sent by the first network device; the second network device detects the first attack sample to obtain a detection result of the first attack sample; the second network device sends a first feedback message to the first network device, where the first feedback message includes the detection result of the first attack sample by the second network device.
  • the second network device performs feature extraction on the first attack sample to obtain a feature vector of the first attack sample; the second network device detects the feature vector of the first attack sample to obtain a detection result of the first attack sample.
  • before the second network device receives the first attack sample sent by the first network device, the method further includes: the second network device receives the second attack sample sent by the first network device; the second network device detects the second attack sample to obtain a detection result of the second attack sample, where the detection result of the second attack sample indicates that the second attack sample is an attack sample; the second network device sends a second feedback message to the first network device, where the second feedback message includes the detection result of the second attack sample by the second network device.
  • a first network device comprising: a sending module, a receiving module,
  • a sending module configured to send the first attack sample to the second network device
  • a receiving module configured to receive a first feedback message sent by the second network device, where the first feedback message includes a detection result of the first attack sample by the second network device, and the detection result indicates that the first attack sample is a normal sample;
  • the sending module is further configured to send the first attack sample to a management device according to the first feedback message, where the management device is configured to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets.
  • the sending module is further configured to send a second attack sample to the second network device;
  • the receiving module is further configured to receive a second feedback message sent by the second network device, where the second feedback message includes the detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample;
  • the first network device further includes: a processing module configured to adjust the second attack sample according to the second feedback message to obtain the first attack sample.
  • the first network device includes a reinforcement learning agent RL Agent and a reinforcement learning environment RL Evn
  • the processing module is specifically used for: the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and the reward and punishment function; the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
  • the processing module is further configured to: obtain the second attack sample through a combination of any one or more of the following attack methods: Unicode encoding, Base64 Encoding, Inserting Comments, Garbage Data Filling, Offset Replacement.
  • a second network device comprising: a receiving module, a detecting module, a sending module,
  • a receiving module configured to receive the first attack sample sent by the first network device
  • a detection module configured to detect the first attack sample to obtain a detection result of the first attack sample, where the detection result of the first attack sample indicates that the first attack sample is a normal sample;
  • a sending module configured to send a first feedback message to the first network device, where the first feedback message includes a detection result of the first attack sample by the second network device.
  • the detection module is specifically configured to: perform feature extraction on the first attack sample to obtain a feature vector of the first attack sample; detect the feature vector of the first attack sample to obtain a detection result of the first attack sample.
  • the receiving module is further configured to receive a second attack sample sent by the first network device; the detection module is further configured to detect the second attack sample to obtain a detection result of the second attack sample, where the detection result of the second attack sample indicates that the second attack sample is an attack sample; the sending module is further configured to send a second feedback message to the first network device, where the second feedback message includes the detection result of the second attack sample by the second network device.
  • a sixth aspect provides an attack sample management system, including: a sending module, a receiving module, a processing module,
  • a sending module used for the first network device to send the first attack sample to the second network device
  • a receiving module configured for the first network device to receive a first feedback message sent by the second network device, where the first feedback message includes a detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample;
  • the sending module is further configured for the first network device to send the first attack sample to the management device according to the first feedback message;
  • a processing module used for the management device to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets;
  • the processing module is further configured for the management device to deploy the attack detection model into the second network device.
  • the sending module is further configured for the first network device to send the second attack sample to the second network device;
  • the receiving module is further configured for the first network device to receive a second feedback message sent by the second network device, where the second feedback message includes a detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample;
  • the processing module is further configured for the first network device to adjust the second attack sample according to the second feedback message to obtain the first attack sample.
  • the first network device includes a reinforcement learning agent RL Agent and a reinforcement learning environment RL Evn
  • the processing module is specifically used for: the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and the reward and punishment function; the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
  • the processing module is further configured for the first network device to obtain the second attack sample through any one or a combination of the following attack methods: Unicode encoding, Base64 encoding, insert comments, garbage data padding, Offset replacement.
  • the receiving module is further configured for the second network device to receive the first attack sample sent by the first network device; the processing module is further configured for the second network device to detect the first attack sample to obtain the detection result of the first attack sample; the sending module is further configured for the second network device to send a first feedback message to the first network device, where the first feedback message includes a detection result of the first attack sample by the second network device.
  • the processing module is specifically configured for the second network device to perform feature extraction on the first attack sample to obtain the feature vector of the first attack sample, and for the second network device to detect the feature vector of the first attack sample to obtain the detection result of the first attack sample.
  • the receiving module is further configured for the second network device to receive a second attack sample sent by the first network device; the processing module is further configured for the second network device to detect the second attack sample to obtain a detection result of the second attack sample, where the detection result of the second attack sample indicates that the second attack sample is an attack sample; the sending module is further configured for the second network device to send a second feedback message to the first network device, where the second feedback message includes a detection result of the second attack sample by the second network device.
  • in a seventh aspect, a first network device is provided, which includes a processor, a memory, an interface, and a bus.
  • the interface may be implemented in a wireless or wired manner, specifically a network card.
  • the above-mentioned processor, memory and interface are connected by a bus.
  • the interface may include a transmitter and a receiver, which are used by the first network device to implement the above-mentioned transceiving.
  • the processor is configured to execute the processing performed by the first network device in the above embodiment.
  • the memory includes an operating system and an application program, and is used to store programs, codes or instructions. When the processor or the hardware device executes these programs, codes or instructions, the processing process involving the first network device in the method embodiment can be completed.
  • the memory may include read-only memory (ROM) and random access memory (RAM).
  • the ROM includes a basic input/output system (BIOS) or an embedded system
  • the RAM includes an application program and an operating system.
  • the first network device may include any number of interfaces, processors or memories.
  • in an eighth aspect, a second network device is provided, which includes a processor, a memory, an interface, and a bus.
  • the interface may be implemented in a wireless or wired manner, specifically a network card.
  • the above-mentioned processor, memory and interface are connected by a bus.
  • the interface may include a transmitter and a receiver, which are used by the second network device to implement the above-mentioned transceiving.
  • the processor is configured to execute the processing performed by the second network device in the above-mentioned embodiment.
  • the memory includes an operating system and an application program, and is used to store programs, codes or instructions. When the processor or the hardware device executes these programs, codes or instructions, the processing process involving the second network device in the method embodiment can be completed.
  • the memory may include read-only memory (ROM) and random access memory (RAM).
  • the ROM includes a basic input/output system (BIOS) or an embedded system
  • the RAM includes an application program and an operating system.
  • the second network device may include any number of interfaces, processors or memories.
  • a computer program product, comprising computer program code; when the computer program code is executed on the first network device, the first network device is made to execute the method in the first aspect or any possible implementation of the first aspect.
  • a computer program product, comprising computer program code; when the computer program code is executed on the second network device, the second network device is made to execute the method in the second aspect or any possible implementation of the second aspect.
  • a computer-readable medium that stores program code; when the program code is executed on the first network device, the first network device is made to execute the method in the first aspect or any possible implementation of the first aspect.
  • These computer-readable storages include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drive.
  • a twelfth aspect provides a computer-readable medium that stores program code; when the program code is executed on the second network device, the second network device is made to execute the method in the second aspect or any possible implementation of the second aspect.
  • These computer-readable storages include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drive.
  • a thirteenth aspect provides a chip, which includes a processor and a data interface, wherein the processor reads instructions stored in a memory through the data interface to execute the method in the first aspect or any possible implementation of the first aspect.
  • the chip can be a central processing unit (CPU), a microcontroller (MCU), a microprocessor (MPU), a digital signal processor (DSP), a system on chip (SoC), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or a programmable logic device (PLD).
  • a fourteenth aspect provides a chip, which includes a processor and a data interface, wherein the processor reads instructions stored in a memory through the data interface to execute the method in the second aspect or any possible implementation of the second aspect.
  • the chip can be a central processing unit (CPU), a microcontroller (MCU), a microprocessor (MPU), a digital signal processor (DSP), a system on chip (SoC), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or a programmable logic device (PLD).
  • a system for managing attack samples is provided, including: a management device, the first network device in the third aspect or any possible implementation of the third aspect, and the second network device in the fourth aspect or any possible implementation of the fourth aspect.
  • FIG. 1 is a schematic structural diagram of a first network device 100 provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of an attack sample management method provided by an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a system architecture 300 for attack sample management provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of another attack sample management method provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a first network device 500 provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a second network device 600 provided by an embodiment of the present application.
  • “At least one” means one or more, and “plurality” means two or more.
  • “And/or” describes the relationship between associated objects and means that three relationships are possible; for example, “A and/or B” can mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects are in an “or” relationship.
  • “At least one of the following item(s)” or similar expressions refer to any combination of these items, including any combination of a single item or plural items.
  • for example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
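The "at least one of a, b, or c" convention is simply the set of non-empty combinations of the listed items. A short Python sketch (the function name `at_least_one` is an assumption) enumerates them with `itertools.combinations`:

```python
from itertools import combinations
from typing import List, Sequence


def at_least_one(items: Sequence[str]) -> List[str]:
    """List every non-empty combination of the items, smallest first."""
    return [
        "-".join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


print(at_least_one(["a", "b", "c"]))
# ['a', 'b', 'c', 'a-b', 'a-c', 'b-c', 'a-b-c']
```

For two items, `at_least_one(["A", "B"])` yields the three relationships that "A and/or B" covers: A alone, B alone, and A together with B.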
  • the embodiments of the present application provide a method for managing attack samples, which can improve the ability of an attack detection model to detect attack behaviors, and lay a foundation for improving the security of a network environment.
  • the attack sample management method provided in this embodiment of the present application can be applied to a computing device, which may also be referred to as a computer system, including a hardware layer, an operating system layer running on top of the hardware layer, and an application layer running on top of the operating system layer.
  • the hardware layer includes hardware such as a processing unit, a memory, and a memory control unit; the function and structure of the hardware are described in detail later.
  • the operating system is any one or more computer operating systems that implement business processing through processes, such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a Windows operating system.
  • the application layer includes browsers, address books, word processing software, instant messaging software and other applications.
  • the computer system is a handheld device such as a smart phone, or a terminal device such as a personal computer, which is not particularly limited in the present application, as long as the method provided by the embodiment of the present application can be used.
  • the execution subject of the attack sample management method provided by the embodiment of the present application may be a computer system, or a functional module in the computer system capable of calling a program and executing the program.
  • the computing device that performs the attack sample management method above may also be referred to as a first network device, and the first network device may be, for example, a breach and attack simulation (BAS) device.
  • FIG. 1 is a schematic structural diagram of a first network device 100 provided by an embodiment of the present application.
  • the first network device 100 may be a server or a computer or other devices with computing capabilities.
  • the first network device 100 shown in FIG. 1 includes: at least one processor 110 and a memory 120 .
  • the processor 110 executes the instructions in the memory 120, so that the first network device 100 implements the attack sample management method provided by the present application.
  • the first network device 100 further includes a system bus, wherein the processor 110 and the memory 120 are respectively connected to the system bus.
  • the processor 110 can access the memory 120 through the system bus, for example, the processor 110 can read and write data or execute code in the memory 120 through the system bus.
  • the system bus is a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • the system bus is divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 1, but it does not mean that there is only one bus or one type of bus.
  • the function of the processor 110 is mainly to interpret the instructions (or code) of the computer program and process the data in the computer software.
  • the instructions of the computer program and the data in the computer software can be stored in the memory 120 or the cache 116 .
  • the processor 110 may be an integrated circuit chip with signal processing capability.
  • the processor 110 is a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor is a microprocessor or the like.
  • the processor 110 is a central processing unit (CPU).
  • each processor 110 includes at least one processing unit 112 and a memory control unit 114 .
  • the processing unit 112, also referred to as a core, is the most important component of the processor.
  • the processing unit 112 is manufactured from monocrystalline silicon by a certain production process, and all calculations, receiving commands, storing commands, and processing data of the processor are performed by the core.
  • the processing units run program instructions independently, and use the capability of parallel computing to speed up the running speed of the program.
  • Various processing units have fixed logical structures.
  • the processing units include logic units such as a first-level cache, a second-level cache, an execution unit, an instruction-level unit, and a bus interface.
  • the memory control unit 114 is used to control the data interaction between the memory 120 and the processing unit 112 . Specifically, the memory control unit 114 receives a memory access request from the processing unit 112 and controls access to the memory based on the memory access request.
  • The memory control unit is a device such as a memory management unit (MMU).
  • each memory control unit 114 performs addressing to the memory 120 through the system bus.
  • an arbiter (not shown in the figure) is configured in the system bus, and the arbiter is responsible for processing and coordinating the competing accesses of the multiple processing units 112 .
  • the processing unit 112 and the memory control unit 114 are communicatively connected through a connection line inside the chip, such as an address line, so as to realize the communication between the processing unit 112 and the memory control unit 114 .
  • Each processor 110 also includes a cache 116, where the cache is a buffer for data exchange.
  • When the processing unit 112 needs to read data, it first looks for the required data in the cache. If the data is found, it is used directly; if not, it is fetched from the memory. Because the cache is much faster than the memory, the cache helps the processing unit 112 run faster.
  • a memory 120 can provide a running space for a process in the first network device 100 , for example, a computer program (specifically, the code of the program) for generating a process is stored in the memory 120 .
  • the processor allocates a corresponding storage space for the process in the memory 120 .
  • The above-mentioned storage space further includes a text segment, an initialized data segment, an uninitialized data segment, a stack segment, a heap segment, and the like.
  • the memory 120 stores data generated during the running of the process, such as intermediate data, or process data, etc., in the storage space corresponding to the above-mentioned process.
  • the memory is also called internal memory, and its function is to temporarily store operation data in the processor 110 and data exchanged with an external memory such as a hard disk.
  • the processor 110 will transfer the data to be calculated into the memory for calculation, and the processing unit 112 will transmit the result after the calculation is completed.
  • memory 120 is volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • Non-volatile memory is read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory is random access memory (RAM), which acts as an external cache. Examples of RAM include dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct Rambus RAM.
  • the structure of the first network device 100 listed above is only an example, and the present application is not limited thereto.
  • the first network device 100 in the embodiment of the present application includes various hardware in the computer system in the prior art.
  • The first network device 100 may also include storage other than the memory 120, such as disk storage.
  • the first network device 100 may also include other devices necessary for normal operation.
  • the above-mentioned first network device 100 may further include hardware devices that implement other additional functions.
  • the above-mentioned first network device 100 may also only include the necessary components for implementing the embodiments of the present application, rather than all the components shown in FIG. 1 .
  • FIG. 2 is a schematic flowchart of an attack sample management method provided by an embodiment of the present application, and the method may be executed by the first network device 100 shown in FIG. 1 . As shown in FIG. 2, the method may include steps 210-230, and the steps 210-230 will be described in detail below respectively.
  • Step 210 The first network device sends the first attack sample to the second network device.
  • the first network device may be configured to generate the first attack sample.
  • the first network device may be a BAS.
  • The second network device can be used to detect whether an attack sample is a normal sample.
  • The second network device may be any one of the following: a firewall (FW), a web application firewall (WAF), a situation awareness (AS) product, and the like.
  • The hardware architecture of the second network device is similar to that of the first network device. For details, refer to the description of the hardware architecture of the first network device 100 in FIG. 1, which is not repeated here.
  • Step 220 The first network device receives the first feedback message sent by the second network device.
  • the second network device may detect whether the first attack sample is a normal sample. Specifically, for example, the second network device may perform feature extraction on the first attack sample, obtain a feature vector of the first attack sample, identify the feature vector of the first attack sample, and determine whether the first attack sample is a normal sample.
  • For example, the detection result of the first attack sample obtained by the second network device indicates that the first attack sample is a normal sample. The second network device may generate a first feedback message based on this detection result, where the first feedback message includes the detection result of the first attack sample, and send the first feedback message to the first network device.
  • Step 230 The first network device sends the first attack sample to the management device according to the first feedback message.
  • After receiving the first feedback message sent by the second network device, the first network device determines, from the detection result included in the first feedback message, that the first attack sample was identified as a normal sample, and sends the first attack sample to the management device.
  • the management device may train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets. Specifically, the management device may use the first attack sample as the input of the model to train the attack detection model.
  • The above-mentioned management device may be a cloud-based background upgrade system, for example, an enterprise internal system dedicated to data management, model upgrade, and the like.
  • the hardware architecture of the management device is similar to that of the first network device. For details, please refer to the description of the hardware architecture of the first network device 100 in FIG. 1 , which will not be repeated here.
  • In the above technical solution, the training sample of the attack detection model is the first attack sample, which the second network device identified as a normal sample. The attack detection model that is trained on the first attack sample and used by the second network device to identify attack packets can therefore identify some attack samples that bypass attack detection. This improves the model's ability to detect or identify attack packets and lays a foundation for improving the security of the network environment.
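  • Steps 210 to 230 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the `FeedbackMessage` layout, the `ManagementDevice` class, and the rule in `toy_detector` are all hypothetical, and the network exchange between devices is modelled as plain function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FeedbackMessage:
    """Feedback message carrying the detection result (hypothetical layout)."""
    sample: str
    is_attack: bool  # False means the detector treated the sample as normal


@dataclass
class ManagementDevice:
    """Stand-in for the management device: it only collects samples."""
    collected: List[str] = field(default_factory=list)

    def receive(self, sample: str) -> None:
        self.collected.append(sample)


def run_steps_210_to_230(sample: str,
                         detect: Callable[[str], bool],
                         management: ManagementDevice) -> bool:
    """Return True if the sample bypassed detection and was forwarded."""
    # Steps 210 and 220: send the sample and receive the feedback message.
    feedback = FeedbackMessage(sample=sample, is_attack=detect(sample))
    # Step 230: forward the sample only if it was judged to be normal.
    if not feedback.is_attack:
        management.receive(sample)
        return True
    return False


def toy_detector(sample: str) -> bool:
    """Hypothetical detector that flags a space-delimited UNION keyword."""
    return " union " in sample.lower()
```

  • With this toy detector, a plain `UNION` payload is rejected, while the same payload with comments in place of spaces is treated as normal and therefore forwarded to the management device.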
  • FIG. 3 is a schematic block diagram of a system architecture 300 for attack sample management provided by an embodiment of the present application.
  • the system architecture 300 of the attack sample management may include: a first network device 310 , a second network device 320 , and a management device 330 , and the functions of each device are described in detail below.
  • 1. The first network device 310
  • the first network device 310 may also be called an attack sample automatic generation unit, which is mainly responsible for generating attack samples, and can also be understood as an attacker model for generating attack samples.
  • The first network device 310 may include: a reinforcement learning agent (RL Agent) 311, an RL environment (RL Evn) 312, and a structured query language manipulator (SQL manipulator) 313.
  • RL treats learning as a heuristic evaluation process. The RL Agent 311 learns by trial and error, choosing an action for the RL Evn 312. After the RL Evn 312 accepts the action, its state changes, and a reinforcement signal (reward or punishment) is generated and fed back to the RL Agent 311.
  • The RL Agent 311 selects the next action according to the reinforcement signal and the current state of the environment. The selection principle is to increase the probability of receiving positive reinforcement (reward), that is, to let the RL Agent 311 obtain the maximum reward.
  • The action that the RL Agent 311 applies to the RL Evn 312 is the injection method of the SQL attack sample, and the RL Agent 311 adjusts its output action based on the evaluation fed back by the RL Evn 312 (usually a reward or punishment reinforcement signal).
  • SQL manipulator 313 is used to generate attack samples based on this injection method.
  • the RL Evn 312 is used to evaluate the quality of the actions generated by the RL Agent 311 according to the detection results of the attack samples.
  • the second network device 320 may also be called an AI detection unit, which is mainly responsible for detecting the attack sample generated by the first network device 310, detecting whether the sample is an attack sample, and notifying the RL Evn 312 of the detection result of the sample.
  • the second network device 320 can also be understood as a defender model for detecting and identifying attack samples.
  • the second network device 320 may include: a feature extraction module 321 and an SQL detection module 322.
  • the feature extraction module 321 is configured to acquire the generated attack sample from the SQL manipulator 313, and perform feature extraction on the attack sample to generate a feature vector.
  • the SQL detection module 322 is configured to detect whether the corresponding sample is an attack sample according to the feature vector.
  • The management device 330 may also be referred to as a detection model update unit. On the one hand, the management device 330 implements the function of a training module, which is mainly used to train the detection model based on the above-mentioned sample set to obtain the attack detection model described above. On the other hand, it implements the function of an update module, which is mainly used to deploy the attack detection model into the SQL detection module 322 of the second network device 320.
  • FIG. 4 is a schematic flowchart of another attack sample management method provided by an embodiment of the present application, and the method may be executed by the attack sample management system architecture 300 shown in FIG. 3 . As shown in FIG. 4 , the method may include steps 410-480, and the steps 410-480 will be described in detail below respectively.
  • Step 410 The RL Agent 311 outputs the SQL attack injection method to the RL Evn 312.
  • the SQL attack injection method output by the RL Agent 311 to the RL Evn 312 may also be called the obfuscation bypass method, so as to generate attack samples according to the obfuscation bypass method.
  • common SQL attack injection methods may include, but are not limited to: Unicode encoding, Base64 encoding, inserting comments, filling garbage data, and Offset replacement.
  • Unicode encoding refers to randomly encoding keywords in the SQL statement as Unicode, with 1 to 3 replacements. For example, select * from users can be converted to selectu0020*u0020fromu0020users, where each space is replaced with u0020.
  • Inserting a comment means randomly selecting characters and adding /*XXX*/ after them; the content of the comment can be random.
  • Offset replacement refers to randomly selecting a comma and converting it into an offset, with 1 to 3 replacements.
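  • Three of the injection methods listed above can be sketched as small string transformations. This is an illustrative sketch under stated assumptions: the function names, the `max_replacements` parameter, and the choice to encode spaces (as in the example above) are hypothetical, and the token `u0020` is written exactly as it appears in this text.

```python
import base64
import random


def unicode_encode_spaces(sql: str, rng: random.Random,
                          max_replacements: int = 3) -> str:
    """Replace 1 to max_replacements randomly chosen spaces with u0020."""
    positions = [i for i, ch in enumerate(sql) if ch == " "]
    if not positions:
        return sql
    k = rng.randint(1, min(max_replacements, len(positions)))
    chosen = set(rng.sample(positions, k))
    return "".join("u0020" if i in chosen else ch for i, ch in enumerate(sql))


def insert_comment(sql: str, rng: random.Random, body: str = "XXX") -> str:
    """Insert /*body*/ after one randomly selected character."""
    if not sql:
        return sql
    i = rng.randrange(len(sql))
    return sql[: i + 1] + f"/*{body}*/" + sql[i + 1:]


def base64_encode(sql: str) -> str:
    """Base64-encode the whole statement."""
    return base64.b64encode(sql.encode("utf-8")).decode("ascii")
```

  • Passing a seeded `random.Random` instance keeps the transformations reproducible, which is convenient when the same sample has to be regenerated for training.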
  • Step 420 The RL Evn 312 sends the SQL attack injection method to the SQL manipulator 313.
  • Step 430 The SQL manipulator 313 generates attack samples according to the SQL attack injection method.
  • the SQL manipulator 313 can generate attack samples according to the SQL attack injection method sent by the RL Evn 312 .
  • For example, the SQL manipulator 313 can convert the SQL statement "select * from users" into the attack sample "selectu0020*u0020fromu0020users", where each space is replaced with u0020.
  • Step 440 The feature extraction module 321 obtains the attack sample generated by the SQL manipulator 313, and performs feature extraction on the attack sample to generate a feature vector.
  • the SQL manipulator 313 can send the attack sample to the feature extraction module 321, so that the feature extraction module 321 can perform feature extraction on the attack sample to generate a feature vector.
  • The feature vector mainly includes the proportions of functions, dangerous functions, spaces, dangerous characters, punctuation marks, and the like, as well as byte n-gram features.
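  • A feature vector of this kind could be computed roughly as follows. The keyword and character sets, the bucket count, and the modulo bucketing of byte n-grams are assumptions made for illustration; the text does not specify how the features are actually computed.

```python
import re
import string

# Hypothetical keyword and character sets, chosen only for illustration.
FUNCTIONS = {"count", "concat", "substr", "sleep", "benchmark", "load_file"}
DANGEROUS_FUNCTIONS = {"sleep", "benchmark", "load_file"}
DANGEROUS_CHARS = set("'\";#-")


def extract_features(sample: str, n: int = 2, buckets: int = 8) -> list:
    """Return five ratio features followed by `buckets` byte n-gram features."""
    length = max(len(sample), 1)
    tokens = re.findall(r"[a-z_]+", sample.lower())
    n_tokens = max(len(tokens), 1)
    ratios = [
        sum(t in FUNCTIONS for t in tokens) / n_tokens,
        sum(t in DANGEROUS_FUNCTIONS for t in tokens) / n_tokens,
        sample.count(" ") / length,
        sum(ch in DANGEROUS_CHARS for ch in sample) / length,
        sum(ch in string.punctuation for ch in sample) / length,
    ]
    # Byte n-grams are folded into a fixed number of buckets so that the
    # vector has the same length for every sample.
    data = sample.encode("utf-8")
    total = max(len(data) - n + 1, 1)
    ngram = [0.0] * buckets
    for i in range(len(data) - n + 1):
        ngram[int.from_bytes(data[i:i + n], "big") % buckets] += 1 / total
    return ratios + ngram
```

  • A fixed-length vector is used here because most classifiers expect inputs of constant dimension; a real detector would likely use a much larger n-gram space.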
  • Step 450 The feature extraction module 321 transmits the extracted feature vector as a state to the SQL detection module 322 for sample detection.
  • After the feature extraction module 321 performs feature extraction on the attack sample to generate a feature vector, it passes the feature vector as a state to the SQL detection module 322 for sample detection.
  • Step 460 The RL Evn 312 may evaluate the SQL attack injection mode output by the RL Agent 311 based on the detection result of the attack sample by the SQL detection module 322.
  • After the SQL detection module 322 detects the attack sample, it can feed back the detection result to the RL Evn 312, which uses it to evaluate the SQL attack injection method output by the RL Agent 311. For example, if the SQL detection module 322 identifies the attack sample, that is, the attack sample generated by the SQL attack injection method output by the RL Agent 311 does not bypass the SQL detection, the RL Evn 312 can feed back a punishment reinforcement signal to the RL Agent 311. For another example, if the SQL detection module 322 does not identify the attack sample, that is, the attack sample bypasses the SQL detection, the RL Evn 312 can feed back a reward reinforcement signal to the RL Agent 311.
  • The RL Agent 311 can adjust the action (SQL attack injection method) output in step 410 based on the evaluation. For example, if the SQL attack injection method output by the RL Agent 311 in step 410 earns a positive reward (immediate reward) from the RL Evn 312, the tendency of the RL Agent 311 to generate this action is strengthened. Conversely, the tendency of the RL Agent 311 to generate this action is weakened.
  • the RL Evn 312 may set a reward and punishment function based on the difficulty of identifying the attack sample by the SQL detection module 322.
  • In the reward and punishment function, the probability of identifying the attack sample as white (normal) may be the probability that the SQL detection module 322 does not identify the attack sample, and the probability of identifying the attack sample as black (attack) may be the probability that the SQL detection module 322 identifies the attack sample. That is, the smaller the probability that the SQL detection module 322 identifies an attack sample, the higher the probability that the attack sample generated by the SQL attack injection method output by the RL Agent 311 bypasses the SQL detection, and the greater the reward that the RL Evn 312 returns to the RL Agent 311.
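  • The reward and punishment function itself is not reproduced in this text. One function that has the stated property (the lower the detection probability, the larger the reward) is sketched below; the name `reward` and the linear form are assumptions, not the formula of the embodiment.

```python
def reward(p_detect: float) -> float:
    """Map the detector's attack probability to a reward in [-1, 1].

    p_detect is the probability that the SQL detection module identifies
    the sample as an attack. The linear form is an assumption.
    """
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must lie in [0, 1]")
    p_bypass = 1.0 - p_detect
    # Positive when a bypass is more likely than detection, negative otherwise.
    return p_bypass - p_detect
```

  • Under this sketch, a sample detected with certainty yields a reward of -1, and a sample that always bypasses detection yields +1.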
  • Steps 410-460 are performed iteratively until the adversarial reinforcement learning model of the first network device 310 and the second network device 320 converges. After the reinforcement learning model converges, an attack sample of an advanced bypass class can be obtained, and the attack sample can bypass the detection and identification of the SQL detection module 322 in the second network device 320 . It should be understood that the attack sample of the advanced bypass class here corresponds to the first attack sample above.
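  • The iteration of steps 410 to 460 can be sketched as a loop that stops once the agent bypasses the detector consistently. The sketch makes several simplifying assumptions: the agent is a stateless epsilon-greedy learner rather than a full reinforcement learning model, the reward is a fixed +1 or -1, convergence is approximated by a window of consecutive bypasses, and `toy_detector` and `MANIPULATIONS` are hypothetical stand-ins for the SQL detection module and the SQL manipulator.

```python
import random


def toy_detector(sample: str) -> bool:
    """Hypothetical detector that flags a space-delimited OR keyword."""
    return " or " in sample.lower()


# Hypothetical injection methods available to the agent.
MANIPULATIONS = {
    "none": lambda s: s,
    "insert_comment": lambda s: s.replace(" ", "/**/"),
    "upper_case": lambda s: s.upper(),
}


def run_until_converged(base: str, rounds: int = 200, window: int = 20,
                        epsilon: float = 0.1, seed: int = 0):
    """Iterate choose -> generate -> detect -> reward; return bypass samples."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in MANIPULATIONS}  # running mean reward per action
    count = {a: 0 for a in MANIPULATIONS}
    recent, bypass_samples = [], set()
    for _ in range(rounds):
        # Step 410: explore occasionally, otherwise pick the best-known action.
        if rng.random() < epsilon:
            action = rng.choice(list(MANIPULATIONS))
        else:
            action = max(value, key=value.get)
        sample = MANIPULATIONS[action](base)      # steps 420 and 430
        detected = toy_detector(sample)           # steps 440 and 450
        r = -1.0 if detected else 1.0             # step 460
        count[action] += 1
        value[action] += (r - value[action]) / count[action]
        if not detected:
            bypass_samples.add(sample)
        recent = (recent + [not detected])[-window:]
        if len(recent) == window and all(recent):
            break  # treated here as convergence
    return bypass_samples, value
```

  • The returned set plays the role of the advanced bypass attack samples that are later handed to the management device.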
  • Step 470 The management device 330 acquires the attack samples of the advanced bypass class, and trains the attack detection model based on the acquired attack samples of the advanced bypass class.
  • After the advanced bypass attack samples are obtained, they can be sent to the management device 330 in the cloud to update the attack detection model.
  • the management device 330 may include: sample management, SQL training, SQL model, and model validation.
  • sample management is used to continuously collect advanced bypass attack samples.
  • SQL training is used to iteratively train the attack detection model on continuously collected data samples.
  • The SQL model is used to record the information of the attack detection model after the SQL training process is completed.
  • Model validation is used to verify the availability of the newly generated attack detection model.
  • Step 480 the management device 330 deploys the updated attack detection model to the second network device 320 .
  • The management device 330 can deploy the updated attack detection model into the SQL detection module 322 of the second network device 320, so that the SQL detection module 322 can identify advanced bypass attack samples, thereby improving its ability to detect attack samples.
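  • The collect, train, validate, and deploy sequence of steps 470 and 480 can be sketched with a deliberately simple model. Everything here is hypothetical: `SignatureModel` is a toy set of character n-grams standing in for the attack detection model, and validation reuses the training samples, whereas a real system would validate on held-out data.

```python
from typing import Callable, Iterable, List, Set


def ngrams(text: str, n: int = 4) -> Set[str]:
    """Return the set of lowercase character n-grams of text."""
    t = text.lower()
    return {t[i:i + n] for i in range(len(t) - n + 1)}


class SignatureModel:
    """Toy detection model: n-grams seen in attacks but not in benign input."""

    def __init__(self, signatures: Iterable[str] = ()):
        self.signatures = set(signatures)

    def train(self, attacks: List[str], benign: List[str]) -> "SignatureModel":
        benign_grams = set().union(*(ngrams(s) for s in benign))
        updated = set(self.signatures)
        for s in attacks:
            updated |= ngrams(s) - benign_grams
        return SignatureModel(updated)

    def is_attack(self, sample: str) -> bool:
        return bool(ngrams(sample) & self.signatures)


def validate(model: SignatureModel, attacks: List[str],
             benign: List[str]) -> bool:
    """Accept the model only if it flags all attacks and no benign sample."""
    return (all(model.is_attack(s) for s in attacks)
            and not any(model.is_attack(s) for s in benign))


def update_and_deploy(current: SignatureModel, bypass_samples: List[str],
                      benign_samples: List[str],
                      deploy: Callable[[SignatureModel], None]) -> SignatureModel:
    """Train a candidate model and deploy it only if validation passes."""
    candidate = current.train(bypass_samples, benign_samples)
    if validate(candidate, bypass_samples, benign_samples):
        deploy(candidate)
        return candidate
    return current
```

  • Gating deployment on validation mirrors the role of the model validation component: a newly trained model replaces the deployed one only after its availability has been checked.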
  • The sequence numbers of the above-mentioned processes do not imply an execution order. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.
  • FIG. 5 is a schematic structural diagram of a first network device 500 provided by an embodiment of the present application.
  • the first network device 500 shown in FIG. 5 may perform the corresponding steps performed by the first network device in the methods of the foregoing embodiments.
  • The first network device 500 includes a sending module 510 and a receiving module 520, where:
  • the sending module 510 is configured to send the first attack sample to the second network device;
  • the receiving module 520 is configured to receive a first feedback message sent by the second network device, where the first feedback message includes a detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample; and
  • the sending module 510 is further configured to send the first attack sample to a management device according to the first feedback message, where the management device is configured to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets.
  • The sending module 510 is further configured to send a second attack sample to the second network device; the receiving module 520 is further configured to receive a second feedback message sent by the second network device, where the second feedback message includes a detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample.
  • The first network device 500 further includes a processing module 530, configured to adjust the second attack sample according to the second feedback message to obtain the first attack sample.
  • The first network device 500 includes a reinforcement learning agent (RL Agent) and a reinforcement learning environment (RL Evn). The processing module 530 is specifically configured so that the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and the reward and punishment function, and the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
  • The processing module 530 is further configured to obtain the second attack sample through any one of, or a combination of, the following attack methods: Unicode encoding, Base64 encoding, inserting comments, filling garbage data, and Offset replacement.
  • FIG. 6 is a schematic structural diagram of a second network device 600 provided by an embodiment of the present application.
  • the second network device 600 shown in FIG. 6 may perform the corresponding steps performed by the second network device in the methods of the foregoing embodiments.
  • The second network device 600 includes a receiving module 610, a detection module 620, and a sending module 630, where:
  • the receiving module 610 is configured to receive the first attack sample sent by the first network device;
  • the detection module 620 is configured to detect the first attack sample and obtain a detection result of the first attack sample, where the detection result of the first attack sample indicates that the first attack sample is a normal sample; and
  • the sending module 630 is configured to send a first feedback message to the first network device, where the first feedback message includes the detection result of the first attack sample by the second network device.
  • The detection module 620 is specifically configured to: perform feature extraction on the first attack sample to obtain a feature vector of the first attack sample; and detect the feature vector of the first attack sample to obtain the detection result of the first attack sample.
  • The receiving module 610 is further configured to receive a second attack sample sent by the first network device; the detection module 620 is further configured to detect the second attack sample to obtain a detection result of the second attack sample, where the detection result of the second attack sample indicates that the second attack sample is an attack sample; and the sending module 630 is further configured to send a second feedback message to the first network device, where the second feedback message includes the detection result of the second attack sample by the second network device.
  • Embodiments of the present application further provide a computer-readable medium, where program codes are stored in the computer-readable medium, and when the computer program codes are run on a computer, the computer executes the method performed by the first network device.
  • These computer-readable storage media include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drives.
  • Embodiments of the present application further provide a computer-readable medium, where program codes are stored in the computer-readable medium, and when the computer program codes are run on a computer, the computer executes the method performed by the second network device.
  • These computer-readable storage media include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drives.
  • An embodiment of the present application further provides a chip system, which is applied to the first network device. The chip system includes at least one processor, at least one memory, and an interface circuit, where the interface circuit is responsible for information interaction between the chip system and the outside world; the at least one memory, the interface circuit, and the at least one processor are interconnected by wires; and the at least one memory stores instructions that are executed by the at least one processor to perform the operations of the first network device in the methods of the above aspects.
  • The chip can be a central processing unit (CPU), a microcontroller unit (MCU), a microprocessing unit (MPU), a digital signal processor (DSP), a system on chip (SoC), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a programmable logic device (PLD).
  • An embodiment of the present application further provides a chip system, which is applied to a second network device.
  • The chip system includes at least one processor, at least one memory, and an interface circuit, where the interface circuit is responsible for information interaction between the chip system and the outside world; the at least one memory, the interface circuit, and the at least one processor are interconnected by wires; and the at least one memory stores instructions that are executed by the at least one processor to perform the operations of the second network device in the methods of the above aspects.
  • The chip can be a central processing unit (CPU), a microcontroller unit (MCU), a microprocessing unit (MPU), a digital signal processor (DSP), a system on chip (SoC), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a programmable logic device (PLD).
  • Embodiments of the present application further provide a computer program product, which is applied to a first network device. The computer program product includes a series of instructions that, when executed, perform the operations of the first network device in the methods described in the above aspects.
  • Embodiments of the present application further provide a computer program product, which is applied to a second network device. The computer program product includes a series of instructions that, when executed, perform the operations of the second network device in the methods described in the above aspects.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • The division of the units is only a division by logical function. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

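The present application provides an attack sample management method. The method includes: a first network device sends a first attack sample to a second network device; the first network device receives a first feedback message sent by the second network device, where the first feedback message includes a detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample; and the first network device sends the first attack sample to a management device according to the first feedback message, where the management device is configured to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets. The method can improve the ability of the attack detection model to detect attack behavior and lays a foundation for improving the security of the network environment.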

Description

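Attack sample management method and device
This application claims priority to Chinese Patent Application No. 202110411816.3, filed with the Chinese Patent Office on April 16, 2021 and entitled "Attack sample management method and device", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of artificial intelligence, and more specifically, to an attack sample management method and device.
Background
As networks grow in scale, the number of network attacks increases accordingly, and the confrontation between attack and defense becomes increasingly severe. Since the state began organizing the "network protection operation", some common attack methods and patterns have been covered by most security companies. Specifically, network attacks may be identified by an artificial intelligence (AI) detection model.
In a related technical solution, the training samples (also called attack samples) provided when training the AI detection model include only some conventional network attack behaviors and do not cover attack behaviors that bypass detection. As a result, the AI detection model trained in the related technical solution has limited ability to detect some advanced bypass attacks. In particular, for injection attacks on company websites, hackers can easily bypass traditional attack detection. How to better detect and respond to these attacks has therefore become a key competitive advantage for security products and security solutions.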
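Summary
The present application provides an attack sample management method and device. The method can improve the ability of an attack detection model to detect attack behavior and lays a foundation for improving the security of the network environment.
According to a first aspect, an attack sample management method is provided, including: a first network device sends a first attack sample to a second network device; the first network device receives a first feedback message sent by the second network device, where the first feedback message includes a detection result of the first attack sample by the second network device, and the detection result of the first attack sample indicates that the first attack sample is a normal sample; and the first network device sends the first attack sample to a management device according to the first feedback message, where the management device is configured to train an attack detection model according to the first attack sample, and the attack detection model is used by the second network device to identify attack packets.
In the above technical solution, the training sample of the attack detection model is the first attack sample, which the second network device identified as a normal sample. The attack detection model that is trained on the first attack sample and used by the second network device to identify attack packets can therefore identify some attack samples that bypass attack detection. This improves the model's ability to detect or identify attack packets and lays a foundation for improving the security of the network environment.
With reference to the first aspect, in some implementations of the first aspect, before the first network device sends the first attack sample to the second network device, the method further includes: the first network device sends a second attack sample to the second network device; the first network device receives a second feedback message sent by the second network device, where the second feedback message includes a detection result of the second attack sample by the second network device, and the detection result of the second attack sample indicates that the second attack sample is an attack sample; and the first network device adjusts the second attack sample according to the second feedback message to obtain the first attack sample.
With reference to the first aspect, in some implementations of the first aspect, the first network device includes a reinforcement learning agent (RL Agent) and a reinforcement learning environment (RL Evn); the RL Evn sends a punishment signal to the RL Agent according to the detection result of the second attack sample and a reward and punishment function; and the RL Agent adjusts the second attack sample according to the punishment signal to obtain the first attack sample.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: the first network device obtains the second attack sample through any one of, or a combination of, the following attack methods: Unicode encoding, Base64 encoding, inserting comments, filling garbage data, and Offset replacement.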
第二方面,提供了一种攻击样本管理的方法,包括:第二网络设备接收第一网络设备发送的第一攻击样本;所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
结合第二方面,在第二方面的某些实现方式中,所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;
所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
结合第二方面,在第二方面的某些实现方式中,在所述第二网络设备接收第一网络设备发送的第一攻击样本之前,所述方法还包括:所述第二网络设备接收所述第一网络设备发送的第二攻击样本;所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
第三方面,提供了一种攻击样本管理的方法,包括:第一网络设备向第二网络设备发送第一攻击样本;所述第一网络设备接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;所述第一网络设备根据所述第一反馈消息将所述第一攻击样本发送到管理设备;所述管理设备根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文;所述管理设备将所述攻击检测模型部署到所述第二网络设备中。
结合第三方面,在第三方面的某些实现方式中,在所述第一网络设备向第二网络设备发送第一攻击样本之前,所述方法还包括:所述第一网络设备向所述第二网络设备发送第二攻击样本;所述第一网络设备接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
结合第三方面,在第三方面的某些实现方式中,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RL Evn,所述RL Evn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;所述RL Agent根据所述惩戒信号对所述第 二攻击样本进行调整,得到所述第一攻击样本。
结合第三方面,在第三方面的某些实现方式中,所述方法还包括:所述第一网络设备通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
结合第三方面,在第三方面的某些实现方式中,所述方法还包括:所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本;所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果;所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
结合第三方面,在第三方面的某些实现方式中,所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
结合第三方面,在第三方面的某些实现方式中,在所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本之前,所述方法还包括:所述第二网络设备接收所述第一网络设备发送的第二攻击样本;所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
第四方面,提供了一种第一网络设备,包括:发送模块,接收模块,
发送模块,用于向第二网络设备发送第一攻击样本;
接收模块,用于接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
所述发送模块,还用于根据所述第一反馈消息将所述第一攻击样本发送到管理设备,所述管理设备用于根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文。
结合第四方面,在第四方面的某些实现方式中,所述发送模块,还用于向所述第二网络设备发送第二攻击样本;所述接收模块,还用于接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
所述第一网络设备还包括:处理模块,用于根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
结合第四方面,在第四方面的某些实现方式中,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RL Evn,所述处理模块具体用于:所述RL Evn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
结合第四方面,在第四方面的某些实现方式中,所述处理模块还用于:通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
第五方面,提供了一种第二网络设备,包括:接收模块,检测模块,发送模块,
接收模块,用于接收第一网络设备发送的第一攻击样本;
检测模块,用于对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
发送模块,用于向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
结合第五方面,在第五方面的某些实现方式中,所述检测模块具体用于:对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
结合第五方面,在第五方面的某些实现方式中,所述接收模块,还用于接收所述第一网络设备发送的第二攻击样本;所述检测模块,还用于对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;发送模块,还用于向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
第六方面,提供了一种攻击样本管理的系统,包括:发送模块,接收模块,处理模块,
发送模块,用于第一网络设备向第二网络设备发送第一攻击样本;
接收模块,用于所述第一网络设备接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
所述发送模块,还用于所述第一网络设备根据所述第一反馈消息将所述第一攻击样本发送到管理设备;
处理模块,用于所述管理设备根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文;
所述处理模块,还用于所述管理设备将所述攻击检测模型部署到所述第二网络设备中。
结合第六方面,在第六方面的某些实现方式中,所述发送模块,还用于所述第一网络设备向所述第二网络设备发送第二攻击样本;所述接收模块,还用于所述第一网络设备接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;所述处理模块,还用于所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
结合第六方面,在第六方面的某些实现方式中,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RL Evn,所述处理模块具体用于:所述RL Evn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
结合第六方面,在第六方面的某些实现方式中,所述处理模块还用于:所述第一网络设备通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
结合第六方面,在第六方面的某些实现方式中,所述接收模块,还用于所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本;所述处理模块,还用于所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果;所述发送模块,还用于所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
结合第六方面,在第六方面的某些实现方式中,所述处理模块具体用于:所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
结合第六方面,在第六方面的某些实现方式中,所述接收模块,还用于所述第二网络设备接收所述第一网络设备发送的第二攻击样本;所述处理模块,还用于所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;所述发送模块,还用于所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
第七方面,提供了一种第一网络设备,该第一网络设备包括处理器、存储器、接口和总线。其中接口可以通过无线或有线的方式实现,具体来讲可以是网卡。上述处理器、存储器和接口通过总线连接。
该接口具体可以包括发送器和接收器,用于第一网络设备实现上述收发。
该处理器用于执行上述实施例中由第一网络设备进行的处理。存储器包括操作系统和应用程序,用于存储程序、代码或指令,当处理器或硬件设备执行这些程序、代码或指令时可以完成方法实施例中涉及第一网络设备的处理过程。可选的,该存储器可以包括只读存储器(read-only memory,ROM)和随机存取存储器(random access memory,RAM)。其中,该ROM包括基本输入/输出系统(basic input/output system,BIOS)或嵌入式系统;该RAM包括应用程序和操作系统。当需要运行第一网络设备时,通过固化在ROM中的BIOS或者嵌入式系统中的bootloader引导系统进行启动,引导第一网络设备进入正常运行状态。在第一网络设备进入正常运行状态后,运行在RAM中的应用程序和操作系统,从而,完成上述第一方面以及任一可能的实现方式中的方法实施例中涉及第一网络设备的处理过程。
可以理解的是,在实际应用中,第一网络设备可以包含任意数量的接口,处理器或者存储器。
第八方面,提供了一种第二网络设备,该第二网络设备包括处理器、存储器、接口和总线。其中接口可以通过无线或有线的方式实现,具体来讲可以是网卡。上述处理器、存储器和接口通过总线连接。
该接口具体可以包括发送器和接收器,用于第二网络设备实现上述收发。
该处理器用于执行上述实施例中由第二网络设备进行的处理。存储器包括操作系统和应用程序,用于存储程序、代码或指令,当处理器或硬件设备执行这些程序、代码或指令时可以完成方法实施例中涉及第二网络设备的处理过程。可选的,该存储器可以包括只读存储器(read-only memory,ROM)和随机存取存储器(random access memory,RAM)。其中,该ROM包括基本输入/输出系统(basic input/output system,BIOS)或嵌入式系统;该RAM包括应用程序和操作系统。当需要运行第二网络设备时,通过固化在ROM中的BIOS或者嵌入式系统中的bootloader引导系统进行启动,引导第二网络设备进入正常运行状态。在第二网络设备进入正常运行状态后,运行在RAM中的应用程序和操作系统,从而,完成上述第二方面以及任一可能的实现方式中的方法实施例中涉及第二网络设备的处理过程。
可以理解的是,在实际应用中,第二网络设备可以包含任意数量的接口,处理器或者 存储器。
第九方面,提供了一种计算机程序产品,该计算机程序产品包括:计算机程序代码,当该计算机程序代码在第一网络设备上运行时,使得第一网络设备执行上述第一方面或第一方面的任一种可能执行的方法。
第十方面,提供了一种计算机程序产品,该计算机程序产品包括:计算机程序代码,当该计算机程序代码在第二网络设备上运行时,使得第二网络设备执行上述第二方面或第二方面的任一种可能执行的方法。
第十一方面,提供了一种计算机可读介质,该计算机可读介质存储有程序代码,当该计算机程序代码在第一网络设备上运行时,使得第一网络设备执行上述第一方面或第一方面的任一种可能执行的方法。这些计算机可读存储包括但不限于如下的一个或者多个:只读存储器(read-only memory,ROM)、可编程ROM(programmable ROM,PROM)、可擦除的PROM(erasable PROM,EPROM)、Flash存储器、电EPROM(electrically EPROM,EEPROM)以及硬盘驱动器(hard drive)。
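According to a twelfth aspect, a computer-readable medium is provided. The computer-readable medium stores program code, and when the computer program code runs on a second network device, the second network device is caused to perform the method in the second aspect or any possible implementation of the second aspect. The computer-readable storage includes but is not limited to one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically EPROM (EEPROM), and a hard drive.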
第十三方面,提供一种芯片,该芯片包括处理器与数据接口,其中,处理器通过该数据接口读取存储器上存储的指令,以执行第一方面或第一方面任意一种可能的实现方式中的方法。在具体实现过程中,该芯片可以以中央处理器(central processing unit,CPU)、微控制器(micro controller unit,MCU)、微处理器(micro processing unit,MPU)、数字信号处理器(digital signal processing,DSP)、片上系统(system on chip,SoC)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或可编辑逻辑器件(programmable logic device,PLD)的形式实现。
第十四方面,提供一种芯片,该芯片包括处理器与数据接口,其中,处理器通过该数据接口读取存储器上存储的指令,以执行第二方面或第二方面任意一种可能的实现方式中的方法。在具体实现过程中,该芯片可以以中央处理器(central processing unit,CPU)、微控制器(micro controller unit,MCU)、微处理器(micro processing unit,MPU)、数字信号处理器(digital signal processing,DSP)、片上系统(system on chip,SoC)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或可编辑逻辑器件(programmable logic device,PLD)的形式实现。
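According to a fifteenth aspect, a system for managing attack samples is provided, including: a management device, the first network device according to the fourth aspect or any possible implementation of the fourth aspect, and the second network device according to the fifth aspect or any possible implementation of the fifth aspect.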
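Brief Description of Drawings
FIG. 1 is a schematic architectural diagram of a first network device 100 according to an embodiment of this application.
FIG. 2 is a schematic flowchart of a method for managing attack samples according to an embodiment of this application.
FIG. 3 is a schematic block diagram of a system architecture 300 for managing attack samples according to an embodiment of this application.
FIG. 4 is a schematic flowchart of another method for managing attack samples according to an embodiment of this application.
FIG. 5 is a schematic structural diagram of a first network device 500 according to an embodiment of this application.
FIG. 6 is a schematic structural diagram of a second network device 600 according to an embodiment of this application.
Detailed Description
The following describes the technical solutions in this application with reference to the accompanying drawings.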
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:包括单独存在A,同时存在A和B,以及单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
随着网络规模的日益扩大,网络攻击数量也随之增多,攻防对抗也愈发严峻。从国家组织“护网行动”以来,一些常见的攻击方法和模式已经被大多数安全公司所覆盖。但是一些高级绕过类攻击的检测能力还很有限,尤其是针对公司网站的注入类攻击,黑客很容易绕过传统的防攻击检测。因此,如何更好的检测和响应这些攻击已经成为安全产品或者安全解决方案的竞争力制高点。
有鉴于此,本申请实施例提供了一种攻击样本管理的方法,该方法能够提高攻击检测模型对攻击行为的检测能力,为提高网络环境的安全性奠定了基础。
本申请实施例提供的攻击样本管理的方法可应用于计算设备,该计算设备也可以被称为计算机系统,包括硬件层、运行在硬件层之上的操作系统层,以及运行在操作系统层上的应用层。该硬件层包括处理单元、内存和内存控制单元等硬件,随后对该硬件的功能和结构进行详细说明。该操作系统是任意一种或多种通过进程(process)实现业务处理的计算机操作系统,例如,Linux操作系统、Unix操作系统、Android操作系统、iOS操作系统或windows操作系统等。该应用层包含浏览器、通讯录、文字处理软件、即时通信软件等应用程序。并且,可选地,该计算机系统是智能手机等手持设备,或个人计算机等终端设备,本申请并未特别限定,只要能够通过本申请实施例提供的方法即可。本申请实施例提供的攻击样本管理的方法的执行主体可以是计算机系统,或者,是计算机系统中能够调用程序并执行程序的功能模块。
作为示例,上述执行攻击样本管理的方法的计算设备也可以称为第一网络设备,该第一网络设备例如可以是突破与攻击模拟(breach and attack simulation,BAS)。
下面结合图1,对本申请实施例提供的一种第一网络设备的架构进行详细描述。
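FIG. 1 is a schematic architectural diagram of a first network device 100 according to an embodiment of this application. The first network device 100 may be a server, a computer, or another device with computing capability. The first network device 100 shown in FIG. 1 includes at least one processor 110 and a memory 120.
The processor 110 executes instructions in the memory 120, so that the first network device 100 implements the method for managing attack samples provided in this application.
Optionally, the first network device 100 further includes a system bus, to which the processor 110 and the memory 120 are each connected. The processor 110 can access the memory 120 through the system bus; for example, it can read and write data or execute code in the memory 120 through the system bus. The system bus is a peripheral component interconnect express (PCIe) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus is divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 1, but this does not mean that there is only one bus or one type of bus.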
一种可能的实现方式,处理器110的功能主要是解释计算机程序的指令(或者说,代码)以及处理计算机软件中的数据。其中,该计算机程序的指令以及计算机软件中的数据能够保存在内存120或者缓存116中。
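Optionally, the processor 110 may be an integrated circuit chip with signal processing capability. By way of example and not limitation, the processor 110 is a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor is a microprocessor or the like. For example, the processor 110 is a central processing unit (CPU).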
可选地,每个处理器110包括至少一个处理单元112和内存控制单元114。
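Optionally, the processing unit 112, also called a core, is the most important component of the processor. The processing unit 112 is manufactured from monocrystalline silicon using a particular production process, and all of the processor's computation, command reception, command storage, and data processing are performed by the cores. The processing units run program instructions independently of one another and use parallel computing to speed up program execution. Each kind of processing unit has a fixed logical structure; for example, a processing unit includes logic units such as a level-1 cache, a level-2 cache, an execution unit, an instruction-level unit, and a bus interface.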
一种实现举例,内存控制单元114用于控制内存120与处理单元112之间的数据交互。具体地说,内存控制单元114从处理单元112接收内存访问请求,并基于该内存访问请求控制针对内存的访问。作为示例而非限定,内存控制单元是内存管理单元(memory management unit,MMU)等器件。
一种实现举例,各内存控制单元114通过系统总线进行针对内存120的寻址。并且在系统总线中配置仲裁器(图中未示出),该仲裁器负责处理和协调多个处理单元112的竞争访问。
一种实现举例,处理单元112和内存控制单元114通过芯片内部的连接线,例如地址线,通信连接,从而实现处理单元112和内存控制单元114之间的通信。
可选地,每个处理器110还包括缓存116,其中,缓存是数据交换的缓冲区(称作cache)。当处理单元112要读取数据时,会首先从缓存中查找需要的数据,如果找到了则直接执行,找不到的话则从内存中找。由于缓存的运行速度比内存快得多,故缓存的作用就是帮助处理单元112更快地运行。
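The memory 120 can provide running space for processes in the first network device 100. For example, the memory 120 stores the computer program (specifically, the program code) used to create a process. After the processor runs the computer program and creates a process, the processor allocates corresponding storage space for that process in the memory 120. This storage space further includes a text segment, an initialized data segment, an uninitialized data segment, a stack segment, a heap segment, and so on. The memory 120 stores, in the storage space corresponding to the process, data generated while the process runs, for example, intermediate data or process data.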
可选地,内存也称为内存储器,其作用是用于暂时存放处理器110中的运算数据,以及与硬盘等外部存储器交换的数据。只要计算机在运行中,处理器110就会把需要运算的数据调到内存中进行运算,当运算完成后处理单元112再将结果传送出来。
作为示例而非限定,内存120是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的内存120旨在包括但不限于这些和任意其它适合类型的存储器。
以上列举的第一网络设备100的结构仅为示例性说明,本申请并未限定于此,本申请实施例的第一网络设备100包括现有技术中计算机系统中的各种硬件,例如,第一网络设备100还包括除内存120以外的其他存储器,例如,磁盘存储器等。本领域的技术人员应当理解,第一网络设备100还可以包括实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当理解,上述第一网络设备100还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,上述第一网络设备100也可仅仅包括实现本申请实施例所必须的器件,而不必包括图1中所示的全部器件。
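FIG. 2 is a schematic flowchart of a method for managing attack samples according to an embodiment of this application. The method may be performed by the first network device 100 shown in FIG. 1. As shown in FIG. 2, the method may include steps 210 to 230, which are described in detail below.
Step 210: The first network device sends a first attack sample to a second network device.
The first network device may be used to generate the first attack sample. For example, the first network device may be a breach and attack simulation (BAS) system.
The second network device may be used to detect attack samples, that is, to identify whether an attack sample is a normal sample. For example, the second network device may be any one of the following: a firewall (FW), a web application firewall (WAF), a situation awareness (SA) product, and so on.
It should be understood that the hardware architecture of the second network device is similar to that of the first network device. For details, see the description of the hardware architecture of the first network device 100 in FIG. 1; they are not repeated here.
Step 220: The first network device receives a first feedback message sent by the second network device.
As an example, in this embodiment of this application, after receiving the first attack sample sent by the first network device, the second network device may check whether the first attack sample is a normal sample. Specifically, for example, the second network device may perform feature extraction on the first attack sample to obtain its feature vector, and then classify that feature vector to determine whether the first attack sample is a normal sample.
In this application, the detection result that the second network device obtains for the first attack sample indicates that the first attack sample is a normal sample. The second network device may generate a first feedback message based on this detection result, where the first feedback message includes the detection result of the first attack sample, and send the first feedback message to the first network device.
Step 230: The first network device sends the first attack sample to a management device according to the first feedback message.
After receiving the first feedback message sent by the second network device, the first network device sends the first attack sample to the management device, because the detection result included in the first feedback message indicates that the first attack sample is a normal sample.
After obtaining the first attack sample, the management device may train an attack detection model based on it, and the attack detection model is used by the second network device to identify attack packets. Specifically, the management device may use the first attack sample as model input to train the attack detection model.
For example, the management device may be a cloud-based back-end upgrade system, typically an enterprise-internal system dedicated to data management, model upgrades, and the like. It should be understood that the hardware architecture of the management device is similar to that of the first network device. For details, see the description of the hardware architecture of the first network device 100 in FIG. 1; they are not repeated here.
In the foregoing technical solution, the training samples of the attack detection model are first attack samples that the second network device identified as normal samples. The attack detection model trained on these samples, which the second network device uses to identify attack packets, can therefore identify attack samples that bypass attack detection. This improves the model's ability to detect or identify attack packets and lays a foundation for improving the security of the network environment.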
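Steps 210 to 230 can be sketched from the first network device's point of view as follows. This is a minimal illustration only: the `Feedback` type and the two callables standing in for the transport to the second network device and to the management device are hypothetical names, since the application does not define a message format or an API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Feedback:
    """Hypothetical feedback message carrying the detection result."""
    sample: str
    is_attack: bool  # True: detected as an attack; False: judged normal


def run_steps_210_to_230(
    first_attack_sample: str,
    send_to_second_device: Callable[[str], Feedback],
    send_to_management_device: Callable[[str], None],
) -> bool:
    """Return True if the sample evaded detection and was forwarded."""
    # Steps 210 and 220: send the sample, then receive the feedback message.
    feedback = send_to_second_device(first_attack_sample)
    # Step 230: forward only samples that the detector judged to be normal.
    if not feedback.is_attack:
        send_to_management_device(first_attack_sample)
        return True
    return False
```

In this sketch a sample that the detector flags is simply not forwarded; adjusting such a sample and retrying corresponds to the second-attack-sample implementations described in the summary.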
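The following describes in detail, with reference to FIG. 3, a system architecture for managing attack samples that is applicable to the embodiments of this application.
FIG. 3 is a schematic block diagram of a system architecture 300 for managing attack samples according to an embodiment of this application. As shown in FIG. 3, the system architecture 300 may include a first network device 310, a second network device 320, and a management device 330. The functions of each device are described in detail below.
1. First network device 310
The first network device 310 may also be called an automatic attack sample generation unit. It is mainly responsible for generating attack samples and can be understood as the attacker model that produces attack samples. As an example, the first network device 310 may include a reinforcement learning agent (RL Agent) 311, an RL environment (RL Evn) 312, and a structured query language manipulator (SQL manipulator) 313.
It should be understood that RL treats learning as a process of trial and evaluation. The RL Agent 311 learns by trial and error: it selects an action and applies it to the RL Evn 312. After accepting the action, the RL Evn 312 changes state and produces a reinforcement signal (a reward or a penalty) that is fed back to the RL Agent 311. The RL Agent 311 then selects the next action according to the reinforcement signal and the current state of the environment. The selection principle is to increase the probability of receiving positive reinforcement (a reward), that is, to let the RL Agent 311 obtain the maximum reward.
For example, in this embodiment of this application, the action that the RL Agent 311 applies to the RL Evn 312 is an injection technique for SQL attack samples, and the agent adjusts the action it outputs based on the evaluation fed back by the RL Evn 312 (usually a reward or penalty reinforcement signal). The SQL manipulator 313 generates attack samples based on the injection technique. The RL Evn 312 evaluates how good the action produced by the RL Agent 311 is, according to the detection result of the attack sample.
2. Second network device 320
The second network device 320 may also be called an AI detection unit. It is mainly responsible for checking the samples generated by the first network device 310, determining whether each sample is an attack sample, and notifying the RL Evn 312 of the detection result. The second network device 320 can also be understood as the defender model that detects and identifies attack samples. As an example, the second network device 320 may include a feature extraction module 321 and an SQL detection module 322.
For example, in this embodiment of this application, the feature extraction module 321 obtains the generated attack sample from the SQL manipulator 313 and performs feature extraction on it to generate a feature vector. The SQL detection module 322 determines, based on the feature vector, whether the corresponding sample is an attack sample.
3. Management device 330
The management device 330 may also be called a detection model update unit. Specifically, the management device 330 on the one hand implements the function of a training module, which trains the detection model on the collected attack samples to obtain the attack detection model described above. On the other hand, it implements the function of an update module, which deploys the attack detection model to the SQL detection module 322 of the second network device 320.
The following uses the system architecture 300 shown in FIG. 3 as an example and describes in detail, with reference to FIG. 4, a specific implementation of another method for managing attack samples according to an embodiment of this application. It should be understood that the example in FIG. 4 is intended only to help a person skilled in the art understand the embodiments of this application, not to limit the embodiments to the specific values or scenarios shown. A person skilled in the art can clearly make various equivalent modifications or changes based on the example in FIG. 4, and such modifications and changes also fall within the scope of the embodiments of this application.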
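FIG. 4 is a schematic flowchart of another method for managing attack samples according to an embodiment of this application. The method may be performed by the system architecture 300 shown in FIG. 3. As shown in FIG. 4, the method may include steps 410 to 480, which are described in detail below.
Step 410: The RL Agent 311 outputs an SQL attack injection technique to the RL Evn 312.
It should be understood that the SQL attack injection technique that the RL Agent 311 outputs to the RL Evn 312 may also be called an obfuscation bypass technique; attack samples are generated according to it.
As an example, common SQL attack injection techniques may include but are not limited to: Unicode encoding, Base64 encoding, comment insertion, junk data padding, and Offset replacement. Each of these techniques is explained below.
Unicode encoding: keywords in the SQL statement are randomly selected, Unicode-encoded, and replaced, 1 to 3 times. For example, select * from users can be converted to selectu0020*u0020fromu0020users, where each space is replaced with u0020.
Base64 encoding: keywords in the SQL statement are randomly selected, Base64-encoded, and replaced, 1 to 3 times. For example, 1' AND '1'='1 can be replaced with MScgQU5EICcxJz0nMQ==.
Comment insertion: a character is randomly selected and /*XXX*/ is appended after it; the comment content can be random.
Junk data padding: a run of identical characters of random length is added before the &. For example, aaaaaa&id=1' and '1'='1.
Offset replacement: commas are randomly selected and converted to offset, 1 to 3 times.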
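The five injection techniques above can be sketched as small string transformations. The function names are hypothetical and several details are assumptions, not taken from the application: the Unicode helper encodes only spaces and writes them as `\u0020` with a leading backslash (the example in the text shows `u0020` without one), and "Offset replacement" is read here as rewriting the MySQL form `LIMIT m,n` into the equivalent `LIMIT n OFFSET m`.

```python
import base64
import random
import re


def unicode_encode_spaces(sql: str, escape: str = "\\u0020") -> str:
    """Replace every space with an escape for U+0020 (escape form is assumed)."""
    return sql.replace(" ", escape)


def base64_encode(fragment: str) -> str:
    """Base64-encode a fragment of the payload."""
    return base64.b64encode(fragment.encode("utf-8")).decode("ascii")


def insert_comment(sql: str, rng: random.Random, body: str = "XXX") -> str:
    """Append /*body*/ after one randomly chosen character."""
    if not sql:
        return sql
    pos = rng.randrange(len(sql)) + 1  # index just after the chosen character
    return f"{sql[:pos]}/*{body}*/{sql[pos:]}"


def pad_with_junk(query: str, rng: random.Random, char: str = "a",
                  max_len: int = 16) -> str:
    """Prefix the query with a random-length run of one character and '&'."""
    return char * rng.randint(1, max_len) + "&" + query


_LIMIT = re.compile(r"(limit)\s+(\d+)\s*,\s*(\d+)", re.IGNORECASE)


def comma_to_offset(sql: str) -> str:
    """Assumed reading: rewrite 'LIMIT m,n' as 'LIMIT n offset m' (first match)."""
    return _LIMIT.sub(r"\1 \3 offset \2", sql, count=1)
```

For instance, `base64_encode("1' AND '1'='1")` reproduces the Base64 string given in the text. Passing an explicit `random.Random` instance keeps the randomized helpers reproducible.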
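Step 420: The RL Evn 312 sends the SQL attack injection technique to the SQL manipulator 313.
Step 430: The SQL manipulator 313 generates an attack sample according to the SQL attack injection technique.
The SQL manipulator 313 may generate an attack sample according to the SQL attack injection technique sent by the RL Evn 312. For example, if the injection technique is Unicode encoding, the SQL manipulator 313 may convert the SQL statement "select * from users" into the attack sample "selectu0020*u0020fromu0020users", where each space is replaced with u0020.
Step 440: The feature extraction module 321 obtains the attack sample generated by the SQL manipulator 313 and performs feature extraction on it to generate a feature vector.
After generating the attack sample, the SQL manipulator 313 may send it to the feature extraction module 321, so that the feature extraction module 321 can perform feature extraction on it and generate a feature vector. For example, the feature vector mainly contains the proportions of functions, dangerous functions, spaces, dangerous characters, punctuation, and so on, as well as byte n-gram features.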
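A feature extractor along these lines might look like the sketch below. The application names the feature families but not their definitions, so everything concrete here is an assumption: the dangerous-function and dangerous-character lists, the choice to measure each proportion as characters over sample length, and the reporting of only the most frequent byte n-grams.

```python
import re
import string
from collections import Counter
from typing import Dict

# Illustrative lists only; the application does not enumerate them.
DANGEROUS_FUNCTIONS = ("sleep", "benchmark", "load_file", "extractvalue", "updatexml")
DANGEROUS_CHARS = "'\"#;-"

# An identifier immediately followed by "(" is treated as a function call.
FUNC_CALL = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def extract_features(sample: str, n: int = 2, top_k: int = 3) -> Dict[str, float]:
    """Map a sample to ratio features plus its top_k byte n-gram frequencies."""
    length = max(len(sample), 1)  # avoid division by zero on empty input
    calls = [name.lower() for name in FUNC_CALL.findall(sample)]
    dangerous = [name for name in calls if name in DANGEROUS_FUNCTIONS]
    features = {
        "function_ratio": sum(map(len, calls)) / length,
        "dangerous_function_ratio": sum(map(len, dangerous)) / length,
        "space_ratio": sample.count(" ") / length,
        "dangerous_char_ratio": sum(ch in DANGEROUS_CHARS for ch in sample) / length,
        "punctuation_ratio": sum(ch in string.punctuation for ch in sample) / length,
    }
    data = sample.encode("utf-8")
    grams = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(grams.values())
    for gram, count in grams.most_common(top_k):
        features["ngram_" + gram.hex()] = count / total
    return features
```

A real detector would need a fixed-length vector, for example by hashing n-grams into a fixed number of buckets; the dictionary form is used here only to keep the features readable.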
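Step 450: The feature extraction module 321 passes the extracted feature vector, as the state, to the SQL detection module 322 for sample detection.
After performing feature extraction on the attack sample and generating the feature vector, the feature extraction module 321 passes the feature vector, as the state, to the SQL detection module 322 for sample detection.
Step 460: The RL Evn 312 may evaluate the SQL attack injection technique output by the RL Agent 311 based on the SQL detection module 322's detection result for the attack sample.
After checking the attack sample, the SQL detection module 322 may feed the detection result back to the RL Evn 312, so that the RL Evn 312 can evaluate the SQL attack injection technique output by the RL Agent 311. For example, if the SQL detection module 322 recognizes the attack sample, that is, the attack sample generated by the injection technique that the RL Agent 311 output did not bypass SQL detection, the RL Evn 312 may feed a penalty reinforcement signal back to the RL Agent 311. Conversely, if the SQL detection module 322 does not recognize the attack sample, that is, the attack sample generated by the injection technique bypassed SQL detection, the RL Evn 312 may feed a reward reinforcement signal back to the RL Agent 311.
After receiving the evaluation fed back by the RL Evn 312, the RL Agent 311 may adjust the action output in step 410 (the SQL attack injection technique) based on that evaluation. For example, if the injection technique output by the RL Agent 311 in step 410 leads to a positive reward (immediate payoff) from the RL Evn 312, the RL Agent 311's tendency to produce this action is strengthened. Otherwise, its tendency to produce this action is weakened.
Specifically, as an example, the RL Evn 312 may set the reward and penalty function based on how difficult it is for the SQL detection module 322 to identify the attack sample. For example, the function may use a two-class softmax: R=α*(softmax_0-softmax_1), where softmax_0 is the probability that the SQL detection module 322 identifies the attack sample as white, and softmax_1 is the probability that it identifies the attack sample as black. The whiter the SQL detection module 322 judges the attack sample to be, the larger the reward that the RL Evn 312 gives the RL Agent 311.
It should be understood that the probability of identifying the attack sample as white is the probability that the SQL detection module 322 fails to recognize the attack sample (that is, judges it to be a normal sample), and the probability of identifying it as black is the probability that the SQL detection module 322 recognizes the attack sample. In other words, the smaller the probability that the SQL detection module 322 recognizes the attack sample, the larger the probability that the attack sample generated by the injection technique output by the RL Agent 311 bypasses SQL detection, and the larger the reward that the RL Evn 312 gives the RL Agent 311.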
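The reward and penalty function R=α*(softmax_0-softmax_1) can be computed as in the following sketch. It assumes that the detector exposes two raw scores (logits) in the order white, black, and that α is a positive scaling factor; the application states neither, so both are labeled assumptions.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(logits: Sequence[float], alpha: float = 1.0) -> float:
    """R = alpha * (softmax_0 - softmax_1).

    Index 0 is assumed to be the white (normal) class and index 1 the
    black (attack) class, so R > 0 means the sample looked normal.
    """
    p_white, p_black = softmax(logits)
    return alpha * (p_white - p_black)
```

With α = 1 the reward lies in the interval from -1 to 1: it is positive when the detector leans toward "normal" (the sample evaded detection) and negative when it leans toward "attack".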
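Steps 410 to 460 are performed iteratively until the reinforcement learning model in which the first network device 310 and the second network device 320 compete against each other converges. After the model converges, advanced bypass attack samples can be obtained; these samples can bypass detection and identification by the SQL detection module 322 in the second network device 320. It should be understood that the advanced bypass attack samples here correspond to the first attack sample described above.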
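The iteration over steps 410 to 460 can be illustrated with a toy loop. The application does not specify the RL algorithm, so this sketch substitutes a simple epsilon-greedy value table over the five injection techniques, and the detector is a stub callable that returns the probability of "black" for a given technique. The names `ACTIONS`, `train_agent`, and `detect_p_black` are hypothetical.

```python
import random
from typing import Callable, Dict

ACTIONS = ["unicode", "base64", "comment", "junk_padding", "offset"]


def train_agent(
    detect_p_black: Callable[[str], float],
    episodes: int = 200,
    epsilon: float = 0.2,
    lr: float = 0.1,
    alpha: float = 1.0,
    seed: int = 0,
) -> Dict[str, float]:
    """Learn a value per injection technique from reward and penalty signals."""
    rng = random.Random(seed)
    q = {action: 0.0 for action in ACTIONS}
    for _ in range(episodes):
        # Step 410: the agent picks a technique (explore or exploit).
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=q.get)
        # Steps 420-450 are collapsed into the stub detector call.
        p_black = detect_p_black(action)
        # Step 460: R = alpha * (p_white - p_black) is fed back to the agent.
        r = alpha * ((1.0 - p_black) - p_black)
        q[action] += lr * (r - q[action])
    return q
```

In this toy setting a technique that the stub detector rarely flags accumulates a positive value and is chosen more often, which mirrors the strengthening and weakening of action tendencies described for the RL Agent 311.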
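Step 470: The management device 330 obtains the advanced bypass attack samples and trains the attack detection model on them.
After the advanced bypass attack samples have been filtered out through the contest between the first network device 310 and the second network device 320, these samples can be sent to the cloud-based management device 330 to update the attack detection model.
Specifically, as an example, the management device 330 may include sample management, SQL training, an SQL model, and model validation. Sample management continuously collects advanced bypass attack samples. SQL training iteratively trains the attack detection model on the continuously collected samples. The SQL model records information about the attack detection model after the SQL training process completes. Model validation performs availability and other checks on the newly generated attack detection model.
Step 480: The management device 330 deploys the updated attack detection model to the second network device 320.
As an example, the management device 330 may deploy the updated attack detection model to the SQL detection module 322 of the second network device 320, so that the SQL detection module 322 can recognize advanced bypass attack samples, which improves its ability to detect attack samples.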
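The collect, train, validate, and deploy responsibilities of the management device 330 can be outlined as below. This is a structural sketch only: the class name, method names, and the injected `train`, `validate`, and `install` callables are hypothetical, and no particular model type or training library is implied by the application.

```python
from typing import Any, Callable, List, Optional, Set


class ManagementDevice:
    """Toy outline of sample management, training, validation, deployment."""

    def __init__(self, train: Callable[[List[str]], Any],
                 validate: Callable[[Any], bool]) -> None:
        self._samples: List[str] = []
        self._seen: Set[str] = set()
        self._train = train
        self._validate = validate
        self.model: Optional[Any] = None  # last model that passed validation

    def collect(self, sample: str) -> bool:
        """Sample management: store a bypass sample once; True if it was new."""
        if sample in self._seen:
            return False
        self._seen.add(sample)
        self._samples.append(sample)
        return True

    def retrain(self) -> Optional[Any]:
        """Training plus validation: keep the candidate only if it passes."""
        candidate = self._train(list(self._samples))
        if self._validate(candidate):
            self.model = candidate
            return candidate
        return None

    def deploy(self, install: Callable[[Any], None]) -> bool:
        """Step 480: push the validated model to the detection module."""
        if self.model is None:
            return False
        install(self.model)
        return True
```

Keeping training and validation as injected callables means that a candidate model which fails validation never replaces the one already deployed.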
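In the foregoing technical solution, the perspective shifts from defense to attack. Rather than focusing only on resisting attacks, the solution uses the injection attack techniques already summarized above and runs a reinforcement learning contest between an attacker and a defender to obtain high-value adversarial samples that bypass the AI detection algorithm. Updating the model with these adversarial samples improves its ability to detect attack packets.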
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
上文结合图1至图4,详细描述了本申请实施例提供的方法以及设备的硬件架构,下面将结合图5至图6,详细描述本申请的装置的实施例。应理解,方法实施例的描述与装置实施例的描述相互对应,因此,未详细描述的部分可以参见前面方法实施例。
图5是本申请实施例提供的一种第一网络设备500的示意性结构图。图5所示的该第一网络设备500可以执行上述实施例的方法中第一网络设备执行的相应步骤。如图5所示,所述第一网络设备500包括:发送模块510,接收模块520,
发送模块510,用于向第二网络设备发送第一攻击样本;
接收模块520,用于接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
所述发送模块510,还用于根据所述第一反馈消息将所述第一攻击样本发送到管理设备,所述管理设备用于根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文。
可选地,所述发送模块510,还用于向所述第二网络设备发送第二攻击样本;所述接收模块520,还用于接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
所述第一网络设备500还包括:处理模块530,用于根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
可选地,所述第一网络设备500中包括强化学习智能体RL Agent和强化学习环境RL Evn,所述处理模块530具体用于:所述RL Evn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
可选地,所述处理模块530还用于:通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
图6是本申请实施例提供的一种第二网络设备600的示意性结构图。图6所示的该第二网络设备600可以执行上述实施例的方法中第二网络设备执行的相应步骤。如图6所示,所述第二网络设备600包括:接收模块610,检测模块620,发送模块630,
接收模块610,用于接收第一网络设备发送的第一攻击样本;
检测模块620,用于对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
发送模块630,用于向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
可选地,所述检测模块620具体用于:对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
可选地,所述接收模块610,还用于接收所述第一网络设备发送的第二攻击样本;所述检测模块620,还用于对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;发送模块630,还用于向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
本申请实施例还提供了一种计算机可读介质,该计算机可读介质存储有程序代码,当该计算机程序代码在计算机上运行时,使得计算机执行上述第一网络设备执行的方法。这些计算机可读存储包括但不限于如下的一个或者多个:只读存储器(read-only memory,ROM)、可编程ROM(programmable ROM,PROM)、可擦除的PROM(erasable PROM,EPROM)、Flash存储器、电EPROM(electrically EPROM,EEPROM)以及硬盘驱动器(hard drive)。
本申请实施例还提供了一种计算机可读介质,该计算机可读介质存储有程序代码,当该计算机程序代码在计算机上运行时,使得计算机执行上述第二网络设备执行的方法。这些计算机可读存储包括但不限于如下的一个或者多个:只读存储器(read-only memory,ROM)、可编程ROM(programmable ROM,PROM)、可擦除的PROM(erasable PROM,EPROM)、Flash存储器、电EPROM(electrically EPROM,EEPROM)以及硬盘驱动器(hard drive)。
本申请实施例还提供了一种芯片系统,应用于第一网络设备中,该芯片系统包括:至少一个处理器、至少一个存储器和接口电路,所述接口电路负责所述芯片系统与外界的信息交互,所述至少一个存储器、所述接口电路和所述至少一个处理器通过线路互联,所述至少一个存储器中存储有指令;所述指令被所述至少一个处理器执行,以进行上述各个方面的所述的方法中所述第一网络设备的操作。
在具体实现过程中,该芯片可以以中央处理器(central processing unit,CPU)、微控制器(micro controller unit,MCU)、微处理器(micro processing unit,MPU)、数字信号处理器(digital signal processing,DSP)、片上系统(system on chip,SoC)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或可编辑逻辑器件(programmable logic device,PLD)的形式实现。
本申请实施例还提供了一种芯片系统,应用于第二网络设备中,该芯片系统包括:至少一个处理器、至少一个存储器和接口电路,所述接口电路负责所述芯片系统与外界的信息交互,所述至少一个存储器、所述接口电路和所述至少一个处理器通过线路互联,所述至少一个存储器中存储有指令;所述指令被所述至少一个处理器执行,以进行上述各个方面的所述的方法中所述第二网络设备的操作。
在具体实现过程中,该芯片可以以中央处理器(central processing unit,CPU)、微控制器(micro controller unit,MCU)、微处理器(micro processing unit,MPU)、数字信号处理器(digital signal processing,DSP)、片上系统(system on chip,SoC)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或可编辑逻辑器件(programmable logic device,PLD)的形式实现。
本申请实施例还提供了一种计算机程序产品,应用于第一网络设备中,所述计算机程序产品包括一系列指令,当所述指令被运行时,以进行上述各个方面的所述的方法中所述第一网络设备的操作。
本申请实施例还提供了一种计算机程序产品,应用于第二网络设备中,所述计算机程序产品包括一系列指令,当所述指令被运行时,以进行上述各个方面的所述的方法中所述第二网络设备的操作。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (31)

  1. 一种攻击样本管理的方法,其特征在于,包括:
    第一网络设备向第二网络设备发送第一攻击样本;
    所述第一网络设备接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    所述第一网络设备根据所述第一反馈消息将所述第一攻击样本发送到管理设备,所述管理设备用于根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文。
  2. 根据权利要求1所述的方法,其特征在于,在所述第一网络设备向第二网络设备发送第一攻击样本之前,所述方法还包括:
    所述第一网络设备向所述第二网络设备发送第二攻击样本;
    所述第一网络设备接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
  3. 根据权利要求2所述的方法,其特征在于,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RLEvn,
    所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本,包括:
    所述RLEvn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;
    所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
  4. 根据权利要求2或3所述的方法,其特征在于,所述方法还包括:
    所述第一网络设备通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
  5. 一种攻击样本管理的方法,其特征在于,包括:
    第二网络设备接收第一网络设备发送的第一攻击样本;
    所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
  6. 根据权利要求5所述的方法,其特征在于,所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,包括:
    所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;
    所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
  7. 根据权利要求5或6所述的方法,其特征在于,在所述第二网络设备接收第一网络设备发送的第一攻击样本之前,所述方法还包括:
    所述第二网络设备接收所述第一网络设备发送的第二攻击样本;
    所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
  8. 一种攻击样本管理的方法,其特征在于,包括:
    第一网络设备向第二网络设备发送第一攻击样本;
    所述第一网络设备接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    所述第一网络设备根据所述第一反馈消息将所述第一攻击样本发送到管理设备;
    所述管理设备根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文;
    所述管理设备将所述攻击检测模型部署到所述第二网络设备中。
  9. 根据权利要求8所述的方法,其特征在于,在所述第一网络设备向第二网络设备发送第一攻击样本之前,所述方法还包括:
    所述第一网络设备向所述第二网络设备发送第二攻击样本;
    所述第一网络设备接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
  10. 根据权利要求9所述的方法,其特征在于,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RLEvn,
    所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本,包括:
    所述RLEvn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;
    所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
  11. 根据权利要求9或10所述的方法,其特征在于,所述方法还包括:
    所述第一网络设备通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
  12. 根据权利要求8至11中任一项所述的方法,其特征在于,所述方法还包括:
    所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本;
    所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果;
    所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
  13. 根据权利要求12所述的方法,其特征在于,所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,包括:
    所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;
    所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
  14. 根据权利要求12或13所述的方法,其特征在于,在所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本之前,所述方法还包括:
    所述第二网络设备接收所述第一网络设备发送的第二攻击样本;
    所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
  15. 一种第一网络设备,其特征在于,包括:
    发送模块,用于向第二网络设备发送第一攻击样本;
    接收模块,用于接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    所述发送模块,还用于根据所述第一反馈消息将所述第一攻击样本发送到管理设备,所述管理设备用于根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文。
  16. 根据权利要求15所述的第一网络设备,其特征在于,
    所述发送模块,还用于向所述第二网络设备发送第二攻击样本;
    所述接收模块,还用于接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述第一网络设备还包括:
    处理模块,用于根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
  17. 根据权利要求16所述的第一网络设备,其特征在于,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RLEvn,所述处理模块具体用于:
    所述RLEvn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;
    所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
  18. 根据权利要求16或17所述的第一网络设备,其特征在于,所述处理模块还用于:
    通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
  19. 一种第二网络设备,其特征在于,包括:
    接收模块,用于接收第一网络设备发送的第一攻击样本;
    检测模块,用于对所述第一攻击样本进行检测,得到第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    发送模块,用于向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
  20. 根据权利要求19所述的第二网络设备,其特征在于,所述检测模块具体用于:
    对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;
    对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
  21. 根据权利要求19或20所述的第二网络设备,其特征在于,
    所述接收模块,还用于接收所述第一网络设备发送的第二攻击样本;
    所述检测模块,还用于对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    发送模块,还用于向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
  22. 一种攻击样本管理的系统,其特征在于,包括:
    发送模块,用于第一网络设备向第二网络设备发送第一攻击样本;
    接收模块,用于所述第一网络设备接收所述第二网络设备发送的第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果,所述第一攻击样本的检测结果指示所述第一攻击样本为正常样本;
    所述发送模块,还用于所述第一网络设备根据所述第一反馈消息将所述第一攻击样本发送到管理设备;
    处理模块,用于所述管理设备根据所述第一攻击样本训练得到攻击检测模型,所述攻击检测模型被所述第二网络设备用于识别攻击报文;
    所述处理模块,还用于所述管理设备将所述攻击检测模型部署到所述第二网络设备中。
  23. 根据权利要求22所述的系统,其特征在于,
    所述发送模块,还用于所述第一网络设备向所述第二网络设备发送第二攻击样本;
    所述接收模块,还用于所述第一网络设备接收所述第二网络设备发送的第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述处理模块,还用于所述第一网络设备根据所述第二反馈消息对所述第二攻击样本进行调整,得到所述第一攻击样本。
  24. 根据权利要求23所述的系统,其特征在于,所述第一网络设备中包括强化学习智能体RL Agent和强化学习环境RLEvn,所述处理模块具体用于:
    所述RLEvn根据所述第二攻击样本的检测结果以及奖惩函数向所述RL Agent发送惩戒信号;
    所述RL Agent根据所述惩戒信号对所述第二攻击样本进行调整,得到所述第一攻击样本。
  25. 根据权利要求23或24所述的系统,其特征在于,所述处理模块,还用于:
    所述第一网络设备通过以下中的任一种或多种攻击方式的组合得到所述第二攻击样 本:Unicode编码、Base64编码、插入注释、垃圾数据填充、Offset替换。
  26. 根据权利要求22至25中任一项所述的系统,其特征在于,
    所述接收模块,还用于所述第二网络设备接收所述第一网络设备发送的所述第一攻击样本;
    所述处理模块,还用于所述第二网络设备对所述第一攻击样本进行检测,得到第一攻击样本的检测结果;
    所述发送模块,还用于所述第二网络设备向所述第一网络设备发送第一反馈消息,所述第一反馈消息包括所述第二网络设备对所述第一攻击样本的检测结果。
  27. 根据权利要求26所述的系统,其特征在于,所述处理模块具体用于:
    所述第二网络设备对所述第一攻击样本进行特征提取,获得所述第一攻击样本的特征向量;
    所述第二网络设备对所述第一攻击样本的特征向量进行检测,获得所述第一攻击样本的检测结果。
  28. 根据权利要求26或27所述的系统,其特征在于,
    所述接收模块,还用于所述第二网络设备接收所述第一网络设备发送的第二攻击样本;
    所述处理模块,还用于所述第二网络设备对所述第二攻击样本进行检测,得到第二攻击样本的检测结果,所述第二攻击样本的检测结果指示所述第二攻击样本为攻击样本;
    所述发送模块,还用于所述第二网络设备向所述第一网络设备发送第二反馈消息,所述第二反馈消息包括所述第二网络设备对所述第二攻击样本的检测结果。
  29. 一种第一网络设备,其特征在于,包括:处理器和存储器,所述存储器用于存储程序或代码,所述处理器用于从存储器中调用并运行所述程序以执行权利要求1至4中任一项所述的方法。
  30. 一种第二网络设备,其特征在于,包括:处理器和存储器,所述存储器用于存储程序或代码,所述处理器用于从存储器中调用并运行所述程序以执行权利要求5至7中任一项所述的方法。
  31. 一种攻击样本管理的系统,其特征在于,包括:管理设备,如权利要求15至18中任一项所述的第一网络设备以及如权利要求19至21中任一项所述的第二网络设备。
PCT/CN2022/085278 2021-04-16 2022-04-06 攻击样本管理的方法以及设备 WO2022218188A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110411816.3 2021-04-16
CN202110411816.3A CN115225295A (zh) 2021-04-16 2021-04-16 攻击样本管理的方法以及设备

Publications (1)

Publication Number Publication Date
WO2022218188A1 true WO2022218188A1 (zh) 2022-10-20

Family

ID=83605571

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085278 WO2022218188A1 (zh) 2021-04-16 2022-04-06 攻击样本管理的方法以及设备

Country Status (2)

Country Link
CN (1) CN115225295A (zh)
WO (1) WO2022218188A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105282091A (zh) * 2014-06-05 2016-01-27 腾讯科技(深圳)有限公司 安全应用的服务器检测方法及其系统
CN109902709A (zh) * 2019-01-07 2019-06-18 浙江大学 一种基于对抗学习的工业控制系统恶意样本生成方法
CN111783085A (zh) * 2020-06-29 2020-10-16 浙大城市学院 一种对抗样本攻击的防御方法、装置及电子设备
CN112311733A (zh) * 2019-07-30 2021-02-02 四川大学 一种基于强化学习优化xss检测模型防御对抗攻击的方法
WO2021051561A1 (zh) * 2019-09-18 2021-03-25 平安科技(深圳)有限公司 图像分类网络的对抗防御方法、装置、电子设备及计算机可读存储介质

Also Published As

Publication number Publication date
CN115225295A (zh) 2022-10-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22787415

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22787415

Country of ref document: EP

Kind code of ref document: A1