CN109918901B - Method for real-time detection of attack based on Cache - Google Patents

Publication number
CN109918901B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910127173.2A
Other languages
Chinese (zh)
Other versions
CN109918901A (en)
Inventor
翁楚良
郑蓓蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Application filed by East China Normal University

Abstract

The invention provides a method for detecting Cache-based attacks in real time. The method comprises two stages. In the offline analysis stage, Cache-based attack programs and benign sample programs are run under monitoring, the hardware event data generated while they run is collected, and features are extracted and selected from the collected data to train machine learning models, producing classifiers that can identify multiple kinds of Cache-based attack. In the online detection stage, the active processes in the system are monitored, the hardware event data generated while they run is collected and divided in real time into fixed-size windows, the data in each window is processed, and the classifiers generated in the offline analysis stage make predictions on it, so that whether an active process is mounting a Cache-based attack is determined in real time. The scheme is not limited to Cache-based attacks that target a specific encryption algorithm, and it can detect an attack in real time, at low cost, before the attack completes.

Description

Method for real-time detection of attack based on Cache
Technical Field
The invention relates to the field of computers, and in particular to a method for detecting Cache-based attacks in real time.
Background
Cloud computing gives its tenants convenient access to computing resources, but because multiple tenants share hardware, it also carries significant security risks. For example, an attacker can use the cache on a shared processor to obtain a victim's sensitive data; this is a Cache-based attack. To launch one, the attacker typically first performs operations that put the cache into a desired state and then waits for the victim to execute. While executing, the victim inevitably uses the shared cache, so the cache state may change. After the victim has executed, the attacker probes the cache state and infers the victim's sensitive data by analyzing how the state changed across the victim's execution, causing information leakage. Cache-based attacks can occur between software components that share a cache and do not trust one another, on desktops, cloud servers and virtual machines, mobile devices, and browsers. Their targets include cryptographic systems (for example, keys), system randomization (for example, address space layout), and user privacy (for example, keystroke monitoring), which poses a serious threat to system and information security.
To resist Cache-based attacks, some defense schemes partition the shared cache to restrict the attacker's use of it, limiting the attacker's ability to manipulate the cache. Others inject noise to interfere with the attacker's analysis of the cache state, preventing accurate inference of the victim's information. Some detection measures analyze the characteristic patterns of the hardware events generated when a Cache-based attack is mounted against a particular encryption algorithm, and use those patterns to identify Cache-based attacks on that algorithm.
Although existing defense schemes and detection measures have had some success, they can defend against only certain specific Cache-based attacks, or detect only Cache attacks aimed at specific encryption algorithms, and they fail when a new Cache-based attack appears. For example, the recently disclosed Meltdown and Spectre attacks exploit the processor's speculative execution to change the cache state, perform illegal out-of-bounds accesses and leak information, and they bypass all existing defense and detection schemes.
Disclosure of Invention
One object of the present invention is to provide a method for real-time detection of Cache-based attacks.
According to one aspect of the invention, a method for real-time detection of Cache-based attacks is provided, which comprises two stages: an offline analysis stage and an online detection stage, wherein
the offline analysis stage comprises the following steps:
step 1, collecting the hardware event data generated when Cache-based attack programs and benign programs are executed;
step 2, processing the collected hardware event data, and extracting and selecting features with which to train machine learning algorithms, to obtain classifiers for identifying Cache-based attacks;
the online detection stage comprises the following steps:
step 3, collecting hardware event data of the active processes in the system in real time, and dividing the hardware event data into windows;
step 4, processing the hardware event data in each window, and using the classifiers to detect whether a Cache-based attack is present in each window.
Further, in the above method, in step 1,
the Cache-based attack programs include Flush+Reload, Flush+Flush, Prime+Probe, Meltdown, Spectre and XLATE-series attacks;
the benign programs include CPU-intensive and I/O-intensive programs; the hardware events comprise events related to cache hits, TLB hits and branch mispredictions, and are collected using hardware performance counters;
the hardware event data is a time-ordered sequence of hardware event data.
Further, in the above method, in step 2, the collected hardware event data is processed into training features as follows:
step 2-1, processing the collected hardware event data using an attention method;
step 2-2, extracting the mean, standard deviation, maximum, quantile and range features from the processed data;
step 2-3, selecting among the extracted features using a genetic algorithm guided by F-Score.
Further, in the above method, in the attention method, the time series of an original hardware event, $E = \{x_0, x_1, \ldots, x_t, \ldots, x_n\}$, is converted into $E' = \{y_0, y_1, \ldots, y_t, \ldots, y_n\}$, wherein
[Formula image BDA0001974004790000031: definition of $y_t$]
$h_t = \alpha \times h_{t-1} + (1-\alpha) \times x_t$, $h_0 = x_0$, where t is the sampling time point, $t > 0$, $0 < \alpha < 1$.
Further, in the above method, in step 2, the machine learning algorithms include a decision tree algorithm, a multi-layer perceptron algorithm and the XGBoost algorithm;
the classifier is a classification model, generated by training the machine learning algorithm, that can identify multiple types of Cache-based attack.
Further, in the above method, in step 3, collecting hardware event data of the active processes in the system in real time comprises:
step 3-1, periodically scanning all active processes in the system and obtaining the list of their process IDs;
step 3-2, comparing the process ID lists obtained in two consecutive scans to obtain the list of process IDs of newly created active processes;
step 3-3, filtering out of that list the process IDs in the white list and the IDs of their child processes, and adding the process IDs of the remaining active processes to a process ID queue to be detected;
step 3-4, monitoring all active processes in all process ID queues to be detected and collecting their hardware event data.
Further, in the above method, each window contains a fixed number of hardware event data records, and windows do not overlap;
the white list is a list of the process IDs of processes determined to be benign.
Further, in the above method, in step 3-4, the process ID queues to be detected include a filter queue, a long process queue and a dangerous process queue;
during monitoring, one thread is allocated to each process in the filter queue and the dangerous process queue to collect its hardware event data, and a single thread is allocated to all processes in the long process queue to collect, in turn, a fixed number of windows of hardware event data from each;
in step 3-3, the process ID queue to be detected is the filter queue.
Further, in the above method, the process ID of an active process moves among the filter queue, the long process queue and the dangerous process queue as follows:
the process ID initially enters the filter queue; if the active process completes within a specified time and no anomaly is detected, its process ID is removed from the filter queue when the process completes;
if the active process does not complete within the specified time and no anomaly is detected, its process ID is moved from the filter queue to the long process queue;
if an anomaly is detected in the active process within the specified time, its process ID is moved from the filter queue to the dangerous process queue;
if an anomaly is detected in an active process in the long process queue, its process ID is moved from the long process queue to the dangerous process queue.
Further, in the above method, in step 4,
the hardware event data in a window is processed in the same way as in step 2-1 and step 2-2;
when classifiers are used to detect whether a Cache-based attack is present in a window, different detection schemes are applied to windows from different processes: windows from processes in the filter queue and the long process queue are checked with an XGBoost classifier; windows from processes in the dangerous process queue are checked jointly with a decision tree classifier, a multi-layer perceptron classifier and an XGBoost classifier.
Compared with the prior art, the invention provides a method for detecting Cache-based attacks in real time, which comprises two stages. In the offline analysis stage, Cache-based attack programs and benign sample programs are run under monitoring, the hardware event data generated while they run is collected, and features are extracted and selected from the collected data to train machine learning models, producing classifiers that can identify multiple kinds of Cache-based attack. In the online detection stage, the active processes in the system are monitored, the hardware event data generated while they run is collected and divided in real time into fixed-size windows, the data in each window is processed, and the classifiers generated in the offline analysis stage make predictions on it, so that whether an active process is mounting a Cache-based attack is determined in real time. The scheme is not limited to Cache-based attacks that target a specific encryption algorithm, and it can detect an attack in real time, at low cost, before the attack completes.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an overview of the scheme for real-time detection of Cache-based attacks in an embodiment of the present invention;
FIG. 2 is an architecture diagram for real-time detection of Cache-based attacks in an embodiment of the present invention;
FIG. 3 is a flow chart of the genetic algorithm for feature selection in an embodiment of the present invention;
FIG. 4 is a flow chart of how processes move between the monitoring queues during real-time detection of Cache-based attacks in an embodiment of the present invention.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including permanent and non-permanent, removable and non-removable media, may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The invention provides a method for detecting Cache-based attacks in real time, which comprises two stages: an offline analysis stage and an online detection stage, wherein
the offline analysis stage comprises the following steps:
step 1, collecting the hardware event data generated when Cache-based attack programs and benign programs are executed;
step 2, processing the collected hardware event data, and extracting and selecting features with which to train machine learning algorithms, to obtain classifiers for identifying Cache-based attacks;
the online detection stage comprises the following steps:
step 3, collecting hardware event data of the active processes in the system in real time, and dividing the hardware event data into windows;
step 4, processing the hardware event data in each window, and using the classifiers to detect whether a Cache-based attack is present in each window.
In an embodiment of the method for real-time detection of Cache-based attacks, in step 1,
the Cache-based attack programs include Flush+Reload, Flush+Flush, Prime+Probe, Meltdown, Spectre and XLATE-series attacks;
the benign programs include CPU-intensive and I/O-intensive programs; the hardware events comprise events related to cache hits, TLB hits and branch mispredictions, and are collected using hardware performance counters;
the hardware event data is a time-ordered sequence of hardware event data.
In an embodiment of the method for real-time detection of Cache-based attacks, in step 2, the collected hardware event data is processed into training features as follows:
step 2-1, processing the collected hardware event data using an attention method;
step 2-2, extracting the mean, standard deviation, maximum, quantile and range features from the processed data;
step 2-3, selecting among the extracted features using a genetic algorithm guided by F-Score.
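Step 2-2 can be illustrated with a minimal sketch that computes the listed statistics for one event's sample sequence. The function name `extract_features`, the dictionary keys, the use of the population standard deviation, and the choice of quartiles as the quantiles are all assumptions made for illustration; the patent does not specify them.

```python
import statistics


def extract_features(values):
    """Summary statistics for one hardware event's sample sequence.

    Assumes at least two samples. Which quantiles are used is not stated
    in the source; quartiles are an assumption of this sketch.
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),      # population standard deviation
        "max": ordered[-1],
        "median": median,
        "q1": q1,
        "q3": q3,
        "range": ordered[-1] - ordered[0],     # max minus min
    }


features = extract_features([1, 2, 3, 4, 5])
```

In practice one such feature set would be computed per event and per window, and the genetic algorithm of step 2-3 would then choose a subset of the resulting candidates.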
In an embodiment of the method for real-time detection of Cache-based attacks, in step 2, the machine learning algorithms include a decision tree algorithm, a multi-layer perceptron algorithm and the XGBoost algorithm;
the classifier is a classification model, generated by training the machine learning algorithm, that can identify multiple types of Cache-based attack.
In an embodiment of the method for real-time detection of Cache-based attacks, in the attention method, the time series of an original hardware event, $E = \{x_0, x_1, \ldots, x_t, \ldots, x_n\}$, is converted into $E' = \{y_0, y_1, \ldots, y_t, \ldots, y_n\}$, wherein
[Formula image BDA0001974004790000071: definition of $y_t$]
$h_t = \alpha \times h_{t-1} + (1-\alpha) \times x_t$, $h_0 = x_0$, where t is the sampling time point, $t > 0$, $0 < \alpha < 1$.
In an embodiment of the method for real-time detection of Cache-based attacks, in step 3, collecting hardware event data of the active processes in the system in real time comprises:
step 3-1, periodically scanning all active processes in the system and obtaining the list of their process IDs;
step 3-2, comparing the process ID lists obtained in two consecutive scans to obtain the list of process IDs of newly created active processes;
step 3-3, filtering out of that list the process IDs in the white list and the IDs of their child processes, and adding the process IDs of the remaining active processes to a process ID queue to be detected;
step 3-4, monitoring all active processes in all process ID queues to be detected and collecting their hardware event data.
In an embodiment of the method for real-time detection of Cache-based attacks, each window contains a fixed number of hardware event data records, and windows do not overlap;
the white list is a list of the process IDs of processes determined to be benign.
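The window rule (a fixed number of records per window, no overlap) can be sketched as a generator over a stream of records. The name `split_windows` is hypothetical, and dropping an incomplete trailing window is an assumption of this sketch, since the source does not say how a partial window is handled.

```python
def split_windows(records, window_size):
    """Yield consecutive, non-overlapping windows of exactly window_size records."""
    window = []
    for record in records:
        window.append(record)
        if len(window) == window_size:
            yield window
            window = []
    # Assumption: an incomplete trailing window is discarded.


windows = list(split_windows(range(7), 3))
```

Because the generator consumes records one at a time, it works equally on a finished trace and on a live stream from the monitor.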
In an embodiment of the method for real-time detection of Cache-based attacks, in step 3-4, the process ID queues to be detected include a filter queue, a long process queue and a dangerous process queue;
during monitoring, one thread is allocated to each process in the filter queue and the dangerous process queue to collect its hardware event data, and a single thread is allocated to all processes in the long process queue to collect, in turn, a fixed number of windows of hardware event data from each;
in step 3-3, the process ID queue to be detected is the filter queue.
In an embodiment of the method for real-time detection of Cache-based attacks, the process ID of an active process moves among the filter queue, the long process queue and the dangerous process queue as follows:
the process ID initially enters the filter queue; if the active process completes within a specified time and no anomaly is detected, its process ID is removed from the filter queue when the process completes;
if the active process does not complete within the specified time and no anomaly is detected, its process ID is moved from the filter queue to the long process queue;
if an anomaly is detected in the active process within the specified time, its process ID is moved from the filter queue to the dangerous process queue;
if an anomaly is detected in an active process in the long process queue, its process ID is moved from the long process queue to the dangerous process queue.
In an embodiment of the method for real-time detection of Cache-based attacks, in step 4,
the hardware event data in a window is processed in the same way as in step 2-1 and step 2-2;
when classifiers are used to detect whether a Cache-based attack is present in a window, different detection schemes are applied to windows from different processes: windows from processes in the filter queue and the long process queue are checked with an XGBoost classifier; windows from processes in the dangerous process queue are checked jointly with a decision tree classifier, a multi-layer perceptron classifier and an XGBoost classifier.
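The per-queue detection scheme can be sketched as a small dispatch function. The source does not say how the three classifiers are combined in the "joint" case, so the majority vote used here is purely an assumption; the function name `detect_window`, the queue labels and the classifier keys are likewise hypothetical, and each classifier is modelled as a plain callable returning `True` for an attack.

```python
def detect_window(features, queue_name, classifiers):
    """Return True if the window is judged to contain a Cache-based attack.

    classifiers maps "xgboost", "decision_tree" and "mlp" to callables
    that take the window's features and return a bool.
    """
    if queue_name in ("filter", "long"):
        # Cheap path: a single XGBoost prediction.
        return classifiers["xgboost"](features)
    if queue_name == "dangerous":
        votes = [classifiers[name](features)
                 for name in ("decision_tree", "mlp", "xgboost")]
        # Assumption: "jointly" is read as a majority vote (2 of 3).
        return sum(votes) >= 2
    raise ValueError(f"unknown queue: {queue_name}")


stub = {
    "xgboost": lambda f: True,
    "decision_tree": lambda f: False,
    "mlp": lambda f: False,
}
suspicious = detect_window({}, "filter", stub)
```

Using one model for most windows and all three only for already-suspicious processes keeps the common path cheap while adding scrutiny where a false alarm would otherwise be raised.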
To resist existing Cache-based attacks and respond quickly to new ones, the present application proposes a method for detecting Cache-based attacks in real time. The scheme and architecture are not limited to Cache-based attacks that target a specific encryption algorithm, and can detect existing Cache-based attacks in real time at low cost. In addition, when a new Cache-based attack appears, an effective classifier can be trained with the proposed scheme, and the proposed architecture can then be used to handle the new attack.
The present application is described in further detail below with reference to the attached figures.
The hardware environment is a PC host with an Intel Core i5-4460 processor (3.2 GHz) providing 11 hardware performance counters, and 8 GB of memory. The software environment is CentOS 7.5.1804 (64-bit) with kernel version 3.10.0.
Some embodiments of the present application provide a general scheme and architecture for real-time detection of Cache-based attacks. As shown in FIG. 1, the scheme includes an offline analysis stage and an online detection stage. In the offline analysis stage, Cache-based attack programs and benign sample programs are run under monitoring, the hardware event data generated while they run is collected, and features are extracted and selected from the collected data to train machine learning models, producing several classifiers that can identify Cache-based attacks. In the online detection stage, the active processes in the system are monitored, the hardware event data generated while they run is collected and divided in real time into fixed-size windows, the data in each window is processed, and the classifiers generated in the offline analysis stage make predictions on it, so that whether an active process is mounting a Cache-based attack is determined in real time.
Specifically, some embodiments of the present application comprise three parts: a monitoring module, a learning module and a detection module. As shown in FIG. 2, the offline analysis stage consists of the offline monitor of the monitoring module together with the learning module, and its flow is drawn with solid lines; the online detection stage consists of the online monitor of the monitoring module together with the detection module, and its flow is drawn with dashed lines.
The monitoring module consists of an offline monitor, which collects the hardware event data generated while the sample programs execute, and an online monitor, which collects in real time the hardware event data generated by the active processes in the system. The working steps of the offline monitor shown in FIG. 2 are as follows:
step (1), running a sample program;
step (2), obtaining the process ID of the running sample program from the operating system;
step (3), monitoring the running sample program by its process ID and collecting the hardware event data generated while it runs;
step (4), writing the collected hardware event data to a file.
In step (1), the sample programs comprise Cache-based attack programs and benign programs. The Cache-based attack programs come from GitHub and include Flush+Reload, Flush+Flush, Prime+Probe, Meltdown, Spectre and XLATE-series attacks; the benign programs come from standard benchmarks and some basic Linux commands, and include compute-intensive and I/O-intensive programs.
In step (2), the process ID of the sample program is obtained from the operating system with the ps command, using the name of the running sample program.
In step (3), hardware event data is obtained by accessing the hardware performance counters through the interface provided by PAPI (Performance Application Programming Interface). The hardware event data is sampled every 100 microseconds, and the collected data is a time-ordered sequence. When the sample program finishes running, the task of monitoring and collecting hardware event data terminates. The hardware events are listed in the following table:
Event name      Event description
PAPI_TOT_CYC    Total cycles executed
PAPI_TOT_INS    Total instructions executed
PAPI_L3_LDM     Level 3 cache load misses
PAPI_L3_TCA     Level 3 cache total accesses
PAPI_PRF_DM     Data prefetch misses
PAPI_TLB_DM     Data TLB misses
PAPI_BR_CN      Conditional branch instructions
PAPI_BR_MSP     Mispredicted conditional branch instructions
Some embodiments provided herein repeat steps (1) to (4) five times for each sample program, each time running with a different procedure. This reduces the effect of run-to-run uncertainty on the hardware event data and increases the number of samples available for training.
The following table shows part of the hardware event data collected while the Spectre attack program runs. The PAPI_ prefix is omitted; each row is the record of one sample, and each column is the time series of the corresponding event:
BR_MSP   BR_CN    PRF_DM   TLB_DM   TOT_CYC  TOT_INS  L3_LDM   L3_TCA
0        0        0        0        0        0        0        0
3        3758     32       7        45424    36637    8        61
0        64180    1        73       509496   453733   32       13
34       58474    19       91       471392   402074   79       342
415      20187    22       5        516647   151523   1109     1302
230      20811    4        0        469192   153857   1042     1012
182      20258    0        0        543469   165991   1294     1270
213      24678    1        0        529390   176620   1292     1273
175      21912    0        0        528615   169715   1273     1254
210      24738    556      0        530863   182552   1255     1243
171      21960    1        0        519032   166022   1271     1268
178      23213    0        0        512900   177904   1181     1172
The learning module aims to train a model with strong predictive power from the collected hardware event data, that is, to obtain a classifier that can detect Cache-based attacks. The working steps of the learning module shown in FIG. 2 are as follows:
step (5), preprocessing the collected hardware event data using the attention method;
step (6), extracting features from the preprocessed data and selecting among them using a genetic algorithm;
step (7), feeding the selected features into the machine learning algorithms for training;
step (8), generating and storing the classifiers.
In step (5), any invalid data in the hardware event data collected by the offline monitor is first filtered out, for example the all-zero first row in the table above. The hardware event data is then transformed using the attention method, which works as follows:
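Assume that the time series of event E is $E = \{x_0, x_1, \ldots, x_t, \ldots, x_n\}$, and let $h_0 = x_0$, $h_t = \alpha \times h_{t-1} + (1-\alpha) \times x_t$, where t is the sampling point, $t > 0$, $0 < \alpha < 1$. The converted sequence is $E' = \{y_0, y_1, \ldots, y_t, \ldots, y_n\}$, where
[Formula image BDA0001974004790000121: definition of $y_t$]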
Data processed by the attention method is independent of the numerical range of the original data while preserving its fluctuations, so the method suits hardware event data collected in different machine environments and also suits real-time scenarios.
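The recurrence $h_t = \alpha \times h_{t-1} + (1-\alpha) \times x_t$ with $h_0 = x_0$ is an exponentially weighted moving average of the event counts. The sketch below computes only that running history; the mapping from $x_t$ and $h_t$ to $y_t$ is given by the formula in the figure and is deliberately not reproduced here. The function name `smoothed_history` and the sample values are illustrative assumptions.

```python
def smoothed_history(xs, alpha):
    """Return [h_0, ..., h_n] with h_0 = x_0 and h_t = alpha*h_{t-1} + (1-alpha)*x_t."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not xs:
        return []
    history = [xs[0]]
    for x in xs[1:]:
        history.append(alpha * history[-1] + (1 - alpha) * x)
    return history


h = smoothed_history([4, 8, 0], alpha=0.5)
```

Each new value needs only the previous $h_{t-1}$, so the history can be updated sample by sample, which fits the streaming use described for the online stage.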
In step (6), features such as the maximum, mean, median, standard deviation, quantiles and range are extracted from the preprocessed data sequence as candidate features, and a subset of the candidates is then selected for training. Selecting a subset reduces the cost of learning and prediction and reduces the interference of irrelevant features with model training. The feature subset is selected with a genetic algorithm, as shown in FIG. 3; the process is as follows:
In the initialization phase, genes are encoded: one gene corresponds to one feature subset, and N genes are selected as the initial population. Assuming that the candidate feature set has length L, each gene corresponds to a binary string of length L in which, at any locus, 0 indicates that the corresponding feature is not selected and 1 indicates that it is selected. To initialize the population, N numbers are chosen at random from the range 1 to $2^L$; the binary string corresponding to each number is a gene in the genetic algorithm, and the genes corresponding to the N random numbers form the initial population.
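The encoding can be sketched with integers standing in for the binary strings. The names `decode_gene` and `init_population` are hypothetical, as is the convention that the leftmost bit corresponds to the first candidate feature. The source gives the range as 1 to $2^L$; since an L-bit string can hold at most $2^L - 1$, this sketch draws from 1 to $2^L - 1$, which is an assumption.

```python
import random


def decode_gene(gene, candidate_features):
    """Return the features whose locus is 1 (leftmost bit = first feature)."""
    length = len(candidate_features)
    bits = format(gene, f"0{length}b")
    return [name for name, bit in zip(candidate_features, bits) if bit == "1"]


def init_population(n_genes, n_features, rng=random):
    """Draw n_genes random non-empty feature subsets encoded as integers."""
    # Assumption: upper bound 2^L - 1 so that every gene fits in L bits.
    return [rng.randint(1, 2 ** n_features - 1) for _ in range(n_genes)]


subset = decode_gene(0b101, ["mean", "std", "max"])
population = init_population(5, 3, random.Random(0))
```

Starting the range at 1 excludes the all-zero gene, so every individual selects at least one feature.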
In the evaluation phase, superior genes are selected as parents so that the population evolves. Whether a gene is superior is determined by the fitness function, which returns the average F-Score achieved by the gene's feature subset across several machine learning models (multi-layer perceptron MLP, decision tree DEC and XGBoost); the higher the fitness, the better the gene. To give superior genes a higher probability of reproducing, after the fitness of every gene in the initial population has been calculated, n superior genes are selected as parents by roulette selection: if the fitness of the i-th gene in the population is $f_i$, the probability that it is selected for reproduction is
$$p_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$
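Roulette selection is conventionally fitness-proportional, with gene i chosen with probability $f_i / \sum_j f_j$. A minimal sketch under that assumption follows; the function names are hypothetical, and sampling with replacement (so the same gene can be picked more than once) is also an assumption, since the source does not say.

```python
import random


def selection_probabilities(fitness):
    """Fitness-proportional probabilities: p_i = f_i / sum(f)."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_select(population, fitness, n, rng=random):
    """Pick n parents, each draw weighted by fitness (with replacement)."""
    return rng.choices(population, weights=fitness, k=n)


probs = selection_probabilities([1, 1, 2])
parents = roulette_select(["g0", "g1", "g2"], [1, 1, 2], n=2, rng=random.Random(0))
```

Weaker genes keep a nonzero chance of being selected, which preserves diversity in the population compared with simply taking the top n.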
In the evolution stage, two parents are chosen from the n parents and crossed to form a new gene, which also mutates with a certain probability. During crossover, the gene codes of the two parents are compared: dominant loci (loci that are 1 in both parents) are passed on to the offspring, and non-dominant loci (loci that are 1 in only one parent) are passed on with a certain probability. Besides inheriting from its parents, the new gene has a certain probability of mutation (that is, some loci change from 0 to 1, or from 1 to 0).
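The crossover and mutation rules map naturally onto bit operations when genes are held as integers. In this sketch the function names and the default probabilities (`keep_prob=0.5`, `rate=0.01`) are assumptions; the source only says "a certain probability".

```python
import random


def crossover(parent_a, parent_b, n_bits, keep_prob=0.5, rng=random):
    """Loci set in both parents are inherited; loci set in one are kept with keep_prob."""
    child = parent_a & parent_b          # dominant loci: 1 in both parents
    differing = parent_a ^ parent_b      # non-dominant loci: 1 in exactly one parent
    for i in range(n_bits):
        if (differing >> i) & 1 and rng.random() < keep_prob:
            child |= 1 << i
    return child


def mutate(gene, n_bits, rate=0.01, rng=random):
    """Flip each locus independently with probability rate."""
    for i in range(n_bits):
        if rng.random() < rate:
            gene ^= 1 << i
    return gene


child = mutate(crossover(0b1100, 0b1010, n_bits=4), n_bits=4)
```

A locus that is 0 in both parents can only become 1 through mutation, which is what lets the search reach features that neither parent selected.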
The evolution stage is repeated N times to generate N new genes as offspring. The fitness of the offspring genes is calculated, N superior genes are selected from the offspring and parent generations (evaluation stage) to continue reproducing, and these steps are repeated. After a certain number of iterations, an approximately optimal feature subset is obtained.
In step (7), the selected near-optimal feature subset is fed into three machine learning algorithms: a multi-layer perceptron (MLP), a decision tree (DEC) and XGBoost. During training, the best model is selected by five-fold cross-validation: first, 30% of the samples are held out as a test set and 70% are used as a training set; the training set is then randomly divided into five parts, four used for training and the remaining one for validation; this is repeated five times, and the model with the smallest validation error is saved for real-time prediction.
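The data split behind this procedure can be sketched without any machine learning library. The helper names `train_test_split` and `five_fold` are hypothetical (they are not the scikit-learn functions of similar name), and the model training and error measurement that would sit inside the loop are left out.

```python
import random


def train_test_split(samples, test_fraction=0.3, rng=random):
    """Shuffle, then hold out test_fraction of the samples as the test set."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def five_fold(training_set, k=5):
    """Yield (training, validation) pairs; each fold is the validation set once."""
    folds = [training_set[i::k] for i in range(k)]
    for i in range(k):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, folds[i]


train, test = train_test_split(range(10), rng=random.Random(1))
splits = list(five_fold(train))
```

In a full pipeline one model would be trained per split and the one with the lowest validation error kept, with the held-out 30% used only for the final check.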
The working steps of the online monitor shown in fig. 2 are as follows:
step [1], periodically obtaining a process ID list of a current active process from an operating system;
step [2], monitoring the running active process according to the process ID, and collecting the hardware event data of the active process;
step [3], transmitting the collected hardware event data to the detection module as a real-time stream.
In step [2], not every active process in the system is monitored. Because most active processes at any given time are benign, it is unnecessary to spend resources on monitoring processes already known to be benign. Some embodiments of the present application therefore use incremental monitoring: the process ID lists obtained in two consecutive scans are compared to obtain the list of newly created active processes. In addition, some embodiments maintain a white list of process IDs known to be benign; when active processes are monitored, the process IDs on the white list and those of their child processes are filtered out. This avoids unnecessary monitoring and the resource waste it would cause.
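The incremental monitoring and white-list filtering can be sketched as a set difference followed by an ancestry check. The function name, the `parent_of` mapping and the PIDs are hypothetical; how the process list and parent relationships are read from the operating system is outside this sketch. Treating every descendant of a white-listed process as benign (not only direct children) is also an assumption.

```python
def new_pids_to_monitor(previous, current, whitelist, parent_of):
    """Return PIDs that appeared since the last scan and are not covered by
    the white list, either directly or through an ancestor process."""
    def is_whitelisted(pid):
        seen = set()  # guards against cycles in a malformed parent map
        while pid is not None and pid not in seen:
            if pid in whitelist:
                return True
            seen.add(pid)
            pid = parent_of.get(pid)
        return False

    appeared = set(current) - set(previous)
    return sorted(pid for pid in appeared if not is_whitelisted(pid))

# Hypothetical scans: PID 201 is a child of white-listed PID 200.
previous = [1, 100, 200]
current = [1, 100, 200, 201, 300, 301]
parent_of = {100: 1, 200: 1, 201: 200, 300: 1, 301: 300}
whitelist = {200}
result = new_pids_to_monitor(previous, current, whitelist, parent_of)
```

Here only PIDs 300 and 301 would be handed to the monitor; 201 is new but is skipped because its parent is on the white list.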
The detection module aims to discover all possible Cache attacks; it applies the classifier output by the learning module to the hardware event features collected by the online monitor. The working steps of the detection module shown in fig. 2 are as follows:
step [4], after carrying out invalid value filtering, attention conversion and window division on hardware event data collected by an online monitor, calculating selected characteristics for each window, and taking the characteristics as the input of a classifier;
step [5], the classifier classifies each window according to its features and issues a warning as soon as a Cache-based attack is predicted.
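The window division and per-window feature computation of step [4] can be sketched as below. The feature set (mean, standard deviation, maximum, quantile, range) and the non-overlapping fixed-size windows follow the claims; the helper names, the use of the median as the quantile, the population standard deviation and the sample counts are assumptions. `exponential_smooth` implements only the recurrence for h_t stated in claim 1; the final attention output y_t is given in the source as a formula image and is not reproduced here.

```python
import statistics

def exponential_smooth(series, alpha=0.5):
    """h_0 = x_0; h_t = alpha * h_{t-1} + (1 - alpha) * x_t for t > 0."""
    smoothed = []
    for t, x in enumerate(series):
        smoothed.append(x if t == 0 else alpha * smoothed[-1] + (1 - alpha) * x)
    return smoothed

def split_windows(series, size):
    """Non-overlapping windows of exactly `size` records.

    Assumption: a trailing partial window is dropped.
    """
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]

def window_features(window):
    """Mean, standard deviation, maximum, median (as the quantile) and range."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "max": max(window),
        "median": statistics.median(window),
        "range": max(window) - min(window),
    }

# Made-up per-sample counts of one hardware event (e.g. cache misses).
counts = [4, 8, 6, 2, 10, 10, 10, 10, 3]
windows = split_windows(counts, size=4)
features = [window_features(w) for w in windows]
```

Each dictionary in `features` corresponds to one window and would be flattened into the classifier's input vector.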
To further improve the efficiency of monitoring and detection while reducing the false alarm rate, some embodiments of the present application maintain three monitoring queues: a filter queue, a long process queue and a dangerous process queue. As shown in fig. 4, every newly created active process that is not on the white list first enters the filter queue. Each process in the filter queue is assigned a dedicated thread that collects its hardware event data; the process is monitored continuously for a fixed number of windows, and an XGBoost classifier checks each window for an anomaly. If the process finishes within the fixed number of windows and no anomaly is detected, it is removed from the filter queue; if an anomaly is detected within the fixed number of windows, the process moves from the filter queue to the dangerous process queue; if the process has not finished within the fixed number of windows and no anomaly is detected, it moves from the filter queue to the long process queue.
For the long process queue, a single thread collects a fixed number of windows of hardware event data from each process in turn, and an XGBoost classifier checks each window for an anomaly. If an anomaly is detected, the process moves from the long process queue to the dangerous process queue; if none is detected, the process is removed from the long process queue when it finishes running.
Each process in the dangerous process queue is assigned a dedicated thread that collects its hardware event data, and three classifiers (XGBoost, decision tree and multilayer perceptron) jointly predict for each window whether a Cache-based attack is present; once a Cache-based attack is detected, an alert is issued.
Some embodiments of the present application use three monitoring queues to monitor and detect different processes using different strategies, which reduces the overhead and also reduces the false alarm rate.
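The movement of a process between the three queues can be written as a small state-transition function. This is a sketch under assumptions: the `Queue` enum, the `REMOVED` state and the function names are hypothetical, and combining the three classifiers by majority vote is an assumed rule, since the text only says they predict "jointly". The text also does not describe how a process leaves the dangerous process queue, so the sketch leaves it there.

```python
from enum import Enum

class Queue(Enum):
    FILTER = "filter"
    LONG = "long"
    DANGEROUS = "dangerous"
    REMOVED = "removed"  # hypothetical marker: no longer monitored

def next_queue(queue, anomaly, finished):
    """Where a process goes after its current batch of windows is classified.

    anomaly:  the XGBoost classifier flagged at least one window
    finished: the process ended during the batch
    """
    if queue in (Queue.FILTER, Queue.LONG):
        if anomaly:
            return Queue.DANGEROUS
        if finished:
            return Queue.REMOVED
        return Queue.LONG
    # Dangerous (and already removed) processes keep their state here.
    return queue

def joint_alert(xgboost_vote, tree_vote, mlp_vote):
    """Majority vote over the three classifiers (assumed combination rule)."""
    return (int(xgboost_vote) + int(tree_vote) + int(mlp_vote)) >= 2

# A long-running benign process, then one that trips the first classifier.
path_benign = next_queue(Queue.FILTER, anomaly=False, finished=False)
path_suspect = next_queue(Queue.LONG, anomaly=True, finished=False)
```

Keeping the cheap single-classifier check in the first two queues and reserving the three-classifier vote for the dangerous process queue is what lets the scheme trade overhead against false alarms.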
To sum up, the present application provides a general scheme and architecture for real-time detection of Cache-based attacks, comprising two stages. In the off-line analysis stage, Cache-based attack programs and benign program samples are run under monitoring, the hardware event data generated while they run is collected, features are extracted and selected from the collected data to train machine learning models, and a classifier capable of identifying various Cache attacks is generated. In the on-line detection stage, the active processes in the system are monitored, the hardware event data generated while they run is collected, the collected data is divided in real time into windows of fixed size, the data in each window is processed, and the classifier generated in the off-line analysis stage is used for prediction, so that whether a Cache-based attack is present in an active process is judged in real time. The scheme is not tied to Cache-based attacks on any specific encryption algorithm, and it can detect an attack in real time, at low cost, before the Cache-based attack is completed.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, for example, as an Application Specific Integrated Circuit (ASIC), a general purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Also, the software programs (including associated data structures) of the present invention can be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Further, some of the steps or functions of the present invention may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present invention can be applied as a computer program product, such as computer program instructions, which when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of the computer. Program instructions which invoke the methods of the present invention may be stored on a fixed or removable recording medium and/or transmitted via a data stream on a broadcast or other signal-bearing medium and/or stored within a working memory of a computer device operating in accordance with the program instructions. An embodiment according to the invention herein comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or solution according to embodiments of the invention as described above.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (6)

1. A method for real-time detection of Cache-based attacks comprises two stages: an off-line analysis phase and an on-line detection phase, wherein,
the off-line analysis phase comprises the steps of:
step 1, collecting hardware event data generated when a Cache-based attack program and a benign program are executed;
step 2, processing the collected hardware event data, extracting and selecting features for training a machine learning algorithm to obtain a classifier for identifying the attack based on the Cache;
the on-line detection stage comprises the following steps:
step 3, collecting hardware event data of an active process in the system in real time, and carrying out window division on the hardware event data;
step 4, processing hardware event data in each window, and detecting whether each window has attack based on Cache by using a classifier;
in step 3, the step of collecting the hardware event data of the active process in the system in real time is as follows:
step 3-1, periodically scanning all active processes in the system, and acquiring process ID lists of all active processes;
step 3-2, comparing the process ID lists of the active processes obtained in two adjacent times to obtain a process ID list of a newly generated active process;
step 3-3, filtering out the process IDs in the white list and their sub-process IDs from the process ID list of the newly generated active processes, and adding the process IDs of the remaining active processes into a process ID queue to be detected;
step 3-4, monitoring all active processes in all process ID queues to be detected and acquiring hardware event data;
in step 3-4, the process ID queue to be detected includes: a filtering queue, a long process queue and a dangerous process queue,
during monitoring, distributing a thread for each process in the filtering queue and the dangerous process queue to collect hardware event data, and distributing a thread for all processes in the long-process queue to collect hardware event data with a fixed window number in turn;
in step 3-3, the process ID queue to be detected is a filtering queue;
in step 2, the step of processing the collected hardware event data into features for training is as follows:
step 2-1, processing the collected hardware event data by using an attention method;
step 2-2, extracting the features of mean value, standard deviation, maximum value, quantile and range from the processed data;
step 2-3, selecting among the extracted features by using a genetic algorithm according to F-Score;
in the attention method, the time sequence of the original hardware event, $E = \{x_0, x_1, \ldots, x_t, \ldots, x_n\}$, is converted into $E' = \{y_0, y_1, \ldots, y_t, \ldots, y_n\}$, in which
Figure FDA0003173815210000021
$h_t = \alpha \times h_{t-1} + (1 - \alpha) \times x_t$, $h_0 = x_0$, where $t$ is the sampling time point, $t > 0$, $0 < \alpha < 1$.
2. The method according to claim 1, wherein, in the step 1,
the Cache-based attack programs comprise: Flush+Reload, Flush+Flush, Prime+Probe, Meltdown, Spectre and XLATE-series attacks;
the benign programs include CPU-intensive and IO-intensive programs; the hardware events comprise events related to Cache hits, TLB hits and branch mispredictions, and are collected by using hardware performance counters;
the hardware event data is a hardware event data sequence with a time sequence.
3. The method of claim 1, wherein in step 2, the machine learning algorithm comprises: a decision tree algorithm, a multilayer perceptron algorithm and an XGBoost algorithm;
the classifier is a classification model which is generated by the machine learning algorithm through training and can identify various types of attacks based on the Cache.
4. The method of claim 1, wherein each window contains a fixed number of hardware event data records and the windows do not overlap;
the white list is the process ID determined to be a benign process.
5. The method of claim 1, wherein the process ID of the active process flows between the filter queue, the long process queue and the dangerous process queue as follows:
the process ID initially enters the filter queue, and if the active process completes within a specified time and no exception is detected, its process ID is removed from the filter queue upon completion of the active process;
moving the process ID from the filter queue to the long process queue if the active process is not completed within a specified time and no exception is detected;
moving the process ID from the filter queue to the dangerous process queue if an exception is detected for the active process within a specified time;
if an exception is detected for the active process in the long process queue, the process ID is moved from the long process queue to the dangerous process queue.
6. The method according to claim 1, wherein, in the step 4,
processing hardware event data in the window in the same way as the method described in the step 2-1 and the step 2-2;
when a classifier is used for detecting whether a Cache-based attack exists in the window, different detection schemes are adopted for windows from different processes: windows from processes in the filter queue and the long process queue are detected using an XGBoost classifier; windows from processes in the dangerous process queue are jointly detected using a decision tree classifier, a multilayer perceptron classifier and an XGBoost classifier.
CN201910127173.2A 2019-02-20 2019-02-20 Method for real-time detection of attack based on Cache Active CN109918901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910127173.2A CN109918901B (en) 2019-02-20 2019-02-20 Method for real-time detection of attack based on Cache

Publications (2)

Publication Number Publication Date
CN109918901A CN109918901A (en) 2019-06-21
CN109918901B true CN109918901B (en) 2021-10-15

Family

ID=66961861

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant