CN110865866A - Virtual machine safety detection method based on introspection technology - Google Patents

Info

Publication number
CN110865866A
Authority
CN
China
Prior art keywords
virtual machine
information
state
detection
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910930547.4A
Other languages
Chinese (zh)
Other versions
CN110865866B (en)
Inventor
罗军舟
凌振
党一菲
师晓敏
李永杰
吉祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhong Tong Clothing Consulting And Design Research Institute Co Ltd
Southeast University
China Information Consulting and Designing Institute Co Ltd
Original Assignee
Zhong Tong Clothing Consulting And Design Research Institute Co Ltd
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhong Tong Clothing Consulting And Design Research Institute Co Ltd, Southeast University filed Critical Zhong Tong Clothing Consulting And Design Research Institute Co Ltd
Priority to CN201910930547.4A priority Critical patent/CN110865866B/en
Publication of CN110865866A publication Critical patent/CN110865866A/en
Application granted granted Critical
Publication of CN110865866B publication Critical patent/CN110865866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic
    • H04L63/145Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45575Starting, stopping, suspending or resuming virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45587Isolation or security of virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45591Monitoring or debugging support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45595Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a virtual machine security detection method based on introspection technology, comprising the following steps: acquiring virtual machine state data from the virtual machine manager layer outside the virtual machine, acquiring process state, file and port data of the virtual machine through memory introspection, and acquiring system call data of the virtual machine through registers; restoring and associating the process state, file, port and system call information of the virtual machine by analyzing system data structures and the system symbol table in combination with the externally acquired data; performing feature processing on the virtual machine state and process information using increment and time-slice methods; detecting abnormal states using the local outlier factor algorithm, detecting malicious processes in the virtual machine using a random forest, and, in combination with a network intrusion detection tool, detecting network intrusion at the process level according to the traffic information of the virtual machine; and, for the detection results, applying different levels of response according to user settings, thereby improving the security of the virtual machine.

Description

Virtual machine safety detection method based on introspection technology
Technical Field
The invention belongs to the technical field of virtual machine security detection, and particularly relates to a virtual machine security detection method based on introspection technology.
Background
In recent years, cloud computing and virtualization technologies have been widely adopted, and hosting multiple virtual machines on one physical host to serve different users has become a trend. The accompanying security problems are gradually being exposed: compared with a traditional physical host, a security threat to a virtual machine has a wider range of impact, so virtual machine security is receiving increasing attention.
Virtual machine security detection monitors both the performance security and the functional security of virtual machines. Specifically, it checks whether the resource utilization of each virtual machine is within a reasonable range, and whether a virtual machine has been compromised by malicious software in a way that threatens that virtual machine or even other co-hosted virtual machines. Existing virtual machine detection mostly relies on traditional host-based and network-based security detection methods. In the host-based approach, the detection tool is not isolated from the monitored virtual machine, so it is easy to tamper with and has low credibility; in the network-based approach, the detection tool lacks detailed information from inside the virtual machine, so its findings are poorly correlated and cover a single dimension. Because a virtual machine runs on a physical host, virtual machine introspection was proposed to obtain information about the guest operating system from outside the monitored virtual machine. Collecting virtual machine information through introspection strengthens the isolation between the virtual machine and the host layer and improves the reliability of detection; fully associating the various kinds of fine-grained information and applying machine learning methods makes detection more comprehensive and more efficient.
Disclosure of Invention
Technical problem: the invention aims to provide a virtual machine security detection method based on introspection technology, which uses virtual machine introspection to collect, recover and associate various types of virtual machine information from outside the virtual machine, establishes an external information view of the virtual machine, and achieves comprehensive security detection of the virtual machine through machine learning methods and a network intrusion detection tool.
Technical solution: to achieve the above object, the invention provides a virtual machine security detection method based on introspection technology, comprising the following steps:
step 1: acquiring virtual machine state information and process information based on virtual machine introspection, and forming an external information view of the virtual machine;
step 2: using a sliding time window, extracting feature vectors separately from the virtual machine state information and from the process information data in the external information view, to obtain a virtual machine state feature vector and a process information feature vector;
step 3: detecting abnormal virtual machine states at coarse granularity and malicious virtual machine processes at fine granularity, and combining a network traffic intrusion detection tool to detect malicious traffic at the process level, thereby enhancing the detection of malicious processes in the virtual machine.
In step 1, acquiring the virtual machine state information includes obtaining it from both inside and outside the virtual machine. Obtaining the state information from inside the virtual machine includes: acquiring the state information through a daemon process inside the virtual machine, wherein the daemon process uses the Dstat tool to collect the real-time state inside the virtual machine, including CPU usage, memory usage, disk read/write volume and network traffic; meanwhile, the daemon process uses the system commands ps and lsmod to obtain the process and module information inside the virtual machine, respectively, and transmits this information, together with the information collected by Dstat, to the outside of the virtual machine over a socket;
obtaining the state information from outside the virtual machine includes: using the VMM-layer open-source tools Xentop and Libvirt to obtain the CPU time, total disk I/O and total network card I/O of the virtual machine at two time points T1 and T2, and calculating the external state information of the virtual machine from the change in each quantity over the period from T1 to T2 relative to the length of that period.
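The external calculation described here amounts to differencing cumulative counters and dividing by the elapsed time. A minimal Python sketch follows; the `Sample` layout, the field names and the `vcpus` parameter are assumptions made for illustration and are not taken from Xentop or Libvirt.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Cumulative counters read at one time point (hypothetical layout)."""
    timestamp: float   # seconds
    cpu_time: float    # cumulative CPU seconds consumed by the VM
    disk_bytes: int    # cumulative disk I/O in bytes
    net_bytes: int     # cumulative network card I/O in bytes


def external_state(t1: Sample, t2: Sample, vcpus: int = 1) -> dict:
    """Convert two cumulative samples (T1, T2) into rates over the interval."""
    dt = t2.timestamp - t1.timestamp
    if dt <= 0:
        raise ValueError("T2 must be later than T1")
    return {
        # Share of the available CPU time actually used during the interval.
        "cpu_percent": 100.0 * (t2.cpu_time - t1.cpu_time) / (dt * vcpus),
        "disk_bytes_per_s": (t2.disk_bytes - t1.disk_bytes) / dt,
        "net_bytes_per_s": (t2.net_bytes - t1.net_bytes) / dt,
    }
```

In a real deployment the two samples would be filled from whatever counters the VMM-layer tool exposes; only the differencing step is shown here.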
In step 1, the process information of the virtual machine is recovered from outside the virtual machine, and includes information recovered from memory and information recovered from registers;
wherein the information in memory is recovered by polling: 16 kernel structures of the virtual machine are collected and recovered to obtain the static process state, file information and port information;
and the information in registers is recovered by using LibVMI to set INT 3 software interrupts, which force a virtual machine issuing a system call to trap into the virtual machine manager layer; the binary data of the current system call is read from the registers and restored to understandable operating-system-level semantics based on analysis of the system call kernel source code, yielding the current system call information.
Recovering the information in memory specifically includes: extracting the binary data of the module list, process states, files and ports from memory using the LibVMI library and the symbol table of the virtual machine system, and then analyzing each kernel structure in the virtual machine operating system and its offsets, so that the binary data extracted from memory is restored to understandable operating system semantics.
Forming the external information view of the virtual machine includes: intercepting real-time system calls from outside the virtual machine using software interrupts, and, using a sliding time window, combining the dynamic system call information within each 1-second period with the static process state, file information and port information at the end of that period to form an external information view of the virtual machine processes.
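One way to picture this combination step is the sketch below, which groups intercepted system calls into 1-second windows and attaches them to the static snapshot taken at the end of each window. The data layout and the function name `build_process_view` are hypothetical, and non-overlapping windows are used as a simplification of the sliding window.

```python
from collections import defaultdict


def build_process_view(syscalls, snapshots, window=1.0):
    """Merge dynamic syscall events with per-window static snapshots.

    syscalls:  iterable of (timestamp, pid, syscall_name)
    snapshots: {window_index: {pid: {"state": ..., "files": [...], "ports": [...]}}}
    Returns {window_index: {pid: snapshot fields plus a "syscalls" list}}.
    """
    calls = defaultdict(lambda: defaultdict(list))
    for ts, pid, name in syscalls:
        calls[int(ts // window)][pid].append(name)

    view = {}
    for idx, static in snapshots.items():
        # Only processes present in the end-of-window snapshot are kept here;
        # processes that exited mid-window would need separate handling.
        view[idx] = {
            pid: {**info, "syscalls": calls[idx].get(pid, [])}
            for pid, info in static.items()
        }
    return view
```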
Step 2 includes: for the virtual machine state information, the extracted state feature vector covers CPU, network, disk, memory, process and module information; data that fluctuates between upper and lower bounds is stored in its original form, while monotonically increasing data with no upper bound is represented by its increment over a time period, so that every element of the virtual machine state feature vector is numerical.
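A minimal sketch of this rule, assuming each sample is a dictionary of metrics: bounded metrics are copied through, and unbounded counters are replaced by their increment since the previous sample. The key names in the usage example are invented for illustration.

```python
def state_feature_vector(prev, curr, bounded_keys, counter_keys):
    """Build one numeric state feature vector from two consecutive samples.

    bounded_keys: metrics that fluctuate within bounds (kept in original form)
    counter_keys: monotonically increasing counters (stored as increments)
    """
    features = [float(curr[k]) for k in bounded_keys]
    features += [float(curr[k] - prev[k]) for k in counter_keys]
    return features
```

For example, with `bounded_keys=["cpu_percent", "mem_percent"]` and `counter_keys=["net_rx_bytes"]`, a sample pair whose counter moves from 1000 to 1800 contributes 800.0 as the last element.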
Step 2 further includes: for the process information data in the external information view of the virtual machine, the extracted process feature vector covers the state, associated files, associated ports and system calls of the process; that is, the static and dynamic information of the process is extracted within a sliding time window and counted to form each value in the process information feature vector.
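The counting step could look roughly like the following sketch. The state codes and the system call names are placeholders chosen for illustration; the patent's actual feature list is not reproduced in this text.

```python
from collections import Counter

# Hypothetical categories used only for this sketch.
STATES = ["R", "S", "D", "Z", "T"]
SYSCALLS = ["open", "read", "write", "connect", "execve"]


def process_feature_vector(entry):
    """entry: {"state": str, "files": [...], "ports": [...], "syscalls": [...]}

    Layout: one-hot process state, file count, port count, per-syscall counts.
    """
    counts = Counter(entry["syscalls"])
    state_onehot = [1 if entry["state"] == s else 0 for s in STATES]
    return (
        state_onehot
        + [len(entry["files"]), len(entry["ports"])]
        + [counts[name] for name in SYSCALLS]
    )
```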
Step 3 includes: training a virtual machine abnormal-state detection model using the local outlier factor (LOF) algorithm and the virtual machine state feature vectors, and using it to detect abnormal states of the virtual machine at coarse granularity;
and training a virtual machine traffic detection model using the random forest algorithm and the process information feature vectors, which is used for traffic detection and achieves fine-grained detection of malicious processes in the virtual machine.
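To make the coarse-grained step concrete, the sketch below computes local outlier factor (LOF) scores in plain Python from the standard definition (k-distance, reachability distance, local reachability density). It is an illustration, not the patent's implementation: the function name and the choice of `k` are assumptions, and it does not handle duplicate points, which would give a zero reachability sum. The random forest side is not shown.

```python
import math


def lof_scores(points, k=3):
    """Return the LOF score of every point; scores well above 1 suggest outliers."""
    n = len(points)
    if n <= k:
        raise ValueError("need more than k points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist, neighbors = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]           # distance to the k-th nearest neighbor
        k_dist.append(kd)
        neighbors.append([j for j in order if dist[i][j] <= kd])  # ties included

    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [
        sum(lrds[j] for j in neighbors[i]) / (len(neighbors[i]) * lrds[i])
        for i in range(n)
    ]
```

In practice a library implementation, for example scikit-learn's `LocalOutlierFactor` and `RandomForestClassifier`, would likely be preferred over hand-written code.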
When the virtual machine traffic detection model performs traffic detection, the alarm information of the network traffic intrusion detection tool is mapped to a virtual machine through the IP address; if the alarm information contains port information, it is further mapped to a process using the port information gathered during virtual machine process information collection.
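The two-stage mapping (IP address to virtual machine, then port to process) can be sketched as two dictionary lookups. The alert layout and the table names below are assumptions for illustration, not the output format of any particular intrusion detection tool.

```python
def attribute_alert(alert, vm_by_ip, ports_by_vm):
    """Map one intrusion detection alert to (vm_name, pid).

    alert:       {"ip": str, "port": int or None}
    vm_by_ip:    {ip: vm_name}
    ports_by_vm: {vm_name: {port: pid}} built from the introspected port data
    Either element of the result is None when it cannot be resolved.
    """
    vm = vm_by_ip.get(alert["ip"])
    if vm is None:
        return None, None
    port = alert.get("port")
    pid = ports_by_vm.get(vm, {}).get(port) if port is not None else None
    return vm, pid
```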
The invention further includes step 4: according to the administrator's custom settings and the virtual machine security detection results, different levels of response are applied to abnormality alarms on the virtual machine's state, processes and traffic, and migration, suspension and shutdown operations are performed on the virtual machine.
In step 3, malicious process features are collected automatically: a new incremental image file of the virtual machine is created and a virtual machine is started from the incremental image; the process information collection program on the host is then started; the malicious process is then executed through a network connection established between the host and the inside of the virtual machine; when the collection period ends, the virtual machine is shut down automatically and the incremental image is destroyed.
The state and processes of the virtual machine are detected at the host level using machine learning methods; in traffic detection, the alarm information from network intrusion detection is mapped to a virtual machine through the IP address, and, if the alarm information contains port information, to a specific process using the port information gathered during process information collection.
In step 4, different levels of response to the virtual machine security detection results can be applied according to the administrator's custom settings, performing migration, suspension and shutdown operations on the virtual machine.
Beneficial effects: investigation shows that the file, port and system call information of a process characterizes, from different angles, the task the process performs, so malicious processes can be identified from these features; at the same time, virtual machine introspection prevents malicious software from hiding these behaviors. On this basis, compared with the prior art, the detection method provided by the invention has the following advantages:
(1) the invention uses an introspection-based method to collect, recover and correlate the various kinds of process information, so the data is comprehensive and highly reliable;
(2) the invention detects abnormal states and malicious processes of the virtual machine at the host level using machine learning methods, can effectively discover unknown abnormal states and malware variants, and offers stronger isolation;
(3) the invention detects the security condition of the virtual machine at three levels (state, process and network), giving good and more comprehensive coverage. In addition, different levels of response to the detection results can be made according to the administrator's settings.
Drawings
The foregoing and other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
Fig. 1 is a framework diagram of the present invention.
Fig. 2 is a block diagram of the present invention.
Fig. 3 is a virtual machine information collection framework.
Fig. 4 is a view of virtual machine external information.
Fig. 5 is a virtual machine state detection flow diagram.
Fig. 6 is a flow chart of the automated collection of process sample information.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
As shown in Fig. 1 and Fig. 2, the invention provides a virtual machine security detection method based on introspection technology, which includes the following steps:
(1) Virtual machine information acquisition:
The hardware state information of the virtual machine is acquired at the virtual machine manager level through virtual machine layer management tools. Using an open-source library with introspection capability together with software interrupts, the binary data of the virtual machine's process states, files, ports and system calls is collected from memory and registers at the virtual machine manager layer; this binary data is restored to understandable operating-system-level semantics (that is, semantics represented as character strings) by analyzing the kernel source code and kernel structures, and the latter three are associated with processes to form the external information view of the virtual machine. By comparing the internal and external information of the virtual machine, hidden processes and hidden modules can be discovered.
(2) Virtual machine feature processing:
The state and process information of the virtual machine is processed. Data that fluctuates between upper and lower bounds is stored in its original form; monotonically increasing data with no upper bound is represented as increments over a time period. The system call, file and port information of each process is classified and counted, and the process feature vector of the virtual machine is formed from the process state and the count of each class.
(3) Virtual machine security detection:
Historical information and background data are collected to form databases for state detection and process detection, and machine learning models are trained on this data: abnormal virtual machine states are detected using the local outlier factor algorithm, and malicious virtual machine processes are detected using a random forest. In combination with a network intrusion detection tool, the detected alarm information is mapped to a virtual machine through the IP address and to a process through the port.
(4) Virtual machine monitor response:
The administrator can customize the danger coefficient and the response mode for each kind of detection result; based on the security detection results, migration, suspension and shutdown of the virtual machine are carried out automatically, improving the security of the virtualization environment.
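An administrator-defined response policy of this kind could be expressed as a small table of thresholds and actions, as in the sketch below. The threshold values, the risk scores in the range 0 to 1, and the ordering of actions by severity (migrate, then pause, then shut down) are all assumptions made for illustration.

```python
from enum import IntEnum


class Action(IntEnum):
    NONE = 0
    MIGRATE = 1
    PAUSE = 2
    SHUTDOWN = 3


# Hypothetical policy: alert source -> (risk threshold, action to take).
POLICY = {
    "state": (0.8, Action.MIGRATE),
    "traffic": (0.6, Action.PAUSE),
    "process": (0.5, Action.SHUTDOWN),
}


def choose_response(alerts, policy=POLICY):
    """alerts: {source: risk score}; returns the most severe action triggered."""
    chosen = Action.NONE
    for source, score in alerts.items():
        threshold, action = policy.get(source, (float("inf"), Action.NONE))
        if score >= threshold:
            chosen = max(chosen, action)
    return chosen
```

Keeping the policy as data rather than code means the response levels can be changed without touching the detection logic.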
Embodiment:
The virtual machine security detection method based on introspection technology in this embodiment includes the following steps:
1. Virtual machine information acquisition:
The information collection framework based on Virtual Machine Introspection (VMI) comprises two parts: virtual machine state information and process information. At the hardware level a virtual machine is divided up like a physical host, and four components mainly allow it to store data or run: memory, CPU, network card and disk. The overall state recovered from outside mainly concerns the state of these four parts, while the recovered process information mainly resides in memory and the CPU. Because every addition, deletion and modification of disk data is carried out through memory operations, and the static and dynamic process information required for later detection can be obtained when the memory and CPU information is collected, the invention does not recover the specific contents of the disk. Likewise, for network card information, the Snort intrusion detection tool used during traffic detection already includes a sniffer and a logger for network card traffic packets, so this is not examined further here.
Therefore, the collection of virtual machine information is divided into two parts, as shown in Fig. 3:
virtual machine state information: obtained both from the internal daemon process and from outside through the VMM (virtual machine monitor) layer, and mainly comprising memory, CPU, network card and disk state information;
virtual machine process information: the process state, file, port and system call information is obtained by collecting data from the virtual machine memory and CPU registers and recovering its semantics, and the file, port and system call information is associated with the corresponding process, so that process state and behavior are represented comprehensively.
By integrating the information recovered by the two parts, real-time information about the virtual machine is obtained from outside, and the external information view of the virtual machine is constructed.
Acquiring state information of the virtual machine:
The state information of the virtual machine is collected from the inside and the outside at the same time. Internal state collection is mainly completed by a daemon process inside the virtual machine; the data can be used directly as collected, without much recovery work. The daemon process uses the multifunctional Dstat tool, which can replace commands such as vmstat, iostat, netstat and ifstat, to collect the real-time state inside the virtual machine, including CPU usage, memory usage, disk read/write volume and network traffic. Meanwhile, the daemon process uses the system commands ps and lsmod to obtain the process and module information inside the virtual machine, respectively; statistics on this information also serve as part of the virtual machine state features and are transmitted, together with the information collected by Dstat, to the outside of the virtual machine over a socket. Recovering the state information from outside requires acquiring hardware information with VMM-layer open-source tools and then processing it. With Xentop and Libvirt, the relevant interfaces provide information such as the CPU time, total disk I/O and total network card I/O of the virtual machine at time points T1 and T2; more commonly used system state indicators are then obtained by calculating the change in each quantity over the period ΔT (T1 to T2) relative to the length of that period.
Acquiring process information of the virtual machine:
Process information collection recovers process information from outside the virtual machine, mainly the information in memory and in registers. Memory information recovery collects and recovers 16 kernel structures of the virtual machine by polling, to obtain the static process state, file information and port information; register information recovery intercepts each system call as it occurs by setting INT 3 interrupts, and accesses the memory data at the corresponding location in real time to recover the required system call information.
Memory information recovery obtains the state, file and port information of processes by recovering data collected from memory outside the virtual machine. The binary data of the module list, process states, files and ports is extracted from memory using the LibVMI library and the virtual machine system symbol table, and each kernel structure in the virtual machine operating system and its offsets are then analyzed, so that the binary data extracted from memory is restored to understandable operating system semantics. Table 1 lists the 16 kernel structures involved.
TABLE 1
Figure BDA0002220169540000061
Figure BDA0002220169540000071
Meanwhile, LibVMI is used to set INT 3 software interrupts, forcing a virtual machine that issues a system call to trap into the virtual machine manager layer; the binary data of the current system call is read from the registers and restored to understandable operating-system-level semantics based on analysis of the system call kernel source code, yielding the current system call information. To record the behavior of the virtual machine, the 9 important system calls in Table 2 are selected, and their parameters are recovered in more detail.
TABLE 2
Figure BDA0002220169540000072
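As an illustration of restoring register contents to system call semantics, the sketch below decodes a register snapshot under the x86-64 Linux convention (system call number in RAX, arguments in RDI, RSI, RDX, R10, R8, R9). It assumes the trap is taken at the system call entry point, where RAX still holds the number. The particular calls listed are an arbitrary selection for the sketch and are not claimed to match the calls in Table 2, whose contents are not reproduced in this text.

```python
# x86-64 Linux system call numbers (illustrative subset, assumed selection).
SYSCALL_NAMES = {
    0: "read", 1: "write", 2: "open", 3: "close",
    41: "socket", 42: "connect", 59: "execve",
    62: "kill", 231: "exit_group",
}
ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def decode_syscall(regs):
    """regs: {register_name: int} captured when the guest trapped.

    Returns the call name, the six raw argument values, and CR3 (if present)
    so the call can later be attributed to a process.
    """
    number = regs["rax"]
    return {
        "name": SYSCALL_NAMES.get(number, f"unknown_{number}"),
        "args": [regs.get(r, 0) for r in ARG_REGISTERS],
        "cr3": regs.get("cr3"),
    }
```

Pointer arguments such as file names would still have to be read from guest memory; only the register-level decoding is shown.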
The invention effectively associates the various kinds of process information: files and port information are associated with processes according to the file descriptor structures in the process module; when the virtual machine issues a system call, the process address information stored in the CR3 register is compared with the process page directory pointer (PGD) recovered from memory, thereby associating the system call information with a process; finally, the virtual machine state obtained from the virtual machine manager layer and the detailed process information are integrated to form the external information view of the virtual machine shown in Fig. 4.
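The CR3-to-process comparison can be sketched as a lookup over the page directory addresses recovered for each process. The sketch assumes the recovered PGD values have already been translated to guest physical addresses, and it masks the low 12 bits of CR3, which on x86-64 carry flags or a PCID rather than address bits. The function and parameter names are hypothetical.

```python
PAGE_MASK = ~0xFFF  # clears the low 12 bits (flags/PCID) of a CR3 value


def pid_for_syscall(cr3, pgd_phys_by_pid):
    """Return the PID whose page directory base matches CR3, or None.

    pgd_phys_by_pid: {pid: physical address of that process's page directory}
    """
    base = cr3 & PAGE_MASK
    for pid, pgd_phys in pgd_phys_by_pid.items():
        if (pgd_phys & PAGE_MASK) == base:
            return pid
    return None
```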
The process information recovered from the task list structure in memory is combined with system call interception to form an external view of the in-memory process list, and list information for exiting processes is maintained through the sys_kill and sys_exit_group calls. Together with the recovered memory process list, this forms a real-time external process list, which is compared with the process information reported inside the virtual machine to discover maliciously hidden processes.
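The comparison between the external and internal process lists can be sketched as a set difference. The data layout below (dictionaries of pid to name, and a set of recently exited pids) is an assumption made for illustration; treating recently exited processes as legitimate absences is one possible reading of how the sys_kill and sys_exit_group records are used.

```python
# Hypothetical sketch: flag processes visible from outside the VM but absent
# from the list the guest reports about itself.

def find_hidden_processes(external, internal, recently_exited):
    """external / internal: dicts of pid -> process name.
    recently_exited: pids seen terminating (e.g. via sys_kill or
    sys_exit_group), which may legitimately be missing from the internal list."""
    hidden = {}
    for pid, name in external.items():
        if pid in internal or pid in recently_exited:
            continue
        hidden[pid] = name
    return hidden


# Illustrative data only.
external_view = {1: "init", 742: "sshd", 1337: "kworker_x", 2001: "ls"}
internal_view = {1: "init", 742: "sshd"}  # e.g. parsed from `ps` inside the guest
exited = {2001}                           # finished between the two snapshots
hidden = find_hidden_processes(external_view, internal_view, exited)
```

Here pid 1337 is reported as hidden, while pid 2001 is not, because its exit was observed.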
Virtual machine feature processing
To form the features of the virtual machine state and processes, the previously collected state and process information is processed. To better characterize and unify these features, a numerical representation method based on increments and counts and a time-slice-based method for associating process features are adopted.
Time sliding windows are used to extract features separately from the collected virtual machine state information and from the process information in the external information view of the virtual machine. The feature vector extracted from the state information collected outside and inside the virtual machine covers software and hardware information such as CPU, network, disk, memory, processes and modules (as shown in Table 5), and each value in the virtual machine state feature vector is represented by the increment of the state information within each sliding time window. The feature vector extracted from the process information in the external view covers the state, associated files, associated ports and system calls of each process (as shown in Table 6); the static and dynamic information of the process is extracted within a sliding time window and counted to form each value in the process information feature vector.
Because memory information is collected by polling, the invention uses an incremental representation method to unify features that have different units and change in different ways. Fluctuating values with upper and lower bounds are used directly as feature values; for information that increases continuously without an upper bound, the difference between the current collection and the previous collection is used as the feature value. Files, ports and system call information within a process are handled by classified counting, and the file descriptors associated with a process are classified into the 12 types shown in Table 3.
TABLE 3
Figure BDA0002220169540000081
Figure BDA0002220169540000091
The system calls are classified into 9 classes as shown in table 4 according to the function.
TABLE 4
Figure BDA0002220169540000092
The virtual machine state information is processed to form features such as those shown in Table 5, which essentially encompass aspects of the virtual machine state from both the hardware and software levels.
TABLE 5
Figure BDA0002220169540000093
Figure BDA0002220169540000101
The state features describe the condition of the virtual machine from different aspects, but they have different units and change in different ways, so they cannot be combined directly into a uniform feature vector for detection. The features are divided into two types according to the range of their effective values. The first type consists of values with upper and lower bounds, such as CPU utilization, which fluctuate within a certain range while the virtual machine is running. The second type consists of values without an upper bound, such as the number of page replacements and the number of interrupts, which increase continuously during monitoring. Values of the first type can be used directly as features. Values of the second type accumulate all previous states and cannot accurately represent the current condition of the virtual machine, so they are processed with an increment-based representation method. If the value of a feature at the previous collection is $T_1$ and its value at the current collection is $T_2$, the increment-based representation method represents the current feature value $T$ as the difference between $T_2$ and $T_1$, as shown in formula (1).
$$T = T_2 - T_1 \tag{1}$$
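A minimal sketch of the two feature types follows. The feature names and sample numbers are hypothetical; only the rule itself (bounded values used directly, unbounded counters replaced by the increment T = T2 - T1) comes from the description.

```python
# Hypothetical sketch of the increment-based representation in formula (1).

def to_feature_vector(previous, current, bounded_keys):
    """Bounded values are used as-is; unbounded counters become T = T2 - T1."""
    features = {}
    for key, value in current.items():
        if key in bounded_keys:
            features[key] = value                  # first type: already comparable
        else:
            features[key] = value - previous[key]  # second type: increment
    return features


# Illustrative snapshots from two consecutive polls.
prev = {"cpu_util": 0.35, "page_swaps": 10_500, "interrupts": 880_000}
curr = {"cpu_util": 0.42, "page_swaps": 10_620, "interrupts": 883_250}
vec = to_feature_vector(prev, curr, bounded_keys={"cpu_util"})
```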
For example, to obtain the CPU utilization, the CPU time $CT_1$ at time $T_1$ and the CPU time $CT_2$ at time $T_2$ are collected, and the CPU utilization over this period is then calculated by formula (2):
Figure BDA0002220169540000102
For the disk state, the disk I/O rate is a dynamically changing quantity that cannot be obtained directly from Xen; it is derived from the total disk I/O amount that Xen provides. A short-interval calculation method is therefore designed, in which the current disk utilization is calculated from the difference in total I/O over a time interval $\Delta T$. Define the total amount read from disk $RD_1$ and the total amount written to disk $WD_1$ at the start time $T_1$ of the period, and the total amount read $RD_2$ and the total amount written $WD_2$ at the end time $T_2$. The disk I/O read rate (DISK read rate) and write rate (DISK write rate) are then calculated according to formulas (3) and (4):
Figure BDA0002220169540000103
Figure BDA0002220169540000104
The state of the network port is handled in the same way as the disk state: it is derived from static information obtained through the Xen interface, namely the total uplink and downlink traffic. Define the uplink total $UN_1$ and the downlink total $DN_1$ at the start time $T_1$ of the $\Delta T$ period, and the uplink total $UN_2$ and the downlink total $DN_2$ at the end time $T_2$. The uplink rate (NET uprate) and downlink rate (NET downrate) of the network port I/O are then calculated by formulas (5) and (6):
Figure BDA0002220169540000105
Figure BDA0002220169540000111
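Formulas (2) to (6) appear only as images in this text, so the sketch below assumes the simplest form consistent with the surrounding definitions: each rate is the change in a cumulative total divided by the elapsed time T2 - T1. Any scaling factor in the original formulas (for example a percentage or a per-vCPU normalization) is not reproduced, and the sample numbers are made up.

```python
# Hypothetical sketch: derive rates from cumulative totals sampled at T1 and T2.

def rate(total_1, total_2, t1, t2):
    """Change in a cumulative counter per unit time over the interval [t1, t2]."""
    if t2 <= t1:
        raise ValueError("T2 must be later than T1")
    return (total_2 - total_1) / (t2 - t1)


t1, t2 = 100.0, 102.0                         # sample times in seconds

cpu_util = rate(50.0, 51.0, t1, t2)           # CT1, CT2: CPU seconds consumed
disk_read_rate = rate(4096, 12288, t1, t2)    # RD1, RD2: total bytes read
disk_write_rate = rate(1000, 1000, t1, t2)    # WD1, WD2: total bytes written
net_up_rate = rate(2000, 5000, t1, t2)        # UN1, UN2: total uplink bytes
net_down_rate = rate(7000, 7800, t1, t2)      # DN1, DN2: total downlink bytes
```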
The process information of the virtual machine is processed by integrating the static and dynamic information of each process, finally forming features with the process as the unit, as shown in Table 6.
TABLE 6
Figure BDA0002220169540000112
Because static and dynamic data are collected at different times, they are associated by counting process-related information within a time window. Suppose T1 is the previous collection point and T2 is the current collection point. The process features at T2 comprise the process state, files and associated port information at T2, together with the system call information of each process from T1 to T2. The time window thus effectively unifies the static and dynamic information of a process.
During recovery of process information, it was found that some system calls cannot be matched to a process at time T2. This is because some processes end between T1 and T2, so their information can no longer be obtained when static process information is collected at T2. For processes that began before T1, static information has already been collected, but short-lived processes that begin and end within the window are never captured by static collection, leaving a gap in later process detection. To solve this problem, semantic recovery of the two system calls sys_kill and sys_exit_group is used to identify the processes that end in each time period, and the system call information related to those processes is associated with them and used as the feature information of the corresponding process in later detection.
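The time-window association, including the handling of processes that exit before the static snapshot at T2, can be sketched as follows. The data shapes and the placeholder record `{"state": "exited"}` are assumptions for illustration; the source does not specify how exited processes are represented.

```python
# Hypothetical sketch: merge static info polled at T2 with syscalls seen in (T1, T2].
from collections import Counter, defaultdict


def build_window_features(static_at_t2, syscalls, exited_pids):
    """static_at_t2: pid -> static info dict polled at T2.
    syscalls: list of (pid, syscall_name) intercepted between T1 and T2.
    exited_pids: pids whose termination was observed in the window."""
    counts = defaultdict(Counter)
    for pid, name in syscalls:
        counts[pid][name] += 1

    features = {}
    for pid, calls in counts.items():
        if pid in static_at_t2:
            static = static_at_t2[pid]
        elif pid in exited_pids:
            static = {"state": "exited"}  # no snapshot exists for this pid
        else:
            continue                      # unattributable events are dropped here
        features[pid] = {"static": static, "syscalls": dict(calls)}

    # Processes that made no calls in the window still get a feature record.
    for pid, static in static_at_t2.items():
        features.setdefault(pid, {"static": static, "syscalls": {}})
    return features


# Illustrative data only.
static = {742: {"state": "sleeping"}, 800: {"state": "running"}}
calls = [(742, "sys_read"), (742, "sys_read"),
         (2001, "sys_open"), (2001, "sys_exit_group"), (9, "sys_write")]
features = build_window_features(static, calls, exited_pids={2001})
```

In this example pid 2001 started and ended inside the window, so it has no static snapshot, but its system calls are still kept as features.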
To express all kinds of process information uniformly as a feature vector, files, ports and system call information are processed, and all associated file descriptors are classified into the categories shown in Table 3.
Meanwhile, to improve the speed of real-time detection and reduce the dimension of the process feature vectors, all intercepted system calls are classified into nine categories according to function, as shown in Table 4, and the system call feature vector of a process is formed from the number of system calls that occur in each category.
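Classified counting can be sketched as below. Table 4 is available only as an image, so the nine category names and the syscall-to-category mapping here are hypothetical placeholders, not the classification used in the patent; only the counting scheme follows the description.

```python
# Hypothetical sketch: turn a window of intercepted syscalls into a fixed-length
# count vector with one slot per functional category.

# Placeholder category names; the real nine classes are those of Table 4.
CATEGORIES = ["file", "process", "network", "memory", "signal",
              "ipc", "time", "permission", "other"]

# Placeholder mapping covering a few well-known Linux system calls.
SYSCALL_CATEGORY = {
    "sys_open": "file", "sys_read": "file", "sys_write": "file",
    "sys_execve": "process", "sys_exit_group": "process",
    "sys_connect": "network", "sys_socket": "network",
    "sys_mmap": "memory", "sys_kill": "signal",
}


def syscall_feature_vector(syscalls):
    """Count syscalls per category; unmapped names fall into 'other'."""
    counts = dict.fromkeys(CATEGORIES, 0)
    for name in syscalls:
        counts[SYSCALL_CATEGORY.get(name, "other")] += 1
    return [counts[c] for c in CATEGORIES]


vec = syscall_feature_vector(
    ["sys_open", "sys_read", "sys_read", "sys_connect", "sys_kill", "sys_foo"]
)
```

The vector length stays at nine regardless of how many distinct system calls occur, which is the dimensionality reduction the paragraph above refers to.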
Since the value ranges of the virtual machine state features and the process features differ, the features are standardized in a preprocessing step. A scaler standardization model is fitted with trainX as the input features; each input feature trainX_1 can then be converted by the model into a standardized feature scaledX_1. The specific implementation code is as follows:
Figure BDA0002220169540000121
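The implementation code is reproduced only as an image in this text. The sketch below is therefore an assumption: it shows a plain z-score scaler (subtract the column mean, divide by the column standard deviation) written with the standard library, reusing the names trainX, trainX_1 and scaledX_1 from the description. The `Scaler` class itself is hypothetical; a library scaler such as scikit-learn's `StandardScaler` follows the same fit-then-transform pattern.

```python
# Hypothetical sketch: fit a standardization model on trainX, then scale one row.
import math


class Scaler:
    """Minimal z-score scaler with a fit/transform interface."""

    def fit(self, rows):
        n = len(rows)
        cols = list(zip(*rows))
        self.mean = [sum(c) / n for c in cols]
        self.std = []
        for c, m in zip(cols, self.mean):
            s = math.sqrt(sum((x - m) ** 2 for x in c) / n)
            self.std.append(s if s > 0 else 1.0)  # constant column: avoid /0
        return self

    def transform(self, row):
        return [(x - m) / s for x, m, s in zip(row, self.mean, self.std)]


# Illustrative data: two features with very different value ranges.
trainX = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaler = Scaler().fit(trainX)
trainX_1 = trainX[0]
scaledX_1 = scaler.transform(trainX_1)
```

After scaling, both columns have zero mean and unit variance over the training set, so features with large raw magnitudes no longer dominate distance-based detection.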
Third, virtual machine security detection
Machine learning methods are used to detect the state and processes of the virtual machine: the local outlier factor (LOF) method is used for virtual machine state detection, and the random forest method is used for virtual machine process detection. Meanwhile, to make up for the lack of external traffic detection, the Snort open-source network intrusion detection tool is incorporated to detect virtual machine traffic.
The virtual machine state contains complex information, and the boundary of what counts as abnormal is fuzzy. Detection therefore requires an anomaly detection model. Because the states a virtual machine exhibits at different times and under different workloads differ greatly, treating all data as a single homogeneous set is not suitable. The local outlier factor method is therefore selected to build the virtual machine abnormal state detection model; it considers the samples as several local parts, which achieves a better detection effect. Sample data are screened iteratively in the training phase, and the overall state training and detection flow is shown in fig. 5.
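To illustrate why a local method suits this data, the following is a brute-force sketch of the standard local outlier factor computation using only the standard library. It is not the patent's training procedure (the iterative sample screening of fig. 5 is not reproduced), the sample points are made up, and the function assumes no duplicate points.

```python
# Hypothetical sketch: local outlier factor (LOF) by brute force, O(n^2).
import math


def lof_scores(points, k):
    """Return the LOF of every point; values well above 1 indicate outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]            # distance to the k-th neighbor
        k_dist.append(kd)
        neighbors.append([j for j in order if dist[i][j] <= kd])  # keep ties

    def reach(i, j):
        """Reachability distance of point i with respect to neighbor j."""
        return max(k_dist[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [len(neighbors[i]) / sum(reach(i, j) for j in neighbors[i])
           for i in range(n)]

    # LOF: mean density of the neighbors relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


# Four state vectors forming a tight cluster and one far from it.
states = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lof_scores(states, k=2)
```

Because each score compares a point only with its own neighborhood, clusters produced by different workloads can each count as normal, while a state far from every cluster receives a high factor.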
Virtual machine process detection has more background data available (malware samples), so the virtual machine process detection model is trained on data gathered through an automated collection procedure. To improve detection accuracy and reduce the influence of noise, a random forest ensemble classifier is used, which gives better results.
To ensure reliable data collection and prevent cross-infection between malware samples, the invention implements automated process information collection, as shown in fig. 6. In the data collection stage an incremental image is used: each time the virtual machine is started, the information collection program and one sample are run, each sample is allowed a certain amount of running time, after which collection is terminated and the virtual machine is shut down and restored to its previous state. Because a benign or malicious sample may spawn child processes or perform other unknown operations while running, and that process information is also recorded during collection, each program sample yields at least one sample record.
Although collection and detection of the virtual machine state and processes are performed outside the virtual machine, both target information internal to the virtual machine, and external traffic detection is lacking. For completeness and comprehensive security detection, the invention incorporates the Snort open-source network intrusion detection tool to detect virtual machine traffic. Based on a Snort traffic alert, the alert's IP information is mapped to a particular virtual machine, and the alert's port information is then matched against the port information gathered during virtual machine process collection, thereby detecting traffic at the level of virtual machine processes.
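The two-step mapping from an alert to a virtual machine and then to a process can be sketched with plain lookups. The alert is modeled as an already-parsed dictionary; the field names, VM names and tables are hypothetical, and no Snort output format is assumed.

```python
# Hypothetical sketch: resolve an IDS alert to a VM by IP, then to a process by port.

def map_alert_to_process(alert, vm_by_ip, ports_by_vm):
    """alert: dict with 'ip' and an optional 'port' taken from a traffic alert.
    vm_by_ip: ip -> VM name.
    ports_by_vm: VM name -> {port: pid}, from process information collection."""
    vm = vm_by_ip.get(alert["ip"])
    if vm is None:
        return None                      # alert does not concern a monitored VM
    pid = None
    port = alert.get("port")
    if port is not None:
        pid = ports_by_vm.get(vm, {}).get(port)
    return {"vm": vm, "pid": pid}


# Illustrative data only.
vm_by_ip = {"10.0.0.5": "vm-web", "10.0.0.6": "vm-db"}
ports_by_vm = {"vm-web": {80: 742, 4444: 1337}}

with_port = map_alert_to_process({"ip": "10.0.0.5", "port": 4444}, vm_by_ip, ports_by_vm)
without_port = map_alert_to_process({"ip": "10.0.0.6"}, vm_by_ip, ports_by_vm)
foreign = map_alert_to_process({"ip": "192.168.1.1", "port": 22}, vm_by_ip, ports_by_vm)
```

An alert without port information still identifies the virtual machine, with the process left unresolved.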
Virtual machine monitoring response:
Through this part, the administrator can view the state and process information of the virtual machine, set the alarm thresholds for virtual machine state, process and traffic, and customize responses for different virtual machines and different detection results.
The detection results raise alarms for state, process and traffic anomalies in six categories: state anomaly, hidden process, hidden module, system call hijacking, process anomaly and network anomaly. Each category carries different alarm information, as shown in Table 7. The server obtains this information by querying the database in real time; the data are written to the database by a detection program running on the host according to the real-time detection results.
TABLE 7
Figure BDA0002220169540000131
The administrator may set a threshold for each response policy of an individual virtual machine, and the system suspends, migrates or shuts down the virtual machine according to the thresholds set. For example, the threshold for abnormal virtual machine state may be a local outlier factor value, the threshold for abnormal processes may be the number of abnormal processes, and the threshold for abnormal network traffic may be the number of Snort alerts. Based on earlier security detection results and the administrator's settings, each virtual machine has its own risk coefficients for state, process and traffic detection. The administrator can set different thresholds for the three response strategies, and when the configured conditions are reached the system automatically executes the corresponding response operation, improving the security of the virtual machines on the same host.
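A threshold-driven response policy of this kind can be sketched as a small lookup. The pairing of each detection type with a particular action, and the threshold values, are hypothetical examples of an administrator's configuration, not values given in the source.

```python
# Hypothetical sketch: pick response actions once per-VM thresholds are reached.

def choose_responses(policy, lof_value, abnormal_processes, snort_alerts):
    """policy: detection kind -> (threshold, action).
    Returns the actions whose thresholds are met, in policy order."""
    observed = {
        "state": lof_value,             # local outlier factor of the VM state
        "process": abnormal_processes,  # number of abnormal processes
        "traffic": snort_alerts,        # number of traffic alerts
    }
    actions = []
    for kind, (threshold, action) in policy.items():
        if observed[kind] >= threshold:
            actions.append(action)
    return actions


# Example per-VM configuration (values are illustrative).
policy = {
    "state": (2.5, "pause"),
    "process": (3, "migrate"),
    "traffic": (10, "shutdown"),
}
actions = choose_responses(policy, lof_value=3.1, abnormal_processes=1, snort_alerts=12)
```

Keeping the policy as per-VM data rather than code lets different virtual machines carry different thresholds, as the paragraph above describes.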
The present invention provides a virtual machine security detection method based on introspection technology, and there are many methods and ways to implement this technical solution. The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention. All components not specified in the present embodiment can be realized with the prior art.

Claims (10)

1. A virtual machine security detection method based on introspection technology is characterized by comprising the following steps:
step 1: acquiring virtual machine state information and process information based on a virtual machine introspection technology, and forming a virtual machine external information view;
step 2: respectively extracting the characteristic vectors of the virtual machine state information and the process information data in the external information view of the virtual machine by using a time sliding window to obtain a virtual machine state information characteristic vector and a process information characteristic vector;
step 3: detecting the abnormal state of the virtual machine at coarse granularity and malicious processes of the virtual machine at fine granularity, and combining a network traffic intrusion detection tool to detect malicious traffic of processes and enhance the detection of malicious processes in the virtual machine.
2. The method of claim 1, wherein in step 1, the collecting of the state information of the virtual machine comprises obtaining the state information of the virtual machine from the inside and the outside of the virtual machine, wherein the obtaining of the state information of the virtual machine from the inside of the virtual machine comprises: acquiring state information of the virtual machine through a daemon process in the virtual machine, wherein the daemon process acquires a real-time state in the virtual machine by using a Dstat tool, and the real-time state comprises CPU (Central processing Unit) usage, memory usage, disk read-write quantity and network flow; meanwhile, the daemon process obtains the information of the internal process and the module of the virtual machine by using system commands ps and lsmod respectively, and transmits the information and the information collected by a Dstat tool to the outside of the virtual machine in a socket mode;
the obtaining of the state information of the virtual machine from outside the virtual machine includes: CPU time, total disk I/O amount and total network card I/O amount of the virtual machines of two time points T1 and T2 are obtained by using VMM layer open source tools Xentop and Libvirt, and external state information of the virtual machines is obtained by calculating the variable quantity of various data and the proportion of time periods in the time period from T1 to T2.
3. The method according to claim 2, wherein in step 1, the collecting of the virtual machine process information is recovering the process information from outside the virtual machine, including recovering information in the memory and information in the register;
wherein, the information in the memory is recovered by acquiring and recovering 16 kernel structures of the virtual machine through polling so as to obtain a static process state, file information and port information;
and the information in the register is recovered by setting INT 3 software interrupt by using LibVMI, forcing the virtual machine with the system call to be trapped in a virtual machine manager layer, reading the binary data of the current system call through the register, and recovering the binary data into understandable operating system level semantics according to the analysis of the kernel source code of the system call to obtain the current system call information.
4. The method according to claim 3, wherein the recovering information in the memory is to obtain static process state, file information, and port information by polling collection and recovering of 16 kernel structures of the virtual machine, and specifically comprises: binary data of a module list, a process state, a file and a port are extracted from a memory through a library LibVMI and a symbol table of a virtual machine system, and then each kernel structure and offset thereof in the virtual machine operating system are analyzed, so that the binary data extracted from the memory are restored into understandable operating system semantics.
5. The method of claim 4, wherein said forming a virtual machine external information view comprises: real-time system call is intercepted from the outside of the virtual machine by software interruption, and dynamic system call information in every 1 second is combined with static process state, file information and port information at the end of the period of time by a method of sliding a time window to form an external information view of the virtual machine process.
6. The method of claim 5, wherein step 2 comprises: for the virtual machine state information, the extracted virtual machine state information characteristic vector comprises CPU, network, disk, memory, process and module information, and the data which has fluctuation and upper and lower bounds in the virtual machine state information characteristic vector is stored in an original form; for the ascending data without upper bound, the ascending data is expressed by a method of increment in a time period, and various numerical virtual machine state information characteristic vectors are formed.
7. The method of claim 6, wherein step 2 comprises: for the process information data in the external information view of the virtual machine, the extracted process information characteristic vector comprises the state, the associated file, the associated port and the system call of the process; that is, the static and dynamic information of the process is extracted in a sliding time window and counted to form each numerical value in the process information characteristic vector.
8. The method of claim 7, wherein step 3 comprises: training a virtual machine abnormal state detection model by using a local abnormal factor algorithm and a virtual machine state information characteristic vector, and detecting the abnormal state of the virtual machine to realize the detection of the abnormal state of the virtual machine with coarse granularity;
and training a virtual machine flow detection model by using a random forest algorithm and a process information characteristic vector for carrying out flow detection and realizing fine-grained malicious process detection of the virtual machine.
9. The method of claim 8, wherein when the virtual machine traffic detection model performs traffic detection, alarm information of a network traffic intrusion detection tool is mapped to the virtual machine through an IP address; and if the alarm information contains port information, combining the port information in the process information acquisition of the virtual machine, and corresponding the alarm information to a process.
10. The method of claim 9, further comprising step 4: according to the administrator's custom settings and the virtual machine security detection results, responses of different levels are carried out for abnormal alarms of the state, process and traffic of the virtual machine, and migration, pause and shutdown operations are performed on the virtual machine.
CN201910930547.4A 2019-09-29 2019-09-29 Virtual machine safety detection method based on introspection technology Active CN110865866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910930547.4A CN110865866B (en) 2019-09-29 2019-09-29 Virtual machine safety detection method based on introspection technology

Publications (2)

Publication Number Publication Date
CN110865866A true CN110865866A (en) 2020-03-06
CN110865866B CN110865866B (en) 2022-04-05

Family

ID=69652481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910930547.4A Active CN110865866B (en) 2019-09-29 2019-09-29 Virtual machine safety detection method based on introspection technology

Country Status (1)

Country Link
CN (1) CN110865866B (en)

Also Published As

Publication number Publication date
CN110865866B (en) 2022-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant