CN110618854B - Virtual machine behavior analysis system based on deep learning and memory mirror image analysis - Google Patents

Info

Publication number
CN110618854B
Authority
CN
China
Prior art keywords
memory
mirror image
virtual machine
characteristic
executable file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910772362.5A
Other languages
Chinese (zh)
Other versions
CN110618854A (en
Inventor
吴春明
陈双喜
王婉飞
姜鑫悦
吴安邦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201910772362.5A
Publication of CN110618854A
Application granted
Publication of CN110618854B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45587 Isolation or security of virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a virtual machine behavior analysis system based on deep learning and memory mirror image analysis. The system acquires memory mirror image data and delta-encodes it, extracts feature point information from the encoded memory map, trains a neural network on the extracted feature information to obtain a classifier, and finally runs the neural network with that classifier to analyze unknown virtual machine behavior. The invention is simple to operate, easy to implement and convenient to modularize. It has a wide range of application: it can detect various attack modes, including known and unknown attacks, and its detection performance is not affected even if an attacker lies dormant for a period of time before attacking. In addition, the invention offers good robustness, reliability and usability on different system platforms.

Description

Virtual machine behavior analysis system based on deep learning and memory mirror image analysis
Technical Field
The invention belongs to the field of wireless network security, in particular to mimicry active defense, and relates to a virtual machine behavior analysis system based on deep learning and memory mirror image analysis.
Background
Virtualized cloud platforms are an important part of cloud computing. A virtualized cloud platform runs a plurality of operating systems simultaneously on the same cloud platform, each with an independent operating space. Running a plurality of virtual servers on one physical server improves machine utilization and thereby reduces hardware purchasing costs, which makes virtualization an important way to build a green data center. A cloud platform based on virtual machines lets users build their own business environments independently, runs stably, offers good scalability and mobility, and is widely applied in finance, retail, digital marketing, education, government and enterprise.
The structure of a virtualized cloud platform is open, which gives rise to a series of security problems related to virtual machines. Resource data and applications running in a virtual machine are vulnerable to intruders. Virtual machines therefore need more security mechanisms before large-scale cloud services can be deployed more quickly. The primary problem is how to judge the behavior of a virtual machine accurately and in real time and to determine whether it is under malicious attack.
At present, methods for judging the running state of a virtual machine are based on network traffic data, on logs, or on prior knowledge. The method based on network traffic data detects whether the virtual machine is under malicious attack by judging whether the data received by the virtual machine's network card contains malicious packets. This method requires that the communication protocol can be parsed, so it cannot cope with unknown protocols. In addition, processing a large number of data packets makes the method computationally expensive. The method based on log analysis judges whether the virtual machine is under malicious attack by analyzing its system log. However, logs lag behind events, and the system must identify a series of activities and actions before it can determine that an intrusion has occurred, which is very unfavorable for stopping an intrusion in time. The method based on prior knowledge requires known attack behaviors and cannot cope with unknown vulnerabilities, unknown backdoors or unknown attacks.
To guarantee the security of virtual machines in real time, a fast and effective virtual machine behavior analysis method that does not depend on a vulnerability library or an attack library is urgently needed, so as to improve the accuracy and efficiency of threat discovery and ensure the reliability, availability and security of virtual machines.
Disclosure of Invention
The invention aims to provide a virtual machine behavior analysis system based on deep learning and memory mirror image analysis, aiming at the defects of the prior art. The method and the system aim at internal attack and external attack, known attack and unknown attack in the network, ensure the safety of the virtual machine platform, give early warning to unknown threats in time, correctly judge the behavior of the virtual machine in real time, and improve the safety, reliability and usability of the cloud virtual machine.
The purpose of the invention is realized by the following technical scheme: a virtual machine behavior analysis system based on deep learning and memory mirror image analysis comprises the following steps:
(1) acquiring memory mirror image data, comprising the following substeps:
(1.1) at an initial time t_0, acquiring initial memory mirror image data by using a memory forensics tool to obtain an initial memory.
(1.2) at an arbitrary time t_0 + Δt, on VirtualBox and VMware virtualization platforms, automatically sampling, according to the memory management mechanisms of the different operating systems, the memory mirror image data of each heterogeneous executor at the current moment under the non-attacked condition and under the attacked condition, to obtain the current memory, namely a normal sample and a malicious sample.
(2) Performing delta encoding, comprising the sub-steps of:
(2.1) running a memory forensics tool and, for the initial memory obtained in the step (1.1), using the pslist and dlllist commands to determine the EXE-type executable files and the DLL-type dynamic link library list in the initial memory, respectively;
(2.2) for the current memory obtained in the step (1.2), running the pslist and dlllist commands of the memory forensics tool to determine the EXE-type executable files and the DLL-type dynamic link library list in the current memory, respectively;
(2.3) analyzing the EXE type executable file and the DLL type dynamic link library list obtained in the steps (2.1) and (2.2), and determining an executable file which is in the current memory but not in the initial memory, namely a new executable file;
(2.4) generating a prediction memory for each new executable file according to the initial memory, comprising the following substeps:
(2.4.1) determining the process ID of each new executable file, and simultaneously determining the base address of the process in the virtual memory address space;
(2.4.2) for the process of each new executable file, running the memmap command of the memory forensics tool on the current memory, according to the process base address from the step (2.4.1), to extract the mapping between the process's virtual memory and physical memory;
(2.4.3) copying the new executable file from the virtual disk to the initial memory by executing the following two steps for each virtual memory page of the new executable file: first, if the virtual memory page is in the current memory, copying the new executable file's page to the initial memory by using the mapping between virtual memory and physical memory extracted in the step (2.4.2); then, recording the page copy information, including the source page position of the virtual memory page, the target page position in physical memory and the page length; the prediction memory is finally generated;
(2.5) outputting header information, including path information of the new executable file to be loaded and page copy information of all the new executable files extracted in the step (2.4.3);
(2.6) using the prediction memory generated in the step (2.4) as the source and the current memory as the comparison object, and encoding with xdelta3 to obtain the memory map of the encoded current memory mirror image data; M and N respectively denote the number of rows and columns of the memory map, and I(i, j) = (a, b, c) denotes the element in row i and column j of the memory map, where 0 ≤ i < M, 0 ≤ j < N, a, b and c are 32-bit floating-point numbers, and I(i, j) is a three-dimensional vector;
(3) extracting the memory map feature point information obtained in the step (2.6), wherein the memory map feature point information comprises feature point positions, feature point sizes and feature strength of the feature points, and the method comprises the following substeps:
(3.1) constructing a Hessian matrix, specifically comprising the following steps: calculating a determinant of a Hessian matrix H (i, j) corresponding to each element in the memory map as a characteristic value of the element, wherein the calculation formula is as follows:
$$\det(H(i,j)) = D_{ii}\cdot D_{jj} - 0.9\,D_{ij}\cdot D_{ij}$$
wherein $D_{ii} = I(i+1,j) + I(i-1,j) - 2I(i,j)$, $D_{jj} = I(i,j+1) + I(i,j-1) - 2I(i,j)$, $D_{ij} = I(i+1,j) + I(i,j-1) - 2I(i,j)$;
(3.2) constructing a scale space by adopting a SURF mode: firstly, filtering an original image of a memory map by adopting a 9 x 9 box filter to be used as a bottom image; then gradually increasing the size of the box filter, and continuously filtering the original image of the memory map; finally, obtaining filter response graphs of different scales and constructing a scale space; the scale space has 4 layers, and the scaling ratio between the layers is 2;
(3.3) accurately positioning the characteristic points, specifically: in each 3 × 3 × 3 local region, performing non-maximum suppression on the scale space constructed in the step (3.2); comparing each element in the scale space with the characteristic values of 26 elements of the three-dimensional neighborhood of the element, wherein the elements with the characteristic values larger or smaller than the surrounding 26 elements are taken as characteristic points, and recording the positions (i, j) of the characteristic points and the scale s;
(3.4) determining map feature points and feature vectors according to the threshold, specifically: comparing the characteristic value of each characteristic point obtained in the step (3.3) under the corresponding scale with a preset threshold value, and if the corresponding characteristic value is smaller than the preset threshold value, the characteristic point is not taken as a final characteristic point; if the corresponding characteristic value is larger than or equal to a preset threshold value, taking the characteristic point as a final characteristic point, and expressing the characteristic vector as [ i, j, s, det (H (i, j, s)) ]; wherein, i and j are the line number and the column number of the final characteristic point in the memory map, s is the filter scale corresponding to the final characteristic point, and det (H (i, j, s)) is the characteristic value of the final characteristic point under the scale s;
(3.5) counting the feature vectors, specifically: determining the source of each feature vector obtained in the step (3.4), the source being either the memory mirror image data under the non-attacked condition or the memory mirror image data under the attacked condition from the step (1.2); determining the label z corresponding to each feature vector, where z = 0 indicates that the feature vector is derived from memory mirror image data under the non-attacked condition and z = 1 indicates that it is derived from memory mirror image data under the attacked condition; finally, obtaining the feature vector sequence [i, j, s, det(H(i, j, s)), z];
(4) training a neural network, specifically: taking the characteristic vector sequence obtained in the step (3.5) as an input sample of the deep neural network, and training the deep neural network to obtain a virtual machine behavior classifier by taking whether the behavior of the virtual machine is normal as output;
(5) operating a neural network, and analyzing unknown virtual machine behaviors, specifically: and (4) analyzing the virtual machine with unknown running state by using the virtual machine behavior classifier obtained in the step (4), and judging whether the unknown virtual machine behavior is normal or not.
Further, the initial time t_0 in the step (1.1) is a positive real number.
Further, Δ t in the step (1.2) is a positive real number.
Further, in the step (1.2), a normal sample can be obtained by adopting a common memory mirror image means; and for the malicious samples, creating shared spaces for all the heterogeneous executors to store different types of malicious tool samples, and configuring simulated intrusion environments for all the heterogeneous executors so as to obtain memory mirror image data when the virtualization platform is attacked by different types.
Further, the 26 elements of the three-dimensional neighborhood of an element in the step (3.3) are the 8 elements on the same scale layer as the element and the 9 elements on each of the two scale layers above and below it.
Further, the threshold value preset in the step (3.4) depends on the number of features to be recognized, and the higher the threshold value is set, the fewer features can be recognized.
Further, the deep neural network in the step (4) is any one existing deep neural network structure.
The invention has the following beneficial effects: it combines memory mirror image data analysis with a deep learning mechanism and analyzes the behavior attributes of a virtual machine through the encoding characteristics of its memory mirror image data. Compared with existing methods for analyzing the state of a virtual platform, the invention is simple to operate, easy to implement and convenient to modularize. It has a wide range of application: it can detect various attack modes, including known and unknown attacks, and its detection performance is not affected even if an attacker lies dormant for a period of time before attacking. In addition, the invention offers good robustness, reliability and usability on different system platforms.
Drawings
FIG. 1 is a schematic diagram of a system model in an embodiment of the invention;
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
The technical scheme of the invention is described in detail by referring to the accompanying drawings and embodiments.
In consideration of the fact that the memory mirror image data can completely represent the running state of one virtual machine, the invention provides a virtual machine behavior analysis system based on deep learning and memory mirror image analysis by utilizing the memory mirror image data and combining a deep neural network.
As shown in fig. 1, the system model of the present embodiment is: a plurality of operating systems are run on a virtual platform, wherein the operating systems comprise WinServer, Ubuntu, CentOS and RedHat. By introducing a backdoor and virus malicious tool database into each operating system through a manual method, the memory mirror image data when each system is not attacked and the memory mirror image data after being attacked by different types can be obtained at any time. The method extracts the memory data characteristics by using the data through the memory map coding, and further judges whether the virtual machine behavior state is attacked or not by using the memory characteristics, wherein the process is shown as the attached figure 2, and the method specifically comprises the following steps:
step one, acquiring memory mirror image data; the specific process is as follows:
(1) at the initial time t_0 = 0, acquiring initial memory mirror image data by using a memory forensics tool;
(2) after a time Δt = 1, on a VirtualBox or VMware virtualization platform, automatically sampling, according to the memory management mechanisms of the different operating systems, the memory mirror image data of each heterogeneous executor at the current moment under the normal (non-attacked) condition and under the attacked condition, namely a normal sample and a malicious sample. Normal samples are obtained by an ordinary memory imaging method; for malicious samples, a shared space is created for all the heterogeneous executors to store malicious tool samples of different types, and a simulated intrusion environment is configured for all the heterogeneous executors, so as to obtain memory mirror image data when the virtual platform is under different types of attack;
step two, delta coding is carried out; the specific process is as follows:
(1) running a memory forensics tool and, for the initial memory, using the pslist and dlllist commands to determine the EXE-type executable files and the DLL-type dynamic link library list in the initial memory, respectively;
(2) for the current memory, running the pslist and dlllist commands of the memory forensics tool to determine the EXE-type executable files and the DLL-type dynamic link library list in the current memory, respectively;
(3) analyzing the EXE/DLL list obtained in the last two steps, and determining executable (PE) files in the current memory but not in the initial memory;
(4) generating a prediction memory for each new PE according to the initial memory;
a) determining the process ID of each new PE, and simultaneously determining the base address of the process in the virtual memory address space;
b) for the process of each new PE, running the memmap command of the memory forensics tool on the current memory to extract the mapping between the process's virtual memory and physical memory;
c) copying the new PE from the corresponding file on the virtual disk to the initial memory; for each virtual memory page of the PE file, the following two steps are performed: firstly, if the page is in the current memory, copying the PE file to the initial memory by using the mapping relation between the virtual memory and the physical memory obtained in the step (4) b) in the step two; secondly, recording page copy information, including a source page position in the PE file, a target page position in the physical memory and a page length;
(5) outputting header information which comprises path information of the new PE needing to be loaded and all copy pages of each PE;
(6) using the predicted memory as the source and the current memory as the comparison object, and encoding with xdelta3 to obtain the memory map of the encoded current memory mirror image data; M and N respectively denote the number of rows and columns of the map, and I(i, j) = (a, b, c) denotes the element in row i and column j of the map, where 0 ≤ i < M, 0 ≤ j < N, a, b and c are 32-bit floating-point numbers, and I(i, j) is a three-dimensional vector;
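Sub-steps (1) to (3) of step two amount to a set difference between the module lists of the two memory images. The following is a minimal Python sketch, assuming the pslist and dlllist output has already been parsed into sets of file paths. The parsing itself, the names `new_executables` and `PageCopy`, and the sample paths are hypothetical and do not come from the patent; `PageCopy` only mirrors the three fields of the page copy information named in sub-step (4) c).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageCopy:
    """Page copy record from sub-step (4) c); field names are illustrative."""
    source_offset: int   # page position inside the PE file
    target_address: int  # page position in physical memory
    length: int          # page length in bytes


def new_executables(initial_modules, current_modules):
    """Return modules present in the current image but absent from the initial one."""
    return sorted(set(current_modules) - set(initial_modules))


# Hypothetical, already-parsed pslist/dlllist results for the two images.
initial = {r"C:\Windows\explorer.exe", r"C:\Windows\System32\kernel32.dll"}
current = initial | {r"C:\Temp\dropper.exe", r"C:\Temp\inject.dll"}

added = new_executables(initial, current)
# Header information of sub-step (5): path of each new PE plus its page copies.
# The single record per file is a placeholder, not real mapping data.
header = {path: [PageCopy(0, 0x1000, 4096)] for path in added}
```

Only the two files that appear after the initial snapshot end up in `added`, so a prediction memory would be generated for those files alone.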
step three, extracting memory map feature point information, including feature point positions, feature point sizes and feature strengths; the specific process is as follows:
(1) constructing a Hessian matrix;
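the Hessian matrix is the core operator of the feature extraction algorithm. The Hessian matrix H of any bivariate function f(x, y) is expressed as:
$$H(f(x,y)) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\ \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$
the characteristic value of f(x, y) is given by the determinant of the matrix H:
$$\det(H) = \dfrac{\partial^2 f}{\partial x^2}\,\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2$$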
for the characteristic extraction process, in order to accelerate the calculation speed in practical application, the hessian matrix is solved in an approximate mode, and the determinant of the hessian matrix H (i, j) corresponding to the element of the ith row and the jth column in the memory mirror image map is calculated as follows:
$$\det(H(i,j)) = D_{ii}\cdot D_{jj} - 0.9\,D_{ij}\cdot D_{ij}$$
where $\cdot$ denotes the vector dot product, i.e. the sum of the element-wise products, $D_{ii} = I(i+1,j) + I(i-1,j) - 2I(i,j)$, $D_{jj} = I(i,j+1) + I(i,j-1) - 2I(i,j)$, $D_{ij} = I(i+1,j) + I(i,j-1) - 2I(i,j)$;
Performing the calculation on each element in the memory mirror image map to obtain a determinant of a Hessian matrix corresponding to each pixel point in the map, namely a characteristic value of the pixel point;
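The approximated determinant can be sketched in plain Python as follows. This is an illustrative sketch rather than the patented implementation: the function names are hypothetical, the map is assumed to be a nested list whose elements are three-component vectors, and leaving border elements at 0.0 is an assumption, since the text does not say how borders are handled.

```python
def dot(u, v):
    """Vector dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(u, v))


def second_diff(a, b, centre):
    """Element-wise a + b - 2 * centre for three-component elements."""
    return [x + y - 2.0 * c for x, y, c in zip(a, b, centre)]


def hessian_det(memory_map):
    """Approximate det(H(i, j)) for every interior element of an M x N map.

    memory_map[i][j] is a 3-component vector (a, b, c). Border elements have
    no full neighbourhood; leaving them at 0.0 is an assumption of this sketch.
    """
    rows, cols = len(memory_map), len(memory_map[0])
    det = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            centre = memory_map[i][j]
            d_ii = second_diff(memory_map[i + 1][j], memory_map[i - 1][j], centre)
            d_jj = second_diff(memory_map[i][j + 1], memory_map[i][j - 1], centre)
            d_ij = second_diff(memory_map[i + 1][j], memory_map[i][j - 1], centre)
            det[i][j] = dot(d_ii, d_jj) - 0.9 * dot(d_ij, d_ij)
    return det


# A 3 x 3 map that is zero everywhere except for a unit spike in the centre.
zero = [0.0, 0.0, 0.0]
toy_map = [[zero, zero, zero],
           [zero, [1.0, 0.0, 0.0], zero],
           [zero, zero, zero]]
response = hessian_det(toy_map)
```

For the toy map, all three difference terms at the centre equal (-2, 0, 0), so the response there is 4 - 0.9 × 4, i.e. approximately 0.4.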
(2) constructing a scale space;
the scale space is the representation of a map at different resolutions; to simulate the multi-scale features of image data, find extreme points in both the spatial domain and the scale domain, and determine preliminary feature points, a scale space must be constructed for the map, in which the feature values of the map at different scales are obtained by repeatedly convolving the bivariate image function with Gaussian kernels;
the method adopts an SURF mode to construct a scale space; for any memory mirror image map, the size of an original image is kept unchanged, and the original image is filtered by changing the size of a template box to construct a scale space; meanwhile, SURF can adopt parallel operation to process each layer of image in the scale space simultaneously; convolving the filtering template with the integral image in the gradually increased box size, obtaining a response image by a Hessian matrix determinant corresponding to each pixel point, and constructing a pyramid;
firstly, adopting a response image obtained by a 9 x 9 box filter as an image of the bottom layer, then gradually increasing the size of the box and continuously carrying out filtering processing on the original image; dividing a scale space into 4 layers, wherein the scaling ratio between the layers is 2, and each layer comprises a filter response graph with different scales; each layer is processed by adopting gradually increasing filter size, so that a series of maps with different scales containing multiple layers are obtained;
(3) accurately positioning the characteristic points;
in each 3 × 3 × 3 local region, non-maximum suppression is performed; each pixel point is compared with its 8 neighbors on the same scale layer and with the 9 points on each of the two scale layers above and below it; only extreme points that are larger or smaller than all 26 surrounding neighborhood values are taken as feature points, and the feature point position (i, j) and scale s are recorded;
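The 26-neighbor comparison can be illustrated with a short Python sketch. It assumes the scale space is available as a nested list `stack[s][i][j]` of determinant responses; that layout, the function name `local_extrema`, and the choice to skip the outermost layers and borders are assumptions made for illustration.

```python
def local_extrema(stack):
    """Find 3 x 3 x 3 extrema in a scale space.

    stack[s][i][j] is the response (Hessian determinant) at scale layer s,
    row i, column j. An element is kept when it is strictly larger or
    strictly smaller than all 26 neighbours: 8 on its own layer and 9 on
    each of the layers directly above and below. Skipping the outermost
    layers and borders is an assumption of this sketch.
    """
    points = []
    for s in range(1, len(stack) - 1):
        for i in range(1, len(stack[s]) - 1):
            for j in range(1, len(stack[s][i]) - 1):
                value = stack[s][i][j]
                neighbours = [
                    stack[s + ds][i + di][j + dj]
                    for ds in (-1, 0, 1)
                    for di in (-1, 0, 1)
                    for dj in (-1, 0, 1)
                    if (ds, di, dj) != (0, 0, 0)
                ]
                if value > max(neighbours) or value < min(neighbours):
                    points.append((i, j, s, value))
    return points


# Toy scale space: 3 layers of 3 x 3 zeros with one peak in the middle.
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 5.0
candidates = local_extrema(stack)
```

In the toy scale space, the single peak at row 1, column 1 of the middle layer is the only element that exceeds all 26 of its neighbors, so it is the only candidate returned.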
(4) determining map feature points and feature vectors according to a threshold;
and for each characteristic point obtained in the last step, comparing the characteristic value of the point under the corresponding scale with a preset threshold value. If the corresponding characteristic value is smaller than a preset threshold value, the point cannot be used as a characteristic point; if the corresponding characteristic value is larger than or equal to a preset threshold value, the point can be taken as a final characteristic point, and the characteristic vector is represented as [ i, j, s, det (H (i, j, s)) ], wherein i, j is the number of a row and a column of the characteristic point in the map, s is the corresponding filter scale when the point can be taken as the characteristic point, and det (H (i, j, s)) is the characteristic value of the point under the scale s;
(5) counting the feature vectors;
determining a corresponding label z for each feature vector according to the source of the feature vector, namely the feature vector is from the memory mirror image data which is not attacked or the memory mirror image data which is attacked, wherein z is 0 to represent that the feature vector is from the memory mirror image data which is not attacked, and z is 1 to represent that the feature vector is from the memory mirror image data which is attacked;
at this moment, the memory map after Delta coding is abstracted into a specific coded tagged feature vector sequence through feature extraction;
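Sub-steps (4) and (5) of this step, thresholding and labeling, can be sketched in Python as follows. The candidate tuples, the threshold value of 0.5 and the function name `label_features` are hypothetical; the patent leaves the threshold to the implementer.

```python
def label_features(candidates, threshold, attacked):
    """Keep candidates whose response reaches the threshold and append label z.

    candidates: iterable of (i, j, s, det) tuples.
    attacked:   True when the source image was sampled under attack (z = 1),
                False when it was sampled without attack (z = 0).
    """
    z = 1 if attacked else 0
    return [[i, j, s, det, z] for (i, j, s, det) in candidates if det >= threshold]


# Hypothetical candidates from one normal and one attacked memory map.
normal = label_features([(4, 7, 1, 0.8), (2, 3, 2, 0.1)], threshold=0.5, attacked=False)
malicious = label_features([(9, 1, 1, 1.6)], threshold=0.5, attacked=True)
sequence = normal + malicious
```

The candidate with response 0.1 falls below the threshold and is dropped; the remaining vectors carry z = 0 or z = 1 according to the image they came from.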
step four, training a neural network;
one input sample of the deep neural network is denoted as [ i, j, s, det (H (i, j, s)), z ]; selecting an existing deep neural network structure to train to obtain a classifier, and analyzing the behavior of an unknown virtual machine in actual operation;
step five, running the neural network and analyzing unknown virtual machine behavior;
and analyzing the virtual machine with unknown running state by using the neural network trained in the step four, and judging whether the behavior of the unknown virtual machine is normal or not.
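Steps four and five can be illustrated with a deliberately small stand-in classifier. The patent allows any existing deep neural network structure; the single-hidden-layer network below is not deep and is only a hedged sketch of the train-then-classify flow. The class name `TinyMLP`, the learning rate, the layer size and the sample values (assumed to be normalized to the range 0 to 1) are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic function, with the input clamped to avoid overflow in exp."""
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden tanh layer and a sigmoid output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the hidden activations and the probability that z = 1."""
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, prob

    def train_step(self, x, z, lr=0.1):
        """One SGD step on the cross-entropy loss for a single sample."""
        hidden, prob = self.forward(x)
        delta_out = prob - z  # gradient w.r.t. the output pre-activation
        for k, h in enumerate(hidden):
            # Use the old output weight before it is updated below.
            delta_hidden = delta_out * self.w2[k] * (1.0 - h * h)
            self.w2[k] -= lr * delta_out * h
            self.b1[k] -= lr * delta_hidden
            for n, v in enumerate(x):
                self.w1[k][n] -= lr * delta_hidden * v
        self.b2 -= lr * delta_out

    def predict(self, x):
        """Return 1 (attacked) or 0 (normal)."""
        return 1 if self.forward(x)[1] >= 0.5 else 0


# Hypothetical, already-normalised samples [i, j, s, det, z].
samples = [
    [0.1, 0.2, 0.25, 0.05, 0],
    [0.3, 0.1, 0.25, 0.10, 0],
    [0.7, 0.8, 0.50, 0.90, 1],
    [0.9, 0.6, 0.75, 0.80, 1],
]
model = TinyMLP(n_in=4, n_hidden=8)
for _ in range(200):
    for *features, z in samples:
        model.train_step(features, z)
verdicts = [model.predict(s[:4]) for s in samples]
```

The first four components of each sample are the network input and z is the training target. In practice the stand-in would be replaced by whichever deep network structure is chosen, and the raw values of i, j, s and det(H(i, j, s)) would need some form of scaling, which the patent does not specify.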
The above is an embodiment of the present invention, and the present invention is not limited by the above embodiment, and the specific implementation method may be determined by combining the technical scheme of the present invention with an actual application scenario.

Claims (7)

1. A virtual machine behavior analysis system based on deep learning and memory mirror image analysis is characterized by comprising the following steps:
(1) acquiring memory mirror image data, comprising the following substeps:
(1.1) at an initial time t_0, acquiring initial memory mirror image data by using a memory forensics tool to obtain an initial memory;
(1.2) at an arbitrary time t_0 + Δt, on VirtualBox and VMware virtualization platforms, automatically sampling, according to the memory management mechanisms of the different operating systems, the memory mirror image data of each heterogeneous executor at the current moment under the non-attacked and attacked conditions, respectively, to obtain the current memory, namely a normal sample and a malicious sample;
(2) performing delta encoding, comprising the sub-steps of:
(2.1) operating a memory forensics tool, and respectively determining an EXE type executable file and a DLL type dynamic link library list in the initial memory by using a pslist command and a dlllist command for the initial memory obtained in the step (1.1);
(2.2) for the current memory obtained in the step (1.2), running the pslist and dlllist commands of the memory forensics tool to determine the EXE-type executable files and the DLL-type dynamic link library list in the current memory, respectively;
(2.3) analyzing the EXE type executable file and the DLL type dynamic link library list obtained in the steps (2.1) and (2.2), and determining an executable file which is in the current memory but not in the initial memory, namely a new executable file;
(2.4) generating a prediction memory for each new executable file according to the initial memory, comprising the following substeps:
(2.4.1) determining the process ID of each new executable file, and simultaneously determining the base address of the process in the virtual memory address space;
(2.4.2) for the process of each new executable file, running the memmap command of the memory forensics tool on the current memory, according to the process base address from the step (2.4.1), to extract the mapping between the process's virtual memory and physical memory;
(2.4.3) copying the new executable file from the virtual disk to the initial memory, and executing the following two steps for each virtual memory page of the new executable file: firstly, copying a new executable file to an initial memory by using the mapping relation between the virtual memory and the physical memory extracted in the step (2.4.2); then, recording page copy information, including the source page position of the virtual memory page, the target page position in the physical memory and the page length; finally generating a prediction memory;
(2.5) outputting header information, including path information of the new executable file to be loaded and page copy information of all the new executable files extracted in the step (2.4.3);
(2.6) using the prediction memory generated in the step (2.4) as the source and the current memory as the comparison object, and encoding with xdelta3 to obtain a memory map of the encoded current memory mirror image data; M and N denote the number of rows and the number of columns of the memory map, respectively, and I(i, j) = (a, b, c) denotes the element in the i-th row and j-th column of the memory map; wherein 0 ≤ i < M, 0 ≤ j < N, a, b and c are 32-bit floating point numbers, and I(i, j) is a three-dimensional vector;
(3) extracting the memory map feature point information obtained in the step (2.6), wherein the memory map feature point information comprises feature point positions, feature point sizes and feature strength of the feature points, and the method comprises the following substeps:
(3.1) constructing a Hessian matrix, specifically comprising the following steps: calculating a determinant of a Hessian matrix H (i, j) corresponding to each element in the memory map as a characteristic value of the element, wherein the calculation formula is as follows:
det(H(i, j)) = D_ii · D_jj - 0.9 · D_ij · D_ij
wherein D_ii = I(i+1, j) + I(i-1, j) - 2I(i, j), D_jj = I(i, j+1) + I(i, j-1) - 2I(i, j), and D_ij = I(i+1, j) + I(i, j-1) - 2I(i, j);
(3.2) constructing a scale space by adopting a SURF mode: firstly, filtering an original image of a memory map by adopting a 9 x 9 box filter to be used as a bottom image; then gradually increasing the size of the box filter, and continuously filtering the original image of the memory map; finally, obtaining filter response graphs of different scales and constructing a scale space; the scale space has 4 layers, and the scaling ratio between the layers is 2;
(3.3) accurately positioning the characteristic points, specifically: in each 3 × 3 × 3 local region, performing non-maximum suppression on the scale space constructed in the step (3.2); comparing each element in the scale space with the characteristic values of 26 elements of the three-dimensional neighborhood of the element, wherein the elements with the characteristic values larger or smaller than the surrounding 26 elements are taken as characteristic points, and recording the positions (i, j) of the characteristic points and the scale s;
(3.4) determining map feature points and feature vectors according to the threshold, specifically: comparing the characteristic value of each characteristic point obtained in the step (3.3) under the corresponding scale with a preset threshold value, and if the corresponding characteristic value is smaller than the preset threshold value, the characteristic point is not taken as a final characteristic point; if the corresponding characteristic value is larger than or equal to a preset threshold value, taking the characteristic point as a final characteristic point, and expressing the characteristic vector as [ i, j, s, det (H (i, j, s)) ]; wherein, i and j are the line number and the column number of the final characteristic point in the memory map, s is the filter scale corresponding to the final characteristic point, and det (H (i, j, s)) is the characteristic value of the final characteristic point under the scale s;
(3.5) counting the feature vectors, specifically: determining the source of each feature vector obtained in the step (3.4), the source being either the memory mirror image data collected without attack or the memory mirror image data collected under attack in the step (1.2); determining a label z for each feature vector, wherein z = 0 indicates that the feature vector is derived from memory mirror image data collected without attack, and z = 1 indicates that the feature vector is derived from memory mirror image data collected under attack; finally, obtaining a feature vector sequence [i, j, s, det(H(i, j, s)), z];
(4) training a neural network, specifically: taking the characteristic vector sequence obtained in the step (3.5) as an input sample of the deep neural network, and training the deep neural network to obtain a virtual machine behavior classifier by taking whether the behavior of the virtual machine is normal as output;
(5) running the neural network to analyze unknown virtual machine behavior, specifically: using the virtual machine behavior classifier obtained in the step (4) to analyze a virtual machine whose running state is unknown, and judging whether the behavior of that virtual machine is normal.
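The determinant in the step (3.1) can be sketched for a single channel as follows. This is an illustration only, not the patented implementation: the function name `hessian_det` is hypothetical, the code works on one scalar channel (the claim defines each element as a three-component vector but does not state how the products are formed for vectors), and border elements are simply left at zero because the claim does not say how borders are handled.

```python
from typing import List

Grid = List[List[float]]


def hessian_det(I: Grid) -> Grid:
    """Return det(H(i, j)) for every interior element of a scalar grid.

    Uses the finite differences exactly as written in step (3.1):
      D_ii = I(i+1, j) + I(i-1, j) - 2 I(i, j)
      D_jj = I(i, j+1) + I(i, j-1) - 2 I(i, j)
      D_ij = I(i+1, j) + I(i, j-1) - 2 I(i, j)
      det  = D_ii * D_jj - 0.9 * D_ij * D_ij
    Border elements lack a full neighbourhood and stay at 0.0
    (an assumption made for this sketch).
    """
    rows, cols = len(I), len(I[0])
    det = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            centre = 2.0 * I[i][j]
            d_ii = I[i + 1][j] + I[i - 1][j] - centre
            d_jj = I[i][j + 1] + I[i][j - 1] - centre
            d_ij = I[i + 1][j] + I[i][j - 1] - centre
            det[i][j] = d_ii * d_jj - 0.9 * d_ij * d_ij
    return det


if __name__ == "__main__":
    # A single non-zero element in a 3 x 3 grid of zeros.
    grid = [[0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0]]
    print(hessian_det(grid)[1][1])
```

Applying the same function to each of the three components a, b and c separately would be one possible way to handle the vector-valued memory map, but that choice is not specified in the claim.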
2. The system according to claim 1, wherein the initial time t0 in the step (1.1) is a positive real number.
3. The system according to claim 1, wherein Δt in the step (1.2) is a positive real number.
4. The virtual machine behavior analysis system based on deep learning and memory mirror image analysis as claimed in claim 1, wherein in the step (1.2), the normal sample can be obtained by using a common memory mirror image method; and for the malicious samples, creating shared spaces for all the heterogeneous executors to store different types of malicious tool samples, and configuring simulated intrusion environments for all the heterogeneous executors so as to obtain memory mirror image data when the virtualization platform is attacked by different types.
5. The system for analyzing virtual machine behavior based on deep learning and memory mirroring analysis as claimed in claim 1, wherein the 26 elements in the three-dimensional neighborhood of an element in the step (3.3) are the 8 neighboring elements on the same scale layer as the element and the 9 elements in each of the two scale layers directly above and below the element.
6. The system according to claim 1, wherein the threshold preset in the step (3.4) depends on the number of features to be identified, and the higher the threshold is set, the fewer features can be identified.
7. The virtual machine behavior analysis system based on deep learning and memory mirror analysis of claim 1, wherein the deep neural network in step (4) is any one of existing deep neural network structures.
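The feature point selection of the steps (3.3) and (3.4), together with the 26-element neighborhood of claim 5 and the threshold of claim 6, can be sketched as below. This is a minimal sketch under stated assumptions: the function name `find_feature_points` is hypothetical, the scale space is given as a list of equally sized layers `stack[s][i][j]`, the scale s is represented by the layer index, and elements without a full 3 × 3 × 3 neighborhood are skipped.

```python
from typing import List, Tuple

Layer = List[List[float]]
Feature = Tuple[int, int, int, float]  # (i, j, s, characteristic value)


def find_feature_points(stack: List[Layer], threshold: float) -> List[Feature]:
    """Select elements that are strict extrema of their 26-element neighbourhood.

    stack[s][i][j] holds the characteristic value at scale index s, row i,
    column j. The first and last layers and all borders are skipped because
    they have no complete 3 x 3 x 3 neighbourhood (an assumption).
    """
    features: List[Feature] = []
    n_scales, rows, cols = len(stack), len(stack[0]), len(stack[0][0])
    for s in range(1, n_scales - 1):
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                value = stack[s][i][j]
                # 8 neighbours on the same layer + 9 on each adjacent layer = 26.
                neighbours = [
                    stack[s + ds][i + di][j + dj]
                    for ds in (-1, 0, 1)
                    for di in (-1, 0, 1)
                    for dj in (-1, 0, 1)
                    if (ds, di, dj) != (0, 0, 0)
                ]
                is_extremum = value > max(neighbours) or value < min(neighbours)
                # Step (3.4): keep the point only if its value reaches the threshold.
                if is_extremum and value >= threshold:
                    features.append((i, j, s, value))
    return features


if __name__ == "__main__":
    # Three 3 x 3 layers of zeros with one peak in the middle layer.
    stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    stack[1][1][1] = 5.0
    print(find_feature_points(stack, threshold=1.0))
    print(find_feature_points(stack, threshold=10.0))
```

Raising `threshold` can only shrink the returned list, which matches the statement in claim 6 that a higher threshold yields fewer identified features.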
CN201910772362.5A 2019-08-21 2019-08-21 Virtual machine behavior analysis system based on deep learning and memory mirror image analysis Active CN110618854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910772362.5A CN110618854B (en) 2019-08-21 2019-08-21 Virtual machine behavior analysis system based on deep learning and memory mirror image analysis

Publications (2)

Publication Number Publication Date
CN110618854A (en) 2019-12-27
CN110618854B (en) 2022-04-26

Family

ID=68922283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910772362.5A Active CN110618854B (en) 2019-08-21 2019-08-21 Virtual machine behavior analysis system based on deep learning and memory mirror image analysis

Country Status (1)

Country Link
CN (1) CN110618854B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111158790B (en) * 2019-12-31 2021-04-13 清华大学 FPGA virtualization method for cloud deep learning reasoning
CN111931179B (en) * 2020-08-13 2023-01-06 北京理工大学 Cloud malicious program detection system and method based on deep learning
CN115454681B (en) * 2022-11-10 2023-01-20 维塔科技(北京)有限公司 Batch processing program execution method, device and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103580960A (en) * 2013-11-19 2014-02-12 佛山市络思讯环保科技有限公司 Online pipe network anomaly detection system based on machine learning
CN108334781A (en) * 2018-03-07 2018-07-27 腾讯科技(深圳)有限公司 Method for detecting virus, device, computer readable storage medium and computer equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7225343B1 (en) * 2002-01-25 2007-05-29 The Trustees Of Columbia University In The City Of New York System and methods for adaptive model generation for detecting intrusions in computer systems
FR3022371A1 (en) * 2014-06-11 2015-12-18 Orange METHOD FOR SUPERVISION OF THE SAFETY OF A VIRTUAL MACHINE IN A COMPUTER ARCHITECTURE IN THE CLOUD

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
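"Research on Malware Detection Based on Android Memory Images"; Cao Fei; China Master's Theses Full-text Database, Information Science and Technology, I1138-261; 20170331; full text *
A Fast Learning Algorithm for Deep Belief Nets; Geoffrey E; Neural Computation; 20061230; full text *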

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant