CN111931179B - Cloud malicious program detection system and method based on deep learning

Info

Publication number: CN111931179B
Application number: CN202010814447.8A
Authority: CN
Inventors: 田东海; 马锐; 赵润泽; 郁裕磊; 魏行; 胡昌振
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2020-08-13
Filing date: 2020-08-13
Publication date: 2023-01-06
Anticipated expiration: 2040-08-13
Also published as: CN111931179A

Abstract

The invention discloses a cloud malicious program detection system and method based on deep learning, which belongs to the technical field of software security and offers higher efficiency and accuracy. The system comprises an information acquisition module, a data preprocessing module and a training model module. The information acquisition module comprises a virtual machine, a program automatic execution script and a program sample set: the program sample set comprises the program samples used in malicious program detection; the program automatic execution script automatically executes the program samples in the virtual machine; one program sample is run in the virtual machine at a time, system real-time parameter information and dynamic link library information are extracted while it runs, a virtual machine snapshot is saved after the program sample has executed, and the snapshot is analyzed to obtain memory forensics information; all of this information is sent to the data preprocessing module. The data preprocessing module preprocesses the data to obtain a dynamic link library characteristic vector, a system real-time parameter matrix and a memory forensics matrix, and sends them to the training model module. The training model module pre-constructs and trains the neural network model.

Description

Cloud malicious program detection system and method based on deep learning

Technical Field

The invention relates to the technical field of software security, in particular to a cloud malicious program detection system and method based on deep learning.

Background

Malware detection refers to methods for identifying malicious programs. Cloud computing, one of the most popular and important IT trends today, provides shared computing resources (software or data) as services to computers or other devices over the internet. Detecting malicious programs in the cloud is therefore an important direction for current malicious program detection work.

Deep learning is an important method in the field of malicious program detection, and most of the popular malicious program detection methods use deep learning technology, so that deep learning receives wide attention in the aspects of practice and research.

At present, deep learning is widely applied in the field of malicious program detection. Because deep learning, and convolutional neural networks in particular, has achieved excellent results in image processing, malicious program detection usually borrows these image-processing techniques. Typically, a malicious program is converted into an image or an image-like numeric matrix, and a deep learning model is then trained on it to obtain the final result.

There are many methods for converting malicious programs into images; the most common converts the binary file into a gray-scale image. Each byte of a binary file takes a value between 00 and FF, which corresponds to the gray levels 0 to 255. The file is arranged as a matrix in which each byte represents one pixel, so the binary file becomes a gray-scale image. In these gray-scale images, the difference between malicious and benign programs can usually be seen from the image texture.
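As a non-limiting illustration of this prior-art conversion, the following minimal Python sketch arranges the bytes of a file into rows of pixels, one byte per pixel. The function name `bytes_to_gray`, the row width and the zero padding of the last row are assumptions made for the example and are not taken from the source.

```python
def bytes_to_gray(data: bytes, width: int = 16, pad: int = 0):
    """Arrange raw bytes into rows of `width` pixels; each byte is one gray level (0-255)."""
    rows = []
    for start in range(0, len(data), width):
        chunk = list(data[start:start + width])   # iterating bytes yields ints 0..255
        chunk += [pad] * (width - len(chunk))     # pad the final row (assumption)
        rows.append(chunk)
    return rows


# Three bytes laid out two pixels per row; the last row is padded with 0.
image = bytes_to_gray(bytes([0x00, 0xFF, 0x10]), width=2)
```

In practice the width is usually chosen from the file size so that the resulting image is roughly square, but that choice is outside what the text specifies.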

Conversion to a gray-scale image is usually used for static analysis. As malicious programs have developed, however, detection increasingly requires actually running the program, which can be done with tools such as a sandbox. A sandbox can record the sequence of API calls made by a malicious program while the system is running. Each API is generally treated as a piece of text and converted into a word vector using natural language processing methods, so that the API sequence obtained from each sample run becomes a matrix that can also be used as input to a neural network.

At present, most static analysis methods struggle to keep pace with the development of malicious programs. The internet environment is increasingly complex and the software environment increasingly diverse, and the authors of malicious programs commonly use polymorphic or metamorphic techniques. As a result, methods that identify malicious programs by traditional signature matching have difficulty detecting new malicious programs, and many disguised malicious programs are also hard to detect. Malicious programs have evolved in recent years, and some now have anti-detection and anti-analysis functions.

Dynamic analysis can detect malicious programs more effectively, and usually requires intercepting and analyzing the API function calls of the malicious program in a sandbox environment. However, configuring a sandbox environment is very complex, and its analysis reports require careful further processing.

Many dynamically acquired features can be interfered with by the malicious program itself. For example, when a sandbox is used to collect API information, the malicious program may deliberately make API calls as a disguise, so that analyzing this API information distorts the detection result.

At present, many studies have begun to use deep learning to detect malicious programs, but they usually optimize the neural network model by deepening the network, tuning it, or similar methods, so that the models become more and more complex and training takes longer and longer.

Converting a malicious program into data suitable for deep learning training is complicated. Many conversion methods exist, but none is simple and convenient, and they may require complex technologies (such as sandboxes) or strict environment configuration.

The structure of the deep learning model affects the accuracy with which malicious programs are identified, but deep learning has now been applied to malicious program detection for a long time, and the models are increasingly difficult to optimize further.

Most detection schemes have high performance overhead, are difficult to deploy in an actual environment, and are not suitable for detecting malicious codes in the cloud.

Therefore, how to utilize the existing malicious program detection technology to enable the malicious program detection to be more efficient and accurate, and the malicious code detection more adaptive to the cloud is a problem to be solved urgently at present.

Disclosure of Invention

In view of this, the invention provides a cloud malicious program detection system and method based on deep learning, which is a malicious code detection scheme capable of adapting to a cloud, and has higher efficiency and higher accuracy.

In order to achieve the purpose, the technical scheme of the invention is as follows: the cloud malicious program detection system based on deep learning comprises an information acquisition module, a data preprocessing module and a training model module.

The information acquisition module comprises a virtual machine, a program automatic execution script and a program sample set; the program sample set comprises program samples used in malicious program detection; the program automatic execution script is used for automatically executing the program samples in the virtual machine; one program sample is run in the virtual machine at a time, system real-time state parameter information and dynamic link library information are extracted while it runs, a virtual machine memory snapshot is saved after the program sample has executed, and the virtual machine memory snapshot is analyzed to obtain memory forensics information; the system real-time state parameter information, the dynamic link library information and the memory forensics information obtained for each executed program sample are sent to the data preprocessing module.

The data preprocessing module carries out the following data preprocessing: converting the dynamic link library information into a dynamic link library characteristic vector, converting the system real-time state parameter information into a system real-time parameter matrix, and extracting the digital characteristic information in the memory forensics information and converting it into a memory forensics matrix; the dynamic link library characteristic vector, the system real-time parameter matrix and the memory forensics matrix are sent to the training model module.

The training model module is used for constructing and training a neural network model in advance; the neural network model consists of a first feature extraction part, a second feature extraction part, a feature fusion part and a full connection layer; the first and second feature extraction parts are each composed of a convolutional layer and a pooling layer; the input of the first feature extraction part is the system real-time parameter matrix, and its output is the characteristic information of the system real-time parameter matrix; the input of the second feature extraction part is the memory forensics matrix, and its output is the characteristic information of the memory forensics matrix; the feature fusion part is used for fusing the outputs of the first and second feature extraction parts with the dynamic link library characteristic vector to obtain fusion features; and the fusion features pass through the full connection layer to give the classification output of the neural network model, namely the judgment of whether a malicious program exists in the target virtual machine.

Further, the malicious program detection system has a model training mode and an actual measurement mode.

In the model training mode, the program sample set consists of program training samples.

The program training samples include known class programs and class labels thereof, the class labels including normal programs and malicious programs.

The known class programs are passed through the information acquisition module and the data preprocessing module to obtain their dynamic link library characteristic vectors, system real-time parameter matrices and memory forensics matrices; these, together with the class labels, are used to train the neural network model in the training model module, giving the trained neural network model.

In the actual measurement mode, the program sample set consists of program test samples, which are programs of unknown class; the unknown programs are passed through the information acquisition module and the data preprocessing module to obtain their dynamic link library characteristic vectors, system real-time parameter matrices and memory forensics matrices, and the trained neural network model is used on these to judge whether a malicious program exists in the target virtual machine.

Further, the information acquisition module extracts the system real-time state parameter information using the Python psutil module, and analyzes the virtual machine memory snapshot using the Volatility tool to obtain the memory forensics information.

Furthermore, in the data preprocessing module, each row of the system real-time parameter matrix and of the memory forensics matrix corresponds to one process present when the program sample is executed, and the data in the row are the digital characteristics of the system real-time parameters or of the memory forensics information generated by that process.

Converting the dynamic link library information into a dynamic link library characteristic vector specifically comprises the following: the dynamic link library information includes the number of occurrences of each dynamic link library in each process during execution of the program sample; the TF-IDF algorithm is used to calculate how much each dynamic link library contributes to distinguishing the current process, and the dynamic link libraries whose contribution exceeds a set threshold are retained; the occurrence counts of the retained dynamic link libraries in the current process form an initial vector; the initial vectors of the different processes are then clustered with the k-means algorithm to obtain the primary category labels of the different processes; and the primary category labels of all the processes form a one-dimensional vector, namely the dynamic link library feature vector.
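By way of a hedged example only, the TF-IDF screening step described above may be sketched in plain Python as follows. The source does not fix the exact TF-IDF variant or the screening rule, so this sketch assumes term frequency = count divided by the total DLL count of the process, inverse document frequency = ln(N / number of processes that load the DLL), and that a DLL is retained if its score exceeds the threshold in at least one process. The names `tfidf_scores` and `screen_dlls` and the sample data are hypothetical.

```python
import math


def tfidf_scores(counts_by_process):
    """counts_by_process: {process: {dll: occurrences}} -> {process: {dll: tf-idf score}}."""
    n_proc = len(counts_by_process)
    doc_freq = {}  # number of processes in which each DLL occurs
    for counts in counts_by_process.values():
        for dll, c in counts.items():
            if c > 0:
                doc_freq[dll] = doc_freq.get(dll, 0) + 1
    scores = {}
    for proc, counts in counts_by_process.items():
        total = sum(counts.values())
        scores[proc] = {
            dll: (c / total) * math.log(n_proc / doc_freq[dll])
            for dll, c in counts.items() if c > 0
        }
    return scores


def screen_dlls(counts_by_process, threshold):
    """Keep DLLs whose score exceeds `threshold` in any process (assumed rule),
    then build each process's initial vector from the counts of the kept DLLs."""
    scores = tfidf_scores(counts_by_process)
    kept = sorted({dll for s in scores.values() for dll, v in s.items() if v > threshold})
    vectors = {
        proc: [counts.get(dll, 0) for dll in kept]
        for proc, counts in counts_by_process.items()
    }
    return kept, vectors


counts = {
    "a.exe": {"kernel32.dll": 3, "ws2_32.dll": 1},
    "b.exe": {"kernel32.dll": 2},
    "c.exe": {"kernel32.dll": 1, "crypt32.dll": 1},
}
kept, vectors = screen_dlls(counts, threshold=0.1)
```

In this toy data `kernel32.dll` is loaded by every process, so its inverse document frequency is zero and it is screened out, which matches the intent that ubiquitous libraries contribute little to distinguishing processes.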

Further, the feature fusion part first fuses the characteristic information of the system real-time parameter matrix with the characteristic information of the memory forensics matrix, using either concatenation (concat) or element-wise addition (add), to obtain an intermediate fusion result, and then fuses the intermediate fusion result with the dynamic link library characteristic vector by concatenation (concat) to obtain the fusion features.
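The two-stage fusion can be illustrated with flat feature lists. This is a minimal sketch under the assumption that "add" means element-wise addition of equally sized feature vectors; the function names are hypothetical and the numbers are arbitrary.

```python
def fuse_concat(a, b):
    """Concatenation: the result keeps every element of both inputs."""
    return list(a) + list(b)


def fuse_add(a, b):
    """Element-wise addition: both inputs must have the same length."""
    if len(a) != len(b):
        raise ValueError("add fusion needs feature vectors of equal length")
    return [x + y for x, y in zip(a, b)]


def fuse(sys_features, mem_features, dll_vector, mode="concat"):
    """Stage 1: fuse the two extracted feature sets by concat or add.
    Stage 2: always concatenate the DLL feature vector onto the result."""
    if mode == "concat":
        intermediate = fuse_concat(sys_features, mem_features)
    elif mode == "add":
        intermediate = fuse_add(sys_features, mem_features)
    else:
        raise ValueError("mode must be 'concat' or 'add'")
    return fuse_concat(intermediate, dll_vector)


fused_concat = fuse([1, 2], [3, 4], [9])
fused_add = fuse([1, 2], [3, 4], [9], mode="add")
```

Concatenation preserves both feature sets at the cost of a wider input to the full connection layer, whereas addition keeps the width unchanged but requires both extraction branches to produce outputs of the same shape.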

Another embodiment of the present invention provides a cloud-side malicious program detection method based on deep learning, including the following steps:

S1, establishing a virtual machine environment, and deploying in the virtual machine a program automatic execution script for automatically executing the samples in the program sample set.

Initially, the samples in the program sample set are program training samples; the program training samples include known class programs and class labels thereof, the class labels including normal programs and malicious programs.

S2, automatically executing the samples of the program sample set in the virtual machine, one program sample at a time; while each program training sample executes, the virtual machine extracts system real-time state parameter information and dynamic link library information; after the program sample has executed, the virtual machine memory snapshot is saved and analyzed to obtain memory forensics information.

S3, converting the dynamic link library information into a dynamic link library characteristic vector, converting the system real-time state parameter information into a system real-time parameter matrix, and extracting the digital characteristic information in the memory forensics information and converting it into a memory forensics matrix.

And S4, inputting the dynamic link library characteristic vector, the system real-time parameter matrix and the memory forensics matrix corresponding to the program training sample into a pre-constructed neural network model, and training the neural network model to obtain the trained neural network model.

The neural network model consists of a first feature extraction part, a second feature extraction part, a feature fusion part and a full connection layer; the first and second feature extraction parts are each composed of a convolutional layer and a pooling layer; the input of the first feature extraction part is the system real-time parameter matrix, and its output is the characteristic information of the system real-time parameter matrix; the input of the second feature extraction part is the memory forensics matrix, and its output is the characteristic information of the memory forensics matrix; the feature fusion part is used for fusing the outputs of the first and second feature extraction parts with the dynamic link library characteristic vector; and the fusion features output by the feature fusion part pass through the full connection layer to give the classification output of the neural network model, namely the judgment of whether a malicious program exists in the target virtual machine.

And S5, setting the program sample set samples as program test samples, wherein the program test samples are programs of unknown types.

And S6, executing S2 and S3 to obtain a dynamic link library characteristic vector, a system real-time parameter matrix and a memory forensics matrix of the program test sample, and obtaining a judgment result whether the malicious program exists in the target virtual machine or not by using the trained neural network model.

Further, in S2, extracting the system real-time status parameter information and the dynamic link library information, specifically: in the process of running a program sample by a virtual machine, extracting more than two times of system real-time state parameter information and dynamic link library information according to a set time interval; the real-time parameter information comprises execution parameters corresponding to each process when the program sample is executed; the dynamic link library information comprises the occurrence number of the dynamic link library corresponding to each process when the program sample is executed.

Further, in S3, converting the system real-time state parameter information into a system real-time parameter matrix, and extracting the digital characteristic information in the memory forensics information and converting it into a memory forensics matrix, specifically includes the following steps:

S301, counting the processes contained in the program samples, namely counting the number of common processes and single processes, wherein a common process is a process that appears more than once across all program samples, and a single process is a process that appears only once across all program samples.

S302, determining the number of rows of the matrix as the number of the common processes plus the maximum number of single processes in all the program samples.

S303, filling data corresponding to each process into corresponding rows according to the system real-time state parameter information to obtain a system real-time parameter matrix; and extracting digital characteristic information in the internal memory forensics information, and filling the digital characteristic information corresponding to each process into a corresponding row to obtain an internal memory forensics matrix.
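Steps S301 to S303 can be sketched as follows, as an illustration only. The sketch assumes that a process is identified by its name, that each name occurs at most once per sample, that common processes occupy the first rows in sorted order, and that rows with no corresponding process are filled with -1 (the placeholder the description uses elsewhere for values that cannot be acquired). The function names and sample data are hypothetical.

```python
from collections import Counter


def build_layout(samples):
    """S301/S302: find the common processes and the fixed number of matrix rows.
    `samples` is a list of {process_name: [numeric features]} dicts, one per program sample."""
    seen = Counter(name for sample in samples for name in sample)
    common = sorted(name for name, n in seen.items() if n > 1)
    max_single = max(
        (sum(1 for name in sample if seen[name] == 1) for sample in samples),
        default=0,
    )
    return common, len(common) + max_single


def to_matrix(sample, common, n_rows, n_cols, fill=-1):
    """S303: put each process's features in its row; unused rows keep the fill value."""
    matrix = [[fill] * n_cols for _ in range(n_rows)]
    for row, name in enumerate(common):
        if name in sample:
            matrix[row] = list(sample[name])
    singles = [name for name in sample if name not in common]
    for offset, name in enumerate(singles):
        matrix[len(common) + offset] = list(sample[name])
    return matrix


samples = [
    {"explorer.exe": [1, 2], "mal.exe": [9, 9]},
    {"explorer.exe": [3, 4]},
    {"explorer.exe": [5, 6], "x.exe": [7, 7], "y.exe": [8, 8]},
]
common, n_rows = build_layout(samples)
first = to_matrix(samples[0], common, n_rows, n_cols=2)
```

Fixing the row count in this way gives every sample a matrix of the same shape, which a convolutional input requires.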

Further, in S3, converting the dynamic link library information into a dynamic link library characteristic vector is specifically as follows: the dynamic link library information includes the number of occurrences of each dynamic link library in each process during execution of the program sample; the TF-IDF algorithm is used to calculate how much each dynamic link library contributes to distinguishing the current process, and the dynamic link libraries whose contribution exceeds a set threshold are retained; the occurrence counts of the retained dynamic link libraries in the current process form an initial vector; the initial vectors of the different processes are then clustered with the k-means algorithm to obtain the primary category labels of the different processes; and the primary category labels of all the processes form a one-dimensional vector, namely the dynamic link library feature vector.
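The clustering step, which turns the per-process initial vectors into a single vector of primary category labels, can be illustrated with a small k-means (Lloyd's algorithm) in plain Python. This is a sketch only: the number of clusters, the deterministic initialization from the first k vectors and the squared Euclidean distance are assumptions, and the function name and data are hypothetical.

```python
def kmeans_labels(points, k, max_iter=100):
    """Cluster `points` (lists of equal length) into k groups; return one label per point."""
    centroids = [list(p) for p in points[:k]]  # simple deterministic start (assumption)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# One initial vector per process, in a fixed process order.
initial_vectors = [[0, 0], [0, 1], [10, 10], [10, 11]]
dll_feature_vector = kmeans_labels(initial_vectors, k=2)
```

The returned list holds one label per process in process order, which is the one-dimensional vector the text calls the dynamic link library feature vector.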

Advantageous effects:

The invention provides a cloud virtual machine malicious program detection system and method based on deep learning. An ordinary virtual machine is used to simulate the cloud environment and the samples are executed in it without complex environment configuration; digital characteristics are acquired directly and converted directly into matrices for training the deep learning model, which greatly reduces computational complexity and improves computational efficiency. The invention acquires several kinds of features and improves the efficiency of the neural network through feature fusion rather than by optimizing the model itself. Deep learning is thus optimized from the feature side, an overly complex neural network architecture is avoided, cloud malicious program detection is achieved with only a simple neural network architecture, and both accuracy and efficiency are improved.

Drawings

Fig. 1 is a schematic diagram of a cloud malicious program detection system based on deep learning according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the concat fusion mode used in the embodiment of the present invention;

Fig. 3 is a schematic diagram of the add fusion mode used in the embodiment of the present invention.

Detailed Description

The invention is described in detail below by way of example with reference to the accompanying drawings.

The invention provides a cloud malicious program detection system based on deep learning, which comprises an information acquisition module, a data preprocessing module and a training model module, as shown in fig. 1.

The information acquisition module comprises a virtual machine, a program automatic execution script and a program sample set; the program sample set comprises program samples used in malicious program detection; the program automatic execution script is used for automatically executing the program samples in the virtual machine; one program sample is run in the virtual machine at a time, system real-time state parameter information and dynamic link library information are extracted while it runs, a virtual machine memory snapshot is saved after the program sample has executed, and the virtual machine memory snapshot is analyzed to obtain memory forensics information; the system real-time state parameter information, the dynamic link library information and the memory forensics information obtained for each executed program sample are sent to the data preprocessing module.

The data preprocessing module carries out the following data preprocessing: converting the dynamic link library information into a dynamic link library characteristic vector, converting the system real-time state parameter information into a system real-time parameter matrix, and extracting digital characteristic information in the internal memory forensics information to convert the digital characteristic information into an internal memory forensics matrix; and the dynamic link library characteristic vector, the system real-time parameter matrix and the memory forensics matrix are sent to a training model module.

The training model module is used for constructing and training a neural network model in advance; the neural network model consists of a first feature extraction part, a second feature extraction part, a feature fusion part and a full connection layer; the first and second feature extraction parts are each composed of a convolutional layer and a pooling layer; the input of the first feature extraction part is the system real-time parameter matrix, and its output is the characteristic information of the system real-time parameter matrix; the input of the second feature extraction part is the memory forensics matrix, and its output is the characteristic information of the memory forensics matrix; the feature fusion part is used for fusing the outputs of the first and second feature extraction parts with the dynamic link library characteristic vector to obtain fusion features; and the fusion features pass through the full connection layer to give the classification output of the neural network model, namely the class of the program sample.
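To make the data flow through this model concrete, the following is a toy forward pass in plain Python. It is a sketch and not the disclosed network: it assumes a single convolution kernel per branch, ReLU activation, 2x2 max pooling, concat fusion, and a single output unit with a sigmoid, none of which the source specifies. All weights are hand-picked rather than trained, and every name is hypothetical.

```python
import math


def conv2d_relu(x, kernel):
    """'Valid' 2-D convolution (cross-correlation form) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(x) - kh + 1):
        row = []
        for j in range(len(x[0]) - kw + 1):
            s = sum(x[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    return [
        [max(x[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(x[0]) - size + 1, size)]
        for i in range(0, len(x) - size + 1, size)
    ]


def extract(matrix, kernel):
    """One feature extraction part: convolution, pooling, then flatten."""
    pooled = max_pool(conv2d_relu(matrix, kernel))
    return [v for row in pooled for v in row]


def classify(sys_matrix, mem_matrix, dll_vector, k_sys, k_mem, weights, bias):
    """Two branches -> concat fusion with the DLL vector -> dense layer -> sigmoid."""
    fused = extract(sys_matrix, k_sys) + extract(mem_matrix, k_mem) + list(dll_vector)
    if len(weights) != len(fused):
        raise ValueError("weights must match the fused feature length")
    z = sum(w * f for w, f in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # read here as the probability of 'malicious'


sys_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mem_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
branch_features = extract(sys_matrix, [[1, 0], [0, 1]])
score = classify(sys_matrix, mem_matrix, [1, 0],
                 k_sys=[[1, 0], [0, 1]], k_mem=[[0, 0], [0, 0]],
                 weights=[0.1, 0.5, -1.4, 0.3], bias=0.0)
```

A practical implementation would use a deep learning framework with several learned kernels per branch; the point of the sketch is only that the two matrices pass through separate extraction branches while the dynamic link library vector bypasses them and joins at the fusion stage.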

In the embodiment of the invention, the system comprises a model training mode and an actual measurement mode.

In the actual measurement mode, the program sample set consists of program test samples, which are programs of unknown class; the unknown programs are passed through the information acquisition module and the data preprocessing module to obtain their dynamic link library characteristic vectors, system real-time parameter matrices and memory forensics matrices, and the trained neural network model is used on these to judge whether a malicious program exists in the target virtual machine.

In the embodiment of the invention, each module is specifically designed as follows:

Information acquisition module:

The first component is a script that executes programs automatically, so that the large number of data set samples does not need to be run manually. All samples are run in the virtual machine, one at a time, and the parameters describing the running state of the operating system and the dynamic link library information are extracted. After execution finishes, the memory snapshot of the virtual machine is saved and analyzed to obtain the memory forensics information. The virtual machine is then shut down and the next target program is run.

When a program runs, whether it is benign or malicious, it performs operations in the system. Whether these operations are carried out by the program itself or by a user, they change some state parameters in the system, and these changes are reflected in how the system parameters of each process change. System parameters are generally extracted with Python. Although some of Python's built-in modules can be used for this, the extraction is complicated and the data must be parsed before it can be used. psutil is short for "process and system utilities" and, as the name suggests, is a tool for monitoring the system and its processes. Calling it from Python makes it possible to obtain state-related parameters of the system, such as CPU, memory, disk and process information.

Traditional antivirus software runs inside the system. When a malicious program runs on the same system, however, it may discover the antivirus software and conceal its malicious behavior, or it may damage the antivirus software itself in order to attack the system. Some advanced malicious programs attack the detection tools inside the system and interfere with their detection state, so that the detection result may appear normal although the system is in fact infected. Thus, neither the detection tools installed on existing operating systems nor the detection schemes based on such tools are necessarily trustworthy. psutil also runs inside the system, so the data it obtains may look normal even though the program is actually malicious; the malicious state may go undetected, and the results may be affected when the data are subsequently processed. Therefore, memory feature information is added so that more features are fused, which improves the detection result to a certain extent.

The memory image is a file that stores the memory of the operating system. It can be produced with the snapshot function of the virtual machine while the virtual machine is running, and the image is stored on the host. This method is trustworthy: the whole process is completed on the host without the participation of the operating system in the virtual machine, so a malicious program running in the virtual machine cannot monitor the process, and detection based on the virtual machine memory image cannot be interfered with or damaged. The malicious program exhibits its malicious characteristics as it would in a normal system and cannot detect that a snapshot is being generated, so these characteristics are preserved in the virtual machine memory snapshot, and the memory image contains state information that reveals the behavior of the malicious program.

Volatility is a tool dedicated to analyzing memory images. It is developed in Python, can perform memory forensics analysis for most operating systems, and can extract the semantic features of the operating system from a memory image file. Here it is used to analyze the memory dump of the virtual machine; different plug-ins can be applied to the memory dump file to extract the corresponding program semantic information. A malicious program causes changes to system memory while it executes (for example, memory write operations); the semantic features of the program in memory are analyzed with this tool, and the deep learning model can then be used to model and analyze these semantic features.

A dynamic link library (DLL) is a program module whose functions can be used by programs or by other DLLs. A large number of files and modules in a system are built as DLLs. An API is an application programming interface that may be implemented by one DLL or by a set of DLLs, so any API used by a process in the system involves a DLL, and processes use the API functions exported by these DLLs to interact with the file system, processes, the registry, the network and the graphical user interface (GUI). A process running in the system, whether malicious or benign, therefore calls the system's dynamic link libraries when it runs, using the executable code in them to carry out the functions it needs. A benign process usually performs ordinary functions, so the DLLs it calls are the more common dynamic link libraries; a malicious process needs to perform some malicious operations during execution, so it may call some special dynamic link libraries.

The processing flow of the information acquisition module is as follows:

(1) configuring an environment: the virtual machine environment is established, an environment for running codes needs to be configured inside the virtual machine, and some application programs need to be installed, so that a user can execute programs to finish common tasks (such as webpage browsing, character editing and video playing), and the virtual machine is started from the mirror image after the virtual machine is started.

(2) Executing code that automatically runs the sample: the code can automatically execute the sample and obtain data that does not require manual acquisition. The code can automatically start a virtual machine system, execute the sample in the virtual machine while starting the system, simultaneously acquire the running state of the operating system and the information of the dynamic link library, and store the information after extracting the information. In the host, after the information is extracted, the memory mirror image of the virtual machine is stored, the virtual machine is closed, and the memory mirror image is analyzed to obtain the memory forensics information. And then the virtual machine is restarted to run the next sample.

(3) Acquiring real-time parameters of the system: in the virtual machine system, the operating system running state parameters can be acquired once the sample starts to run. Abnormal situations may occur while extracting system parameters, for example the process no longer exists, or some parameters of the process cannot be extracted; parameters that cannot be acquired are set to -1. Some programs or processes show no visible behavior at the beginning of execution and only change the system after running for a certain period; a process may also change, shut down, or exhibit malicious states only after a certain period. Extracting the process system state parameters at a single moment therefore cannot capture all the states of the processes. In the invention, the system state parameters of the processes are acquired 6 times, at intervals of 30 seconds, so that each process runs long enough to perform as many of its operations on the operating system as possible. Some malicious programs crash the system in the virtual machine after being executed, in which case the process system parameters are acquired fewer than 6 times. In this case, all parameters of all processes are set to -1, which indicates that the virtual machine system was damaged and parameter extraction failed.

(4) Acquiring memory forensics parameters: after the image is saved, the virtual machine image is analyzed on the host. Because the analysis takes a long time, it should run concurrently with the start-up of the virtual machine to save time. The acquired feature information is stored in the form of an analysis report, and the specific numerical information is extracted in the next module.
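
The sampling rule of step (3) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: `read_params`, `PARAM_NAMES` and the injectable `sleep` argument are hypothetical names introduced here, and a guest crash is modeled simply as an `OSError` raised by the reader.

```python
import time

ROUNDS = 6        # number of sampling rounds described in the text
INTERVAL_S = 30   # seconds between rounds
MISSING = -1      # value recorded when a parameter cannot be read
PARAM_NAMES = ("cpu_percent", "num_threads", "rss_bytes")  # hypothetical parameter set


def sample_round(process_keys, read_params):
    """Read every parameter of every tracked process once."""
    snapshot = {}
    for key in process_keys:
        try:
            values = read_params(key)   # hypothetical reader returning a dict
        except LookupError:             # modeled here as "process does not exist"
            values = {}
        snapshot[key] = [values.get(name, MISSING) for name in PARAM_NAMES]
    return snapshot


def collect(process_keys, read_params, sleep=time.sleep):
    """Run ROUNDS sampling rounds; return all -1 if fewer than ROUNDS succeed."""
    rounds = []
    try:
        for i in range(ROUNDS):
            rounds.append(sample_round(process_keys, read_params))
            if i < ROUNDS - 1:
                sleep(INTERVAL_S)
    except OSError:
        pass  # assumption: an OSError means the virtual machine system crashed
    if len(rounds) < ROUNDS:
        failed = {key: [MISSING] * len(PARAM_NAMES) for key in process_keys}
        return [dict(failed) for _ in range(ROUNDS)]
    return rounds
```

Passing `sleep` as an argument keeps the 30-second wait out of tests; a real reader could be backed by a process-inspection library such as psutil.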

Data processing module:

Each row of the system real-time parameter matrix and of the memory forensics matrix corresponds to one process of the system when the program sample is executed, and the in-row data are the system real-time parameters generated in the corresponding process or the digital features in the memory forensics information; the dynamic link library feature vector corresponds to the processes of the system in which the program sample is executed, and a position in the dynamic link library feature vector indicates a number of occurrences of a dynamic link library in the corresponding process.

The data processing module needs to process the above feature information and convert it into a feature matrix or a feature vector.

In the data processing module, the data to be processed are mainly the memory forensics information. The real-time state parameter information of the system is already stored in numerical form and needs no further processing, whereas the memory forensics information is an analysis report produced by the Volatility tool and does need further processing. The specific processing method is to extract the relevant digital feature information according to the format of the analysis report. To do this, the report strings are parsed with regular expressions.
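
A sketch of the regular-expression extraction described above. The report layout used here (whitespace-separated columns for process name, PID, thread count and handle count) is an assumed, simplified format chosen only for illustration; it is not claimed to match the actual output of any Volatility plugin, and `parse_report` is a hypothetical helper.

```python
import re

# Assumed line layout: <name> <pid> <threads> <handles>
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<pid>\d+)\s+(?P<threads>\d+)\s+(?P<handles>\d+)$"
)


def parse_report(text):
    """Return {(name, pid): [threads, handles]} for every line that matches ROW."""
    features = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, separator, or malformed line
        key = (match["name"], int(match["pid"]))
        features[key] = [int(match["threads"]), int(match["handles"])]
    return features


report = """Name          PID  Thds  Hnds
------------  ---  ----  ----
explorer.exe 1420    23   612
sample.exe   2044     3    41
"""
print(parse_report(report))
```

Lines that do not match the pattern, such as the header and separator rows, are skipped rather than treated as errors, so a partially damaged report still yields the rows that can be read.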

The main work of the data processing module is to convert the data into feature matrices or feature vectors. The feature matrices fed to the neural network must all have the same size, so this size has to be determined first. Each row of every feature matrix should correspond to a unique process; the processes cannot be arranged randomly. During training, the neural network compares how the corresponding process changes from sample to sample in order to judge whether a sample is malicious. If the rows of each matrix were arranged randomly, the neural network could not make this comparison and accuracy would suffer. The process represented by each row must therefore be unique, and the row order must be the same in every feature matrix. The PID is a unique identifier of a process and could serve as the row identifier; however, if every row of the matrix corresponded to one PID, the matrix would be very large and extremely sparse, which hinders training of the neural network, so using the PID as the row number is not feasible. In the invention, the PID, process name and process path information are used as the unique identification of a process and are used to determine the rows of the feature matrix. The PID of a process is not fixed, whereas its process name and path information are. Therefore, all feature information acquired with psutil is first traversed to extract every combination of process name and process path; these combinations determine the number of rows of the neural network's input matrix and fix in advance which process each row represents. The extracted data are then written into the corresponding positions of the matrix to obtain the input of the neural network.

Although the number of matrix rows can be determined in this way, the number of samples is large, so guaranteeing one row per process would still yield a very large number of rows. A distinction is therefore made between "common processes" and "single processes" across all samples. A "common process" is a process that occurs more than once across all samples; that is, it occurs not only in the system while a single sample is running but also while different samples are running. Such a process is not a sample's own process or sub-process; it generally exists in the system, is called while the sample runs, and may communicate with the sample's process or sub-processes. The common processes are arranged one by one at the top of the feature matrix, and if a common process is present, its data are filled into the corresponding row. A "single process" is a process that occurs only once and may be specific to a particular sample. If each single process were also given the same fixed row in every feature matrix, the number of rows would be too large, so the single processes of each sample are instead arranged below the common processes in that sample's feature matrix, in the order in which they appear in the process queue. Because the number of rows of the matrix is fixed, all samples need to be traversed to obtain the maximum number of single processes in any one sample. The number of rows of the feature matrix is: the total number of common processes + the maximum number of single processes.
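
The row layout described above can be sketched as follows. The function names and the use of (process name, process path) tuples as keys are assumptions made for illustration; a process is counted as "common" when it appears in more than one sample.

```python
from collections import Counter


def build_layout(samples):
    """samples: one list of process keys, e.g. (name, path) tuples, per sample."""
    seen_in = Counter()
    for procs in samples:
        seen_in.update(set(procs))  # count samples, not repeats within one sample
    common = sorted(key for key, n in seen_in.items() if n > 1)
    common_row = {key: row for row, key in enumerate(common)}
    max_single = max(
        (sum(1 for key in set(procs) if seen_in[key] == 1) for procs in samples),
        default=0,
    )
    # rows = number of common processes + largest number of single processes
    return common_row, len(common) + max_single


def rows_for_sample(procs, common_row):
    """Map each process of one sample to its row index in the feature matrix."""
    rows, next_single = {}, len(common_row)
    for key in dict.fromkeys(procs):  # keep first-seen order, drop duplicates
        if key in common_row:
            rows[key] = common_row[key]
        else:
            rows[key] = next_single   # single processes go below the common ones
            next_single += 1
    return rows


samples = [
    [("svchost.exe", "C:/Windows/System32"), ("a.exe", "C:/tmp"), ("a_child.exe", "C:/tmp")],
    [("svchost.exe", "C:/Windows/System32"), ("b.exe", "C:/tmp")],
]
common_row, n_rows = build_layout(samples)
print(n_rows)                                   # 3
print(rows_for_sample(samples[0], common_row))
```

With this layout, a common process occupies the same row in every sample's matrix, while the rows below it are reused by whichever single processes a given sample produces.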

Because of differences in units, these data may differ greatly in magnitude, so they need to be normalized. They also need to be weighted, because they contribute to the result to different extents. The weighting uses the random forest algorithm from machine learning: for each column of the matrix in turn, a random forest is trained on that column to obtain a result accuracy; the accuracies are converted proportionally into weights; the data in each column are multiplied by the corresponding weight; and the columns of the matrix are arranged in descending order of weight. This weighting facilitates the subsequent feature fusion.
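
A minimal sketch of the normalization and weighting step. Two assumptions are made here: min-max scaling is used because the text does not name a normalization method, and the per-column accuracies are taken as given inputs (in the text they come from training a random forest on each column, which is omitted). The function names are hypothetical.

```python
def normalize_columns(matrix):
    """Min-max scale each column to [0, 1]; constant columns become 0."""
    scaled_cols = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_cols.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def weight_and_sort(matrix, accuracies):
    """Multiply each column by its proportional weight, then sort columns by weight."""
    total = sum(accuracies)
    weights = [a / total for a in accuracies]          # accuracy -> proportional weight
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    weighted = [[row[j] * weights[j] for j in order] for row in matrix]
    return weighted, [weights[j] for j in order], order


raw = [[0, 10], [5, 30], [10, 20]]       # rows = processes, columns = parameters
accuracies = [0.6, 0.9]                  # assumed per-column random forest accuracies
weighted, weights, order = weight_and_sort(normalize_columns(raw), accuracies)
print(order)  # column 1 has the larger weight, so it comes first
```

Returning `order` alongside the weighted matrix makes it possible to apply the same column permutation to every sample, which the fixed matrix layout requires.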

The dynamic link library information also needs to be converted into a feature vector. Because it is not known in advance whether each process is malicious, the processes are clustered according to their dynamic link libraries using an unsupervised machine learning method. First, the TF-IDF algorithm is used to calculate the contribution degree of each dynamic link library to the current process, and the dynamic link libraries whose contribution degree is greater than a set threshold are retained; the threshold can be set empirically. The numbers of occurrences of the retained dynamic link libraries in the current process form the initial vector of that process, with 0 recorded for a library that does not occur. The initial vectors of the different processes are then clustered with the k-means++ algorithm to obtain a process clustering result, which indicates whether each process is likely malicious or likely benign and serves as the preliminary class label of the process. The preliminary class labels of all processes form a one-dimensional vector, which is the dynamic link library feature vector.
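
The TF-IDF screening and the construction of the initial vectors can be sketched as follows; the clustering step is omitted. Several details are assumptions, since the text does not fix them: each process is treated as a document and each DLL as a term, the plain `tf * log(N / df)` variant is used, and a DLL is retained if its score exceeds the threshold in at least one process. The names and sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(dll_counts):
    """dll_counts: {process: Counter({dll: occurrences})} -> {process: {dll: score}}."""
    n_procs = len(dll_counts)
    doc_freq = Counter()
    for counts in dll_counts.values():
        doc_freq.update(counts.keys())   # in how many processes each DLL appears
    scores = {}
    for proc, counts in dll_counts.items():
        total = sum(counts.values())
        scores[proc] = {
            dll: (n / total) * math.log(n_procs / doc_freq[dll])
            for dll, n in counts.items()
        }
    return scores


def initial_vectors(dll_counts, threshold):
    """Keep DLLs scoring above threshold; build one occurrence vector per process."""
    scores = tfidf_scores(dll_counts)
    kept = sorted({d for s in scores.values() for d, v in s.items() if v > threshold})
    vectors = {p: [c.get(d, 0) for d in kept] for p, c in dll_counts.items()}
    return kept, vectors


dll_counts = {
    "explorer.exe": Counter({"kernel32.dll": 3, "user32.dll": 2}),
    "notepad.exe": Counter({"kernel32.dll": 2, "user32.dll": 1}),
    "sample.exe": Counter({"kernel32.dll": 1, "wininet.dll": 4}),
}
kept, vectors = initial_vectors(dll_counts, threshold=0.1)
print(kept)     # ['user32.dll', 'wininet.dll']
print(vectors)
```

In this example `kernel32.dll` is loaded by every process, so its inverse document frequency is zero and it is screened out, which matches the idea that ubiquitous libraries carry little discriminating information.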

The processing flow of the data processing module is as follows:

(1) Converting the feature data into digital features: the goal is to convert the memory forensics information obtained by the Volatility tool into digital features for later use.

(2) Determining the size of the feature matrix: firstly, counting process information, determining an identification method of each process, then counting common processes and single processes, determining the row number of a matrix, and then writing characteristic data into the matrix.

(3) Feature weighting: first, to avoid large numerical differences, all data are normalized. The contribution degree of each feature to the classification result is then calculated using the random forest, the features are weighted according to their contribution values, and the columns are arranged in descending order of weight.

(4) Converting the dynamic link library information into a feature vector: the dynamic link libraries with larger contribution degrees are screened out using the TF-IDF algorithm and then clustered using a k-means algorithm. The clustering result approximately represents whether the process is malicious or benign; that is, it is the preliminary class label of the current process. The preliminary class labels of all processes form a one-dimensional vector, which is the dynamic link library feature vector.

Training model module:

The training model module uses a convolutional neural network to train on the feature information, obtains a trained model, and predicts the result. Convolutional neural networks were originally used for image processing. Compared with conventional algorithms, they reduce preprocessing and can process raw data directly, which makes them very convenient for extracting and analyzing features. The invention likewise uses a convolutional neural network, drawing on methods from the field of image processing. The various kinds of data generated while a malicious program runs are analyzed separately, fused at the feature level, and then learned, so that the network obtains more feature information and achieves a better classification result. The convolutional neural network has two inputs: one is the state parameter information of the running system, and the other is the memory forensics information; they are obtained with the open-source psutil and Volatility tools, respectively. Each of the two types of information is converted into its own feature matrix and passed through its own convolutional and pooling layers. Feature fusion is then performed, after which the model continues with ordinary fully connected layers. The model comprises several convolutional layers and pooling layers, and the result is obtained through the fully connected layer.

To let the neural network make full use of the data generated while the sample program runs in the virtual machine, a neural network fusion approach is adopted. The purpose of fusion is to let the network receive different features: each group of features is extracted without influence from the others, and only at the classification stage does a single classifier receive all of them and complete the classification. There are two common fusion methods, concat and add, and both operate at the feature level. Concat fusion is shown in fig. 2: the two groups of features are joined along the channel dimension, so the number of channels increases while the information in each basic block is unchanged. Add fusion is shown in fig. 3: the features are added element by element, so, unlike concat, the number of channels is unchanged while the information in each basic block increases.
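
The difference between the two fusion methods can be illustrated on small feature maps. This sketch uses plain nested lists indexed as [channel][row][column], which is an assumed layout chosen for illustration; the function names are hypothetical.

```python
def shape(t):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(t), len(t[0]), len(t[0][0]))


def concat_fusion(a, b):
    """Join b's channels after a's; only the spatial sizes must match."""
    if shape(a)[1:] != shape(b)[1:]:
        raise ValueError("spatial sizes differ")
    return a + b  # list concatenation along the channel axis


def add_fusion(a, b):
    """Element-wise sum; channel count and spatial sizes must all match."""
    if shape(a) != shape(b):
        raise ValueError("shapes differ")
    return [
        [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]


a = [[[1, 2], [3, 4]]]          # 1 channel, 2 x 2
b = [[[10, 20], [30, 40]]]      # 1 channel, 2 x 2
print(shape(concat_fusion(a, b)))  # (2, 2, 2): channel count grows
print(add_fusion(a, b))            # [[[11, 22], [33, 44]]]: channel count unchanged
```

Concat only requires matching spatial sizes, whereas add requires identical shapes, which is one practical consideration when choosing between them.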

The feature fusion part first fuses the feature information of the system real-time parameter matrix with the feature information of the memory forensics matrix in concat mode or add mode to obtain an intermediate fusion result, and then fuses the intermediate fusion result with the dynamic link library feature vector in concat mode to obtain the fusion features.
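
The two-stage order of fusion can be sketched on flattened features. Treating the branch outputs as flat one-dimensional lists is an assumption made to keep the example short, and `fuse` is a hypothetical helper.

```python
def fuse(system_feat, memory_feat, dll_vector, mode="concat"):
    """Stage 1: fuse the two branch outputs; stage 2: concat the DLL feature vector."""
    if mode == "concat":
        intermediate = list(system_feat) + list(memory_feat)
    elif mode == "add":
        if len(system_feat) != len(memory_feat):
            raise ValueError("add fusion needs equal-length features")
        intermediate = [s + m for s, m in zip(system_feat, memory_feat)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return intermediate + list(dll_vector)  # the second stage is always concat


print(fuse([1, 2], [3, 4], [0, 1], mode="concat"))  # [1, 2, 3, 4, 0, 1]
print(fuse([1, 2], [3, 4], [0, 1], mode="add"))     # [4, 6, 0, 1]
```

The fused list is what would be passed to the fully connected layer; its length depends on the mode chosen for the first stage.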

Feature fusion in neural networks is mostly used for complex image processing. Concat fusion joins features in a spatial dimension, whereas add fusion changes the information itself, so add requires less computation than concat. Because the add operation fuses individual feature pixels, some information may be lost in the fusion process; concat simply joins the features, so this loss is not a concern. On the other hand, add fusion creates an association between the fused pieces of information, and this degree of association is stronger than with concat.

For the present invention, the two fusion modes achieve comparable results, and the mode can be chosen according to the application scenario: add fusion requires less computation than concat, whereas concat retains more feature information.

The processing flow of the training model module is as follows:

(1) Building the neural network: the network as a whole is built as a convolutional neural network, the two branches are fused before the fully connected layer, and either concat or add can be selected as the fusion method. Once the neural network has been constructed, it can be trained.

(2) Training the neural network: the three parts of feature information extracted in the invention are fed into the neural network in the set manner, and training is started to obtain a trained model.

(3) Evaluating the results: the test data are evaluated with the trained model to obtain an evaluation result, that is, the model is validated on the validation data set.

Another embodiment of the present invention further provides a cloud malicious program detection method based on deep learning, including the following steps:

The samples in the program sample set are initially program training samples; the program training samples include programs of known class and their class labels, the class labels being normal program and malicious program.

Extracting the system real-time state parameter information and the dynamic link library information, which specifically comprises the following steps: while the virtual machine runs a program sample, extracting the system real-time state parameter information and the dynamic link library information more than two times at a set time interval; the real-time parameter information comprises the state parameters corresponding to each process when the program sample is executed; the dynamic link library information comprises the number of occurrences of the dynamic link libraries corresponding to each process when the program sample is executed.

Converting the system real-time state parameter information into a system real-time parameter matrix, and extracting the digital feature information in the memory forensics information and converting it into a memory forensics matrix, which specifically comprises the following steps:

S301, counting the processes contained in the corresponding program samples, namely: counting the number of common processes and single processes, wherein a common process is a process that appears more than once across all program samples, and a single process is a process that appears only once across all program samples.

S303, filling the data corresponding to each process into the corresponding row according to the system real-time state parameter information to obtain the system real-time state parameter matrix; and extracting the digital feature information in the memory forensics information and filling the digital feature information corresponding to each process into the corresponding row to obtain the memory forensics matrix.

Converting the dynamic link library information into a dynamic link library feature vector, which specifically comprises the following steps:

The dynamic link library information includes the number of occurrences of the different dynamic link libraries in each process during execution of the program sample; the contribution degree of each dynamic link library to discriminating the current process is calculated using the TF-IDF algorithm, and the dynamic link libraries whose contribution degree is greater than a set threshold are retained; the numbers of occurrences of the retained dynamic link libraries in the current process form an initial vector, and the initial vectors of the different processes are clustered using a k-means algorithm to obtain the preliminary class labels of the different processes; the preliminary class labels of all processes form a one-dimensional vector, which is the dynamic link library feature vector.

The neural network model consists of a first feature extraction part, a second feature extraction part, a feature fusion module and a full connection layer; the first and second feature extraction portions are each composed of a convolutional layer and a pooling layer; the input of the first characteristic extraction part is a system real-time parameter matrix, and the output is characteristic information of the system real-time parameter matrix; the input of the second characteristic extraction part is a memory forensics matrix, and the output is the characteristic information of the memory forensics matrix; the feature fusion module is used for performing feature fusion on the output of the first feature extraction part and the output of the second feature extraction part and the feature vector of the dynamic link library; and the fusion features output by the feature fusion module pass through the full connection layer to obtain the classification output of the neural network model, namely the classification of the program sample.

S6, executing S2 and S3 on the program test sample to obtain its dynamic link library feature vector, system real-time parameter matrix and memory forensics matrix, and using the trained neural network model to determine whether a malicious program exists in the target virtual machine.

In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. The cloud malicious program detection system based on deep learning is characterized by comprising an information acquisition module, a data preprocessing module and a training model module;

the information acquisition module comprises a virtual machine, a program automatic execution script and a program sample set; the program sample set comprises program samples used in malicious program detection; the program automatic execution script is used for automatically executing the program sample in a virtual machine; running a program sample in the virtual machine each time, extracting system real-time state parameter information and dynamic link library information in the running process, storing a virtual machine memory snapshot after the program sample is executed, and analyzing the virtual machine snapshot to obtain memory forensics information; the system real-time state parameter information, dynamic link library information and internal memory forensics information obtained when each program sample is executed are sent to the data preprocessing module;

the data preprocessing module carries out the following data preprocessing: converting the dynamic link library information into a dynamic link library characteristic vector, converting the system real-time state parameter information into a system real-time parameter matrix, and extracting digital characteristic information in the internal memory forensics information to convert the digital characteristic information into an internal memory forensics matrix; the dynamic link library characteristic vector, the system real-time parameter matrix and the memory forensics matrix are sent to the training model module;

the training model module is used for constructing and training a neural network model in advance; the neural network model consists of a first feature extraction part, a second feature extraction part, a feature fusion module and a full connection layer; the first and second feature extraction parts are both composed of a convolutional layer and a pooling layer; the input of the first characteristic extraction part is a system real-time parameter matrix, and the output is characteristic information of the system real-time parameter matrix; the input of the second characteristic extraction part is a memory forensics matrix, and the output is the characteristic information of the memory forensics matrix; the feature fusion module is used for performing feature fusion on the output of the first feature extraction part and the output of the second feature extraction part and the feature vector of the dynamic link library to obtain fusion features; and the fusion characteristics pass through the full connection layer to obtain the classification output of the neural network model, namely the judgment result of whether the malicious program exists in the target virtual machine or not.

2. The system of claim 1, wherein the malicious program detection system comprises a model training mode and an actual measurement mode;

under the model training mode, the program sample set is collected to obtain program training samples;

the program training sample comprises known class programs and class labels thereof, wherein the class labels comprise normal programs and malicious programs;

the known class program trains a neural network model in the training model module through a dynamic link library characteristic vector, a system real-time parameter matrix and a memory forensics matrix obtained by the information acquisition module and the data preprocessing module in combination with the class label to obtain a trained neural network model;

under the actual measurement mode, the program samples are concentrated into program test samples which are unknown programs; and the unknown program obtains a judgment result whether the malicious program exists in the target virtual machine or not by using the trained neural network model through the dynamic link library characteristic vector, the system real-time parameter matrix and the memory forensics matrix obtained by the information acquisition module and the data preprocessing module.

3. The system according to claim 1 or 2, wherein the information acquisition module uses a relevant Python module to extract the system real-time state parameter information;

the information acquisition module analyzes the virtual machine snapshot by using the Volatility tool to obtain the memory forensics information.

4. The system according to claim 1 or 2, wherein each row of the system real-time parameter matrix and of the memory forensics matrix in the data preprocessing module corresponds to a process during execution of the program sample, and the in-row data are the system real-time parameters generated in the corresponding process or the digital features in the memory forensics information;

the converting the dynamic link library information into the dynamic link library feature vector specifically comprises the following steps:

5. The system of claim 1 or 2, wherein the feature fusion module first fuses feature information of a real-time parameter matrix of the system and feature information of a memory forensics matrix by a connection concat mode or an add mode to obtain an intermediate fusion result, and then fuses the intermediate fusion result with the feature vector of the dynamic link library by the connection concat mode to obtain a fusion feature.