CN116112211A - Knowledge-graph-based network attack chain restoration method

Info

Publication number: CN116112211A
Application number: CN202211564513.6A
Authority: CN
Inventors: 蒋风浪; 周运贤; 吕燕; 叶思迪; 张郁璇; 张轩诚; 易大勇
Original assignee: Zhuhai Hengqin Bringbuys Network Technology Co ltd
Current assignee: Zhuhai Hengqin Bringbuys Network Technology Co ltd
Priority date: 2022-12-07
Filing date: 2022-12-07
Publication date: 2023-05-12

Abstract

The invention discloses a knowledge-graph-based network attack chain restoration method and system. The method comprises the following steps: a knowledge extraction step, in which system logs and intelligence texts are parsed and packaged into JSON files in a unified format; an alarm identification step, in which alarm information in the logs is identified with the help of security equipment such as EDR (endpoint detection and response); a security knowledge graph construction step, in which the security knowledge graph is instantiated from the log data combined with external intelligence, and the alarm information is marked on the graph; a local attack chain identification step, in which an attack entry point is found from the alarm information and local attack chains are then identified on the security knowledge graph; an alarm false-positive elimination step, in which alarms are scored using external intelligence, the local attack chains are scored accordingly, and false positives are filtered according to an expert threshold; and a local attack chain merging step, in which existing attack chains are merged according to the entities they share, while potential associations among different attack chains are discovered. By introducing knowledge graph technology into traditional attack tracing, the invention adds semantic information about the interaction behaviors among system entities in the underlying logs, and thereby achieves more accurate and more readable attack chain restoration.

Description

Knowledge-graph-based network attack chain restoration method

Technical Field

The invention relates to the field of network security, in particular to a network attack chain restoration method.

Background

Network security is the basis for the reliable and normal operation of computers and network systems. It aims to prevent illegal intrusion and resource damage by malicious groups such as hackers, and it bears on important fields such as politics, the economy, national defense and the military. As the number of internet users continues to grow, the number of network-connected terminal devices has multiplied, network attacks from anonymous hackers or competitors around the world have become increasingly frequent, and new attack techniques keep emerging.

An APT (Advanced Persistent Threat) attack is a long-term, persistent network attack that uses advanced attack means against a specific target in order to steal highly sensitive information or take control of resources. Unlike traditional network attacks, APT attacks are characterized by latency, long duration and concealment; they are difficult to prevent and detect, which raises the difficulty of network security defense.

In recent years, researchers have focused on attack tracing. Attack tracing collects data such as traffic and logs, analyzes compromised assets and internal intelligence data, and identifies attack behaviors, so that the attack path, the techniques and tactics, and the real intention of the attacker can be restored to a certain extent, revealing the attack flow by which the attacker penetrated the system.

During attack tracing, one or more attack chains are generated that describe the context of the attacker's behavior. An attack chain is typically presented as a path or subgraph of a graph structure, in which nodes represent entities such as system processes, files, ports, vulnerability information and alarm information, and edges represent interactions between entities, such as a process reading a file or communicating with a particular port. When a security incident occurs, an investigator can restore the attacker's attack chain by mining the attack behaviors, thereby detecting and restoring the process by which the APT attack penetrated the system and revealing how the attacker attacked the system step by step and what impact was caused.

The attack chain restoration methods in wide use today mainly rely on a traceability graph to identify attack paths. A traceability graph is a set of system interaction events that describes the data flow, control flow and temporal relations among system entities as a directed graph. It captures the underlying logical relations among system entities and keeps a history of everything the system has executed. Compared with traditional attack detection methods, the traceability graph also stores system behaviors unrelated to the attack and therefore carries richer low-level information and execution history. This makes attack detection more effective, overcomes the narrow coverage of traditional methods, and is of great significance for tracing APT attacks.

Attack chain restoration is crucial to attack tracing as a whole. By analyzing the attack chain, the attacker's complete attack flow can be obtained, and more targeted protection or blocking measures can be formulated from it, realizing active defense. However, current attack tracing methods cannot provide stable and effective detection. The restored attack chains are prone to problems such as false alarms, dependency explosion and disconnection, and a complete attack chain cannot be restored, mainly for the following two reasons:

First, alarms are missed because the way data such as system logs is collected and the graph construction strategy are incomplete or faulty, or because the latency period of an APT attack is too long. Collecting logs over a long period of system operation incurs a high performance cost and affects the normal operation of the original business system. Therefore, while meeting the basic log volume required for forensic tracing, practical data probe collection schemes gather only part of the system logs. Because the logs are incomplete, a traceability graph built by a tracing analysis method that relies on the underlying logs may lack paths needed to restore an attack chain; the broken attack chains cannot be associated, so only several unconnected attack chains are obtained.

Second, the attack tracing process lacks high-level semantics and therefore cannot characterize high-level associations. High-level semantics carry attack information that describes the attack intention and can indicate the specific technique used at a given stage of the attack and the goal to be achieved. No knowledge base is introduced when the traceability graph is constructed, so only the interaction information of the underlying system processes and files is available. As a result, attack path identification can rely only on statistical information about nodes and edges, the semantics needed for analysis are missing, and the attacker's intention cannot be described. The tracing capability of the traceability graph is therefore weak, and the semantic guidance it offers to security experts for tracing inference is very limited.

Adding the guidance of expert knowledge to the attack tracing process is a technical approach urgently needed in the industry. In other words, technologies and methods for restoring network attack chains based on expert knowledge are currently lacking.

Disclosure of Invention

The invention provides a network attack chain restoration method based on a knowledge graph technology, aiming at enriching the semantics of an attack chain by using expert knowledge so as to improve the restoration precision of the attack chain in the tracing process.

The beneficial effects of the invention are as follows: the invention addresses the current lack of technologies and methods for restoring network attack chains based on expert knowledge. The knowledge-graph-based network attack chain restoration method enriches the semantics of the attack chain with expert knowledge and yields tracing analysis results with higher accuracy and readability.

Drawings

Fig. 1 is a security knowledge graph ontology framework.

Fig. 2 is a schematic flow chart of the attack chain completion method based on the knowledge graph.

Detailed Description

The technical scheme of the invention is further described in detail below with reference to fig. 2.

This embodiment provides a method that constructs a security knowledge graph, the ontology framework of which is shown in Fig. 1. The method is used to carry out attack chain restoration tasks in various scenarios and comprises the following steps:

Step 1, knowledge extraction: entities and relations are extracted from logs or intelligence texts; ETW logs on Windows systems, auditd logs on Linux systems, and open-source intelligence can be used.

In this step, auditd log data collected in an attack-and-defense exercise scenario is used, combined with intelligence from the official CAPEC website and the 18 attack tactics of The Unified Kill Chain.
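By way of non-limiting illustration, the packaging of a log record into a unified JSON format can be sketched as follows. The sketch assumes the common auditd record layout (`type=... msg=audit(<epoch>:<serial>): key=value ...`); the output field names (`source`, `record_type`, `timestamp`, `serial`, `fields`) and the sample record are hypothetical and are not prescribed by this embodiment.

```python
import json
import re

# Matches the "msg=audit(<epoch>:<serial>):" header of an auditd record.
_HEADER = re.compile(r"msg=audit\((?P<ts>\d+\.\d+):(?P<serial>\d+)\):")
# Matches key=value pairs; a value may be double-quoted.
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_auditd_line(line: str) -> dict:
    """Convert one auditd record into a dictionary in a unified (hypothetical) schema."""
    header = _HEADER.search(line)
    if header is None:
        raise ValueError("not an auditd record")
    # Collect every key=value pair that follows the header.
    fields = {key: value.strip('"')
              for key, value in _FIELD.findall(line[header.end():])}
    type_match = re.match(r"type=(\w+)", line)
    return {
        "source": "auditd",
        "record_type": type_match.group(1) if type_match else None,
        "timestamp": float(header.group("ts")),
        "serial": int(header.group("serial")),
        "fields": fields,
    }


# Illustrative record, not taken from the exercise data of this embodiment.
SAMPLE = ('type=SYSCALL msg=audit(1670400000.123:456): arch=c000003e syscall=59 '
          'success=yes exit=0 pid=1234 comm="bash" exe="/usr/bin/bash"')

record = parse_auditd_line(SAMPLE)
print(json.dumps(record, indent=2))
```

Records from other sources, such as ETW logs or intelligence texts, would need their own parsers that emit the same schema.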

Step 2, alarm identification: alarm events in the logs are identified using a suspicious information flow association identification method; a detection rule base built on expert models and EDR tools can be used.

In this step, the Wazuh EDR tool is introduced to help identify the alarm behaviors present in the auditd logs.

Step 3, security knowledge graph construction: the graph is instantiated in a NoSQL database; a Neo4j database can be used.

In this step, the knowledge graph is instantiated using the Neo4j database.

Step 4, local attack chain identification: the earliest alarm is taken as the attack entry, and a more precise, smaller-scale attack path is carved out of the system entity network of the graph, generating an attack subgraph.

In this step, the earliest alarm event not yet traversed is taken as the AOP (attack origin point), and a forward DFS is started from it to generate a subgraph rooted at that vertex.
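By way of non-limiting illustration, the forward DFS from the AOP can be sketched as follows. The sketch makes several simplifying assumptions that are not part of this embodiment: the graph is given as a list of edge dictionaries with hypothetical keys `src`, `dst`, `ts` and `alert`; the AOP is approximated by the source entity of the earliest alarm edge not yet covered; and timestamps are used only to choose that edge and to order the result.

```python
from collections import defaultdict


def identify_local_chains(edges):
    """Group alarm edges into local attack chains by forward DFS from each AOP.

    edges: list of dicts with keys src, dst, ts (timestamp), alert (bool).
    Returns a list of dicts {"aop": node, "alert_edges": [edge indices]}.
    """
    out_edges = defaultdict(list)
    for idx, edge in enumerate(edges):
        out_edges[edge["src"]].append(idx)

    # Alarm edges ordered from earliest to latest.
    alert_ids = sorted((i for i, e in enumerate(edges) if e["alert"]),
                       key=lambda i: edges[i]["ts"])
    covered = set()
    chains = []
    for first in alert_ids:
        if first in covered:
            continue  # already part of an earlier chain
        aop = edges[first]["src"]
        path, seen, stack = [], {aop}, [aop]
        while stack:  # iterative forward DFS
            node = stack.pop()
            for idx in out_edges[node]:
                if edges[idx]["alert"]:
                    path.append(idx)  # keep only edges that carry an alarm
                dst = edges[idx]["dst"]
                if dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        path.sort(key=lambda i: edges[i]["ts"])
        covered.update(path)
        chains.append({"aop": aop, "alert_edges": path})
    return chains


# Illustrative graph: two unrelated groups of activity.
example_edges = [
    {"src": "p1", "dst": "f1", "ts": 1, "alert": True},
    {"src": "f1", "dst": "p2", "ts": 2, "alert": False},
    {"src": "p2", "dst": "s1", "ts": 3, "alert": True},
    {"src": "p9", "dst": "f9", "ts": 4, "alert": True},
    {"src": "p1", "dst": "f2", "ts": 5, "alert": False},
]
local_chains = identify_local_chains(example_edges)
print(local_chains)
```

Branches that never reach an alarm contribute nothing to the result, which corresponds to stopping the traversal where no further alarm occurs.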

Step 5, false alarm elimination: the identified local attack chains are scored using an abnormal path scoring method, which distinguishes the danger level and the degree of suspicion of different attack chains, and false alarms that may exist in the attack chains are filtered based on expert experience thresholds.

In this step, UKC sub-chains are found with a method combining dynamic programming and a greedy strategy, modeled on the longest ascending sub-path problem; the identified local attack chains are then scored with the abnormal path scoring method, and the attack chain score is defined as:

[formula image not reproduced in the source text]

where T is the set of all longest subsequences into which the ADP is divided according to the UKC model, UKCⁱ denotes the i-th UKC sub-chain in T, len(UKCⁱ) denotes the length of that sub-chain, and a further symbol, whose image is likewise missing, denotes the j-th alarm event on the sub-chain.
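By way of non-limiting illustration, finding a longest UKC-conforming sub-chain can be sketched as a longest-increasing-subsequence computation by dynamic programming. The sketch assumes that each alarm in the ADP has already been mapped to an integer UKC stage index, that alarms are listed in time order, and that a conforming sub-chain requires strictly increasing stage indices; these are assumptions of the sketch, and the greedy component mentioned above is not shown.

```python
def longest_ukc_subchain(stages):
    """Return positions of one longest strictly increasing subsequence.

    stages: UKC stage indices of the alarms of one ADP, in time order.
    """
    n = len(stages)
    if n == 0:
        return []
    best_len = [1] * n   # best_len[i]: longest sub-chain ending at position i
    prev = [-1] * n      # prev[i]: predecessor of i in that sub-chain
    for i in range(n):
        for j in range(i):
            if stages[j] < stages[i] and best_len[j] + 1 > best_len[i]:
                best_len[i] = best_len[j] + 1
                prev[i] = j
    # Walk back from the end of a longest sub-chain.
    end = max(range(n), key=lambda i: best_len[i])
    chain = []
    while end != -1:
        chain.append(end)
        end = prev[end]
    return chain[::-1]


# Illustrative stage indices; the numbering of the 18 UKC tactics is assumed.
example_stages = [3, 5, 4, 5, 12, 12, 17]
subchain = longest_ukc_subchain(example_stages)
print(subchain, [example_stages[i] for i in subchain])
```

The quadratic dynamic program is chosen for clarity; an ADP is small compared with the full graph, so the cost is unlikely to matter in this setting.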

Step 6, local attack chain merging: attack chains that share an entity or have an implicit association are merged into one attack chain to obtain the restored attack chain.

In this step, attack chains that share an entity or have an implicit association are merged into one.
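By way of non-limiting illustration, merging local attack chains that share an entity can be sketched with a union-find structure. The sketch assumes each chain is represented as a set of entity identifiers (a hypothetical representation) and covers only shared entities; discovering implicit associations is outside its scope.

```python
def merge_chains(chains):
    """Merge chains (sets of entity ids) that share at least one entity."""
    parent = list(range(len(chains)))

    def find(i):
        # Find the representative of chain i, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # entity id -> index of the first chain that contained it
    for idx, chain in enumerate(chains):
        for entity in chain:
            if entity in owner:
                parent[find(idx)] = find(owner[entity])  # union the two groups
            else:
                owner[entity] = idx

    merged = {}
    for idx, chain in enumerate(chains):
        merged.setdefault(find(idx), set()).update(chain)
    return list(merged.values())


# Illustrative chains: the first, second and fourth are linked through f1 and p2.
example_chains = [{"p1", "f1"}, {"f1", "p2"}, {"p9", "f9"}, {"p2", "s1"}]
merged_chains = merge_chains(example_chains)
print(merged_chains)
```

Merging is transitive: two chains with no entity in common still end up together when a third chain overlaps both.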

This embodiment also provides a knowledge-graph-based network attack chain restoration system, comprising: a knowledge extraction unit, an alarm identification unit, a security knowledge graph construction unit, a local attack chain identification unit, an alarm false-positive elimination unit, and a local attack chain merging unit.

The knowledge extraction unit parses system logs and intelligence texts and packages them into JSON files in a unified format.

The alarm identification unit identifies alarm events in the logs using a suspicious information flow association identification method; a detection rule base built on expert models and EDR tools can be used.

The security knowledge graph construction unit instantiates the graph in a NoSQL database; a Neo4j database can be used.

The local attack chain identification unit takes the earliest alarm as the attack entry and carves a more precise, smaller-scale attack path out of the system entity network of the graph, generating an attack subgraph.

The alarm false-positive elimination unit scores the identified local attack chains using an abnormal path scoring method, distinguishes the danger level and the degree of suspicion of different attack chains, and filters false alarms that may exist in the attack chains based on expert experience thresholds.

The local attack chain merging unit merges attack chains that share an entity or have an implicit association into one attack chain to obtain the restored attack chain.

Although the invention has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and is not intended to limit the application of the invention. The scope of the invention is defined by the appended claims and may include various modifications, alterations and equivalents of the invention without departing from the scope and spirit of the invention.

Claims

1. A knowledge-graph-based network attack chain restoration method, characterized by comprising the following steps:

2. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 2, alarm events are normalized mainly using the "techniques" (Techniques) defined by ATT&CK. For example, if the logs show attempts to destroy large amounts of data and files on a particular system or network, thereby interrupting the availability of system, service and network resources, the corresponding behavior is annotated in the security knowledge graph as the ATT&CK attack technique "Data Destruction".
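By way of non-limiting illustration, normalizing an alarm to an ATT&CK technique can be sketched with a simple counting rule. The rule itself (a process issuing at least a threshold number of DELETE events), the threshold value and the event schema (`process`, `relation`, `target`) are hypothetical; they stand in for the expert detection rules and EDR output actually used and are not defined by the claims.

```python
from collections import Counter

# Hypothetical threshold: number of DELETE events by one process that
# triggers the label. A real rule base would tune or replace this.
DELETE_THRESHOLD = 3


def label_data_destruction(events, threshold=DELETE_THRESHOLD):
    """Label processes with many DELETE events as ATT&CK 'Data Destruction'."""
    deletes = Counter(e["process"] for e in events if e["relation"] == "DELETE")
    return [
        {
            "process": process,
            "technique": "Data Destruction",
            "technique_id": "T1485",
            "evidence_count": count,
        }
        for process, count in sorted(deletes.items())
        if count >= threshold
    ]


# Illustrative events, not taken from real logs.
example_events = [
    {"process": "proc:wiper", "relation": "DELETE", "target": "file:/data/a"},
    {"process": "proc:wiper", "relation": "DELETE", "target": "file:/data/b"},
    {"process": "proc:wiper", "relation": "DELETE", "target": "file:/data/c"},
    {"process": "proc:editor", "relation": "DELETE", "target": "file:/tmp/x"},
    {"process": "proc:editor", "relation": "WRITE", "target": "file:/tmp/y"},
]
normalized_alerts = label_data_destruction(example_events)
print(normalized_alerts)
```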

3. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 3, the ontology framework of the knowledge graph is divided into two main modules. The first is an endogenous information module, read mainly from system logs; in this module, the three kinds of system entities stored in the log data (processes, files and communications) serve as nodes of the graph, and the system calls among entities are summarized into five relation types: CREATE, DELETE, EXECUTE, READ and WRITE. The second is an external intelligence module, read mainly from intelligence bases and knowledge bases; it stores the alarm nodes matched against open-source knowledge bases and intelligence bases, including kill chain nodes designed from the 18 tactics defined by The Unified Kill Chain and threat intelligence nodes designed from CAPEC (Common Attack Pattern Enumeration and Classification). These nodes correspond to alarm behaviors in the logs, that is, to particular system call relations in the endogenous information module. The two modules are connected through the alarm information matched from the logs, that is, through single-step attack behaviors.
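By way of non-limiting illustration, the two-module ontology can be sketched as an in-memory structure. The class `SecurityGraph`, its method names and the identifiers in the example are hypothetical, and the in-memory container is only a stand-in for the graph database; the entity types and the five relation types are those named above.

```python
from dataclasses import dataclass, field

ENTITY_TYPES = {"process", "file", "communication"}
RELATION_TYPES = {"CREATE", "DELETE", "EXECUTE", "READ", "WRITE"}


@dataclass
class SecurityGraph:
    nodes: dict = field(default_factory=dict)   # node id -> entity type
    edges: list = field(default_factory=list)   # (src, relation, dst, timestamp)
    alerts: dict = field(default_factory=dict)  # edge index -> external intelligence

    def add_entity(self, node_id, entity_type):
        """Endogenous module: register a process, file or communication node."""
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.nodes[node_id] = entity_type

    def add_event(self, src, relation, dst, timestamp):
        """Endogenous module: record a system call as a typed edge."""
        if relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be registered first")
        self.edges.append((src, relation, dst, timestamp))
        return len(self.edges) - 1

    def mark_alert(self, edge_index, ukc_tactic, capec_id):
        """External module: attach kill chain and CAPEC information to an edge."""
        self.alerts[edge_index] = {"ukc_tactic": ukc_tactic, "capec_id": capec_id}


graph = SecurityGraph()
graph.add_entity("proc:bash", "process")
graph.add_entity("file:/tmp/payload", "file")
edge_id = graph.add_event("proc:bash", "WRITE", "file:/tmp/payload", 1670400000.0)
# The CAPEC identifier is a placeholder, not a real mapping.
graph.mark_alert(edge_id, ukc_tactic="Execution", capec_id="CAPEC-placeholder")
print(graph.edges[edge_id], graph.alerts[edge_id])
```

Attaching the alarm to an edge rather than to a node mirrors the statement that an alarm corresponds to a system call relation and links the two modules.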

4. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 4, the earliest alarm event is defined as the attack origin point AOP (Attack Origin Point). The AOP corresponds to a system entity that satisfies two conditions: (1) the entity is the process that executed the alarm event; (2) backtracking from the alarm event in the knowledge graph encounters no other alarm event. A local attack chain ADP (Alert Dependency Path) is a subgraph of the security graph derived from the attack origin point AOP, with alarm events on each of its paths, and corresponds to a short-term, strongly coherent attack carried out by the attacker. A forward traversal of the graph is started with the AOP as the entry point, and the edges carrying alarm events that are found during the traversal are added to the path. If no alarm event occurs beyond a given edge, the traversal stops at that edge. By repeating this process until every alarm event has been traversed at least once, all ADPs in the graph can be identified.

5. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 5, the score of a single alarm is defined as:

ThreatScore(technique) = (a*SeverityScore) + (b*LikelihoodScore)

where a and b are weighting coefficients, and SeverityScore and LikelihoodScore are derived from the threat level (severity) and the likelihood of attack (Likelihood of Attack), two risk assessment indicators contained in the CAPEC information, each quantified to 1-5 according to the levels Very Low, Low, Medium, High and Very High.
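By way of non-limiting illustration, the single-alarm score can be computed as follows. The sketch assumes the five levels map to 1-5 in ascending order and uses equal weights a = b = 0.5; the weight values are hypothetical, since the claim does not fix them.

```python
# Assumed mapping of the five CAPEC rating levels onto the 1-5 scale.
LEVELS = {"Very Low": 1, "Low": 2, "Medium": 3, "High": 4, "Very High": 5}


def threat_score(severity, likelihood, a=0.5, b=0.5):
    """ThreatScore = a * SeverityScore + b * LikelihoodScore.

    a and b default to hypothetical equal weights.
    """
    return a * LEVELS[severity] + b * LEVELS[likelihood]


example_score = threat_score("High", "Medium")
print(example_score)  # 0.5 * 4 + 0.5 * 3 = 3.5
```

With weights that sum to 1 the score stays within the same 1-5 range as its inputs, which makes an expert threshold easier to interpret.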

6. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 5, taking into account the partial order among alarm events, subsequences of the attack flow of the ADP are searched according to the tactical stages of the UKC (Unified Kill Chain), and one of the longest subsequences conforming to the UKC is selected.

The score of a UKC sub-chain is defined as:

[formula image not reproduced in the source text]

where UKCⁱ denotes the i-th UKC sub-chain,

7. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 5, a penalty factor is introduced to characterize the effect of potential false positives on the ADP score; the score is reduced according to the number of times UKC stages are repeated in the attack flow of the ADP.

The penalty factor is defined as:

[formula image not reproduced in the source text]

where len(ADP) denotes the length of the ADP, and n_i denotes the number of alarms in the ADP corresponding to the i-th UKC stage.

8. The knowledge-graph-based network attack chain restoration method according to claim 1, wherein in step 5, the score of the ADP is:

[formula image not reproduced in the source text]

where T is the set of all longest subsequences into which the ADP is divided according to the UKC model.