CN108123962B

CN108123962B - Method for generating an attack graph by using Spark to implement the BFS algorithm

Info

Publication number: CN108123962B
Application number: CN201810055240.XA
Authority: CN
Inventors: 胡昌振; 吕坤; 黄竖骅; 曹宁远
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-01-19
Filing date: 2018-01-19
Publication date: 2020-07-10
Anticipated expiration: 2038-01-19
Also published as: CN108123962A

Abstract

The invention relates to a method for generating an attack graph by using Spark to implement the BFS algorithm, and belongs to the technical field of information security. The specific operation steps are: acquiring the network structure; acquiring the vulnerabilities existing in each host in the network system and establishing a host vulnerability information table; updating the host authority through a breadth-first search (BFS) algorithm; and drawing the attack graph.

Description

Method for generating an attack graph by using Spark to implement the BFS algorithm

Technical Field

The invention relates to a method for generating an attack graph by using Spark to implement the BFS algorithm, and belongs to the technical field of information security.

Background

At present, network security analysis mainly comprises rule-based methods and model-based methods; the attack graph is a model-based method. Network attack graphs arise because the vulnerabilities existing in a network are associated with one another: after one vulnerability has been exploited successfully, an attacker can more readily exploit the next one. An attack graph illustrates these relationships between vulnerabilities well.

Generating a network attack graph requires a large amount of data about the network, mainly host information and network structure information; host information includes software information, vulnerability information, and the like. Once the attack graph has been generated, it can be inspected to show visually the optimal attack path, that is, the path along which an attack proceeds fastest, and the optimal protection scheme, that is, the one that maintains the network with the least work.

The general problem currently encountered in generating attack graphs is that the system state space is too large: as host information increases, the information required to generate the attack graph grows exponentially, so the storage space required for the data becomes huge and the computation time also grows exponentially.

An existing method has the following main advantages: ① the network structure is split into several parts, and several machines jointly generate attack graphs and then merge them; ② the network structure graph is traversed by depth-first search, so that many information nodes need not be cached, which saves space.

The final application of the generated attack graph must be fully considered during network modeling and attack graph generation. For penetration testing, all attack paths must be found. For risk analysis or for finding the shortest attack path, the complexity or success probability of each atomic attack and the degree of damage caused by successfully exploiting a vulnerability must be considered. For guiding vulnerability patch management, the cost of patching each vulnerability must be calculated.

Therefore, the final application of the attack graph determines, to a certain extent, the model to be established and the generation method. The generation method in turn reflects the network model and the data structure of the vulnerability database. Many attack graph generation methods currently exist. To analyze, compare and evaluate their merits, the attack graph generation mechanism needs to be analyzed, attributes usable for comparing the methods identified, the generation methods classified, existing problems found, and possible research directions identified.

Disclosure of Invention

The invention aims to solve the problems of low processing speed, incomplete attack graph generation and the like in the existing attack graph generation method, and provides a method for generating an attack graph by using Spark to realize a BFS algorithm.

The purpose of the invention is realized by the following technical scheme.

The invention provides a method for generating an attack graph by using Spark to realize a BFS algorithm, which comprises the following specific operation steps:

Step one: acquiring a network structure.

Step 1.1: acquiring software applications of all hosts in a network system, and establishing a corresponding table of the software applications and the hosts.

The software application and host correspondence table comprises: host name and software application name.

Step 1.2: obtaining session links among all hosts in a network system, and establishing a session link table among the hosts. The inter-host session link table includes: a set of source host names and target host names.

Step 1.3: establishing a host authority table for storing the authority owned by each host in the network system. The host authority table includes: host name and host authority.

The initial state of the host authority table is empty.
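As an illustration only, the three tables of step one could be held in memory as follows. This is a minimal sketch, not part of the disclosed method: the Python names `software_table`, `session_links`, `host_authority` and `reachable_hosts`, and the sample entries `HostA`, `HostB` and `ExampleApp`, are assumptions introduced here.

```python
from collections import defaultdict

# Step 1.1: software application and host correspondence table
# (host name -> set of software application names). Sample entry is hypothetical.
software_table = defaultdict(set)
software_table["HostA"].add("ExampleApp")

# Step 1.2: inter-host session link table as (source host, target host) pairs.
# Sample link is hypothetical.
session_links = {("HostA", "HostB")}

# Step 1.3: host authority table (host name -> set of authorities), initially empty.
host_authority = defaultdict(set)


def reachable_hosts(source, links):
    """Return the hosts that have a session link from `source`, in sorted order."""
    return sorted(target for src, target in links if src == source)
```

Storing the links as pairs keeps one-way and mutual access distinct: a mutual link is simply two pairs, one in each direction.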

Step two: acquiring the vulnerabilities existing in each host in the network system, and establishing a host vulnerability information table. The host vulnerability information table includes: host name, vulnerability ID, pre-condition set and post-condition set.
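One row of the host vulnerability information table can be pictured as a small record with the four listed fields. The sketch below is an assumption about representation only: the class name `VulnRecord`, the helper `vulns_on`, and the sample row (`HostB`, `VULN-0001`) are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnRecord:
    """One row of the host vulnerability information table."""
    host: str                  # host name
    vuln_id: str               # vulnerability ID
    preconditions: frozenset   # pre-condition set
    postconditions: frozenset  # post-condition set


# Hypothetical sample table with a single row.
host_vuln_table = [
    VulnRecord("HostB", "VULN-0001",
               frozenset({"File Access"}), frozenset({"Root Right"})),
]


def vulns_on(host, table):
    """Return all vulnerability rows recorded for `host`."""
    return [row for row in table if row.host == host]
```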

Step three: updating the host authority through a breadth-first search (BFS) algorithm.

On the basis of the operations of step one and step two, the host authority is updated through the BFS algorithm. The specific operation is as follows:

Step 3.1: one or more hosts are set as initial hosts. A host that has a session link with an initial host is called a reachable host.

Step 3.2: establishing an available vulnerability information table, wherein the available vulnerability information table comprises: host name, vulnerability ID, pre-condition set and post-condition set.

The initial state of the available vulnerability information table is empty.

Step 3.3: judging whether each vulnerability existing in a reachable host is an available vulnerability, according to the authority owned by the initial host and the preconditions of that vulnerability. Specifically: when all preconditions of a vulnerability in the reachable host are satisfied by the authority owned by the initial host, the vulnerability is regarded as an available vulnerability and is added to the available vulnerability information table; at the same time, the post-conditions of the available vulnerability are taken as newly added authority of the reachable host and are added to the host authority table.
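The test in step 3.3 amounts to a subset check: a vulnerability is available when its pre-condition set is contained in the authority set of the initial host. The following sketch shows that check under assumed names (`is_available`, `apply_step_3_3`) and an assumed tuple layout `(vuln_id, pre, post)`; none of these identifiers appear in the patent.

```python
def is_available(preconditions, source_authority):
    """True when every precondition is covered by the initial host's authority."""
    return set(preconditions) <= set(source_authority)


def apply_step_3_3(source, target, vulns, host_authority, available_table):
    """Check the vulnerabilities of one reachable host against one initial host.

    vulns: list of (vuln_id, pre, post) tuples for `target` (assumed layout).
    host_authority: dict mapping host name -> set of authorities (updated in place).
    available_table: list of rows (updated in place).
    Returns the IDs of the vulnerabilities newly added to the available table.
    """
    found = []
    for vuln_id, pre, post in vulns:
        if not is_available(pre, host_authority.get(source, set())):
            continue
        row = (target, vuln_id, frozenset(pre), frozenset(post))
        if row not in available_table:
            available_table.append(row)
            found.append(vuln_id)
        # The post-conditions become newly added authority of the reachable host.
        host_authority.setdefault(target, set()).update(post)
    return found
```

Under this reading a vulnerability with an empty pre-condition set is always available, since the empty set is a subset of any authority set.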

To increase the running speed, step 3.3 may instead use the distributed functionality of the Spark computation engine: the information of multiple initial hosts is distributed to different Spark clusters, and each Spark cluster judges whether the vulnerabilities existing in the reachable hosts are available vulnerabilities, according to the authority owned by the initial host and the preconditions of those vulnerabilities. Specifically: when all preconditions of a vulnerability in a reachable host are satisfied by the authority owned by the initial host, the vulnerability is regarded as an available vulnerability, and its information is returned to the Spark host. On the Spark host, the available vulnerability is added to the available vulnerability information table; at the same time, the post-conditions of the available vulnerability are taken as the authority of the reachable host and are added to the host authority table.
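The distribute, check, and return pattern described above can be sketched locally. The example below deliberately uses a Python thread pool as a stand-in for the Spark workers so that it runs without a cluster; it is not Spark code. The names `check_one_initial_host` and `distributed_step_3_3` and the task layout are assumptions. On a real deployment the same shape could presumably be written with PySpark's `SparkContext.parallelize`, `RDD.flatMap` and `RDD.collect`, but the patent gives no code, so that mapping is also an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def check_one_initial_host(task):
    """Worker-side check for one initial host; returns its available rows.

    task: (source, source_authority, targets), where targets maps each
    reachable host to a list of (vuln_id, pre, post) tuples (assumed layout).
    """
    _source, source_authority, targets = task
    rows = []
    for target, vulns in targets.items():
        for vuln_id, pre, post in vulns:
            if set(pre) <= set(source_authority):
                rows.append((target, vuln_id, frozenset(pre), frozenset(post)))
    return rows


def distributed_step_3_3(tasks, max_workers=4):
    """Fan the tasks out to workers, then merge the results on the driver side."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check_one_initial_host, tasks))
    available, new_authority = [], {}
    for rows in results:
        for target, vuln_id, pre, post in rows:
            available.append((target, vuln_id, pre, post))
            new_authority.setdefault(target, set()).update(post)
    return available, new_authority
```

Only the read-only check runs on the workers; both tables are updated in one place after the results come back, which mirrors the text's statement that the tables are updated on the Spark host.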

Step 3.4: traversing the reachable hosts, taking the current reachable hosts as the initial hosts, and repeating the operations of step 3.3 to step 3.4 until no reachable host has any available vulnerability, at which point the operation ends.
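Steps 3.1 to 3.4 together form a level-by-level loop, which the following single-machine sketch tries to capture. It assumes one particular reading of the stopping rule: a reachable host becomes an initial host of the next round only when a new available vulnerability was found on it, and the loop stops when a round finds nothing new. The function name `bfs_update_authority` and the input layouts are hypothetical.

```python
def bfs_update_authority(initial_hosts, links, vulns_by_host, host_authority):
    """Level-by-level update of the host authority table.

    links: dict mapping source host -> iterable of reachable hosts.
    vulns_by_host: dict mapping host -> list of (vuln_id, pre, post) tuples.
    host_authority: dict mapping host -> set of authorities (updated in place).
    Returns the available vulnerability information table as a list of rows.
    """
    available = []
    seen = set()                      # (host, vuln_id) pairs already recorded
    frontier = list(initial_hosts)    # step 3.1: the initial hosts
    while frontier:
        next_frontier = []
        for source in frontier:
            source_authority = host_authority.get(source, set())
            for target in links.get(source, ()):
                for vuln_id, pre, post in vulns_by_host.get(target, ()):
                    if (target, vuln_id) in seen:
                        continue
                    if not set(pre) <= source_authority:
                        continue
                    # Step 3.3: record the vulnerability and grant its post-conditions.
                    seen.add((target, vuln_id))
                    available.append((target, vuln_id, frozenset(pre), frozenset(post)))
                    host_authority.setdefault(target, set()).update(post)
                    if target not in next_frontier:
                        next_frontier.append(target)
        # Step 3.4: reachable hosts with new findings become the next initial hosts.
        frontier = next_frontier
    return available
```

Because each (host, vulnerability) pair is recorded at most once, the loop terminates even when the session links contain cycles.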

Step four: drawing an attack graph.

The attack graph is drawn on the basis of the operation of step three, specifically:

All vulnerability IDs, pre-condition sets and post-condition sets in the available vulnerability information table are used as nodes of the attack graph. For each available vulnerability, a directed edge is drawn from its pre-condition set to its vulnerability ID, and a directed edge is drawn from its vulnerability ID to its post-condition set, which completes the attack graph.

The above steps yield the network attack graph of the known network together with the reachable authority.
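The drawing rule of step four can be expressed as building a node set and a directed edge set from the available vulnerability information table. The sketch below assumes that each condition set is a single node identified by its contents, so that two vulnerabilities sharing a condition set also share that node; the patent does not state this explicitly. The function name `build_attack_graph` and the node tags `"cond"` and `"vuln"` are hypothetical.

```python
def build_attack_graph(available_table):
    """Build (nodes, edges) from rows of (host, vuln_id, pre, post)."""
    nodes, edges = set(), set()
    for _host, vuln_id, pre, post in available_table:
        pre_node = ("cond", frozenset(pre))
        vuln_node = ("vuln", vuln_id)
        post_node = ("cond", frozenset(post))
        nodes.update([pre_node, vuln_node, post_node])
        edges.add((pre_node, vuln_node))    # pre-condition set -> vulnerability ID
        edges.add((vuln_node, post_node))   # vulnerability ID -> post-condition set
    return nodes, edges
```

When the post-condition set of one vulnerability equals the pre-condition set of another, the two are chained through the shared node, which is what makes multi-step attack paths visible in the graph.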

Advantageous effects

Compared with the prior art, the method for generating an attack graph by using Spark to implement the BFS algorithm has the following advantages:

① the invention adopts a breadth-first search (BFS) algorithm, which reduces the time needed to traverse the nodes;

② each BFS node performs distributed computation, which reduces the size of the data that must be transferred for distribution and thereby reduces the time complexity of the algorithm;

③ the invention uses multiple clusters of the Spark engine for distributed operation, increasing the speed several-fold;

④ the method does not need to split the network structure, so the generated attack paths are more comprehensive, with no omissions.

Drawings

FIG. 1 is a diagram of the network system structure in an embodiment of the present invention;

FIG. 2 is a schematic diagram of the operation flow of the method for generating an attack graph by using Spark to implement the BFS algorithm in an embodiment of the present invention;

FIG. 3 is the network attack graph in an embodiment of the present invention.

Detailed Description

The invention is described in detail below with reference to the drawings and an embodiment.

In this embodiment, the network system structure is shown in FIG. 1 and includes 8 hosts, H1 to H8. H1 can access H2 and H6; H2 can access H3 and H4; H4 and H5 can access each other; H5 and H6 can access each other; H5 and H7 can access each other; H5 and H8 can access each other; H6 and H7 can access each other; H6 and H8 can access each other; and H7 and H8 can access each other.
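The access relations just listed can be written down as an adjacency structure. The sketch below assumes that "can access" denotes a one-way link and "can access each other" denotes a link in both directions; the variable names `one_way`, `mutual` and `links` are introduced here for illustration.

```python
# One-way access relations from the embodiment's description (assumed directed).
one_way = [("H1", "H2"), ("H1", "H6"), ("H2", "H3"), ("H2", "H4")]

# Mutual access relations from the embodiment's description.
mutual = [
    ("H4", "H5"), ("H5", "H6"), ("H5", "H7"), ("H5", "H8"),
    ("H6", "H7"), ("H6", "H8"), ("H7", "H8"),
]

# Adjacency structure: host name -> set of hosts it can access.
links = {f"H{i}": set() for i in range(1, 9)}
for source, target in one_way:
    links[source].add(target)
for a, b in mutual:
    links[a].add(b)
    links[b].add(a)
```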

The operation flow of the method provided by the invention for generating an attack graph by using Spark to implement the BFS algorithm is shown in FIG. 2; the specific operation steps are as follows:

Step one: acquiring a network structure.

Step 1.1: acquiring software applications of all hosts in a network system, and establishing a corresponding table of the software applications and the hosts, as shown in table 1.

Table 1 Software application and host correspondence table

Host name	Software name
H1	Win10, Microsoft Office
H2	IE, Xunlei
H3	Win7, IE6.0
H4	MS Outlook change
H5	IE6.0
H6	Apache HTTP Server
H7	Win7, MS IIS6.0
H8	Win10, MySQL

Step 1.2: session links between hosts in a network system are obtained, and an inter-host session link table is established, as shown in table 2.

Table 2 Inter-host session link table

	H1	H2	H3	H4	H5	H6	H7	H8
H1	0	1	1	0	0	1	0	0
H2	0	0	1	1	0	0	0	0
H3	0	0	1	0	0	0	0	0
H4	0	0	0	1	1	0	0	0
H5	0	0	0	0	1	1	1	1
H6	0	0	0	0	1	1	1	1
H7	0	0	0	0	1	1	1	1
H8	0	0	0	0	1	1	1	1

The initial state of the host authority table is empty.

Step two: acquiring the vulnerabilities existing in each host in the network system, and establishing a host vulnerability information table, as shown in Table 3.

Table 3 Host vulnerability information table

Step 3.1: host H1 is set as the initial host. A host that has a session link with an initial host is called a reachable host. The authority of the initial host H1 is obtained and added to the host authority table.

The initial state of the available vulnerability information table is empty.

Step 3.3: using the distributed functionality of the Spark computation engine, the information of multiple initial hosts is distributed to different Spark clusters. On each Spark cluster, whether a vulnerability existing in a reachable host is an available vulnerability is judged according to the authority owned by the initial host and the preconditions of that vulnerability. When all preconditions of a vulnerability in the reachable host are satisfied by the authority owned by the initial host, the vulnerability is regarded as an available vulnerability, and its information is returned to the Spark host. On the Spark host, the available vulnerability is added to the available vulnerability information table; at the same time, the post-conditions of the available vulnerability are taken as newly added authority of the reachable host and are added to the host authority table.

For example: the authority of host H1 is File Access, and the reachable hosts of H1 are H2 and H6. The precondition of vulnerability CVE-2017-3004 in H2 is File Access, so the authority of the initial host satisfies the precondition of CVE-2017-3004 in H2; CVE-2017-3004 is therefore regarded as an available vulnerability, and its information is returned to the Spark host. On the Spark host, the available vulnerability CVE-2017-3004 is added to the available vulnerability information table; at the same time, the post-condition File/Memory Access of CVE-2017-3004 is taken as the authority of the reachable host H2, and the new authority is added to the host authority table.

The precondition of vulnerability CVE-2003-0352 in H6 is File Access, so the authority of the initial host satisfies the precondition of CVE-2003-0352 in H6; CVE-2003-0352 is therefore regarded as an available vulnerability, and its information is returned to the Spark host. On the Spark host, the available vulnerability CVE-2003-0352 is added to the available vulnerability information table; at the same time, the post-condition Root Right of CVE-2003-0352 is taken as the authority of the reachable host H6, and the new authority is added to the host authority table.
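The first round of this example can be replayed in a few lines. The host names, vulnerability IDs, preconditions and post-conditions are those stated in the example; treating the post-condition "File/Memory Access" as the two authorities File Access and Memory Access is an assumption, chosen because Table 4 lists them as separate rows for H2. The variable names are hypothetical.

```python
# Authority of the initial host H1, as stated in the example.
host_authority = {"H1": {"File Access"}}

# Vulnerabilities on the hosts reachable from H1: (vuln_id, pre, post).
# Splitting "File/Memory Access" into two authorities is an assumption.
reachable = {
    "H2": [("CVE-2017-3004", {"File Access"}, {"File Access", "Memory Access"})],
    "H6": [("CVE-2003-0352", {"File Access"}, {"Root Right"})],
}

available_table = []
for target, vulns in reachable.items():
    for vuln_id, pre, post in vulns:
        # Available when H1's authority covers every precondition.
        if pre <= host_authority["H1"]:
            available_table.append((target, vuln_id))
            host_authority.setdefault(target, set()).update(post)
```

After this round both vulnerabilities are in the available table, H2 holds File Access and Memory Access, and H6 holds Root Right.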

For example: when step 3.4 is executed for the first time, the reachable hosts H2 and H6 are traversed and taken as the initial hosts; the operations of step 3.3 to step 3.4 are then repeated until no reachable host has any available vulnerability, at which point the operation ends.

The operation of step three yields the host authority table shown in Table 4 and the available vulnerability information table shown in Table 5.

Table 4 Host authority table

Host name	Authority
H1	File Access
H2	File Access
H2	Memory Access
H3	Authorization
H4	User Right
H5	Root Right
H6	Root Right
H7	File Access
H8	File Access
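Table 4 holds one row per (host, authority) pair, so a host with several authorities, such as H2, appears more than once. A short sketch of grouping the rows per host follows; the row values are copied from Table 4, while the names `rows`, `authority_by_host` and `root_hosts` are introduced here for illustration.

```python
# Rows of Table 4 as (host name, authority) pairs.
rows = [
    ("H1", "File Access"), ("H2", "File Access"), ("H2", "Memory Access"),
    ("H3", "Authorization"), ("H4", "User Right"), ("H5", "Root Right"),
    ("H6", "Root Right"), ("H7", "File Access"), ("H8", "File Access"),
]

# Group the rows into one authority set per host.
authority_by_host = {}
for host, authority in rows:
    authority_by_host.setdefault(host, set()).add(authority)

# Example query: hosts on which Root Right was obtained.
root_hosts = sorted(h for h, a in authority_by_host.items() if "Root Right" in a)
```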

Table 5 Available vulnerability information table

Step four: drawing an attack graph.

Through the operations of the above steps, the network attack graph shown in fig. 3 is obtained.

Claims

1. A method for generating an attack graph by using Spark to implement a breadth-first search (BFS) algorithm, characterized in that the specific operation steps are as follows:

step one, acquiring a network structure;

step 1.1: acquiring software applications of all hosts in a network system, and establishing a corresponding table of the software applications and the hosts;

the software application and host correspondence table comprises: host name and software application name;

step 1.2: acquiring session links among hosts in a network system, and establishing a session link table among the hosts; the inter-host session link table includes: a source host name and target host name set;

step 1.3: establishing a host authority table for storing the authority owned by each host in the network system; the host authority table includes: host name and host authority;

the initial state of the host authority table is empty;

step two, acquiring vulnerabilities existing in each host in the network system, and establishing a host vulnerability information table; the host vulnerability information table includes: host name, vulnerability ID, pre-condition set and post-condition set;

step three, updating the host authority through a breadth-first search BFS algorithm;

updating the host authority through a breadth-first search BFS algorithm on the basis of the operations in the first step and the second step; the specific operation is as follows:

step 3.1: setting one or more hosts as initial hosts; a host having a session link with an initial host is called a reachable host;

step 3.2: establishing an available vulnerability information table, wherein the available vulnerability information table comprises: host name, vulnerability ID, pre-condition set and post-condition set;

the initial state of the available vulnerability information table is empty;

step 3.3: judging whether a vulnerability existing in a reachable host is an available vulnerability according to the authority owned by the initial host and the preconditions of the vulnerability existing in the reachable host, specifically: using the distributed functionality of the Spark computation engine, distributing the information of multiple initial hosts to different Spark hosts respectively, and judging, on each Spark host, whether the vulnerability existing in the reachable host is an available vulnerability according to the authority owned by the initial host and the preconditions of the vulnerability existing in the reachable host; when all preconditions of a vulnerability in the reachable host are satisfied by the authority owned by the initial host, the vulnerability is regarded as an available vulnerability, and the information of the available vulnerability is returned to the Spark host; adding the available vulnerability to the available vulnerability information table on the Spark host; at the same time, the post-conditions of the available vulnerability are taken as the authority of the reachable host, and the authority of the reachable host is added to the host authority table;

step 3.4: traversing the reachable hosts, taking the current reachable hosts as the initial hosts, and then repeating the operations of step 3.3 to step 3.4 until no new available vulnerability is found in any reachable host, whereupon the operation of step three ends;

step four, drawing an attack graph; the method specifically comprises the following steps:

all vulnerability IDs, pre-condition sets and post-condition sets in the available vulnerability information table are used as nodes of the attack graph; a directed edge is drawn from the pre-condition set of each available vulnerability to its vulnerability ID, and a directed edge is drawn from the vulnerability ID of each available vulnerability to its post-condition set, completing the drawing of the attack graph;

and obtaining the network attack graph of the known network through the operations of the first step to the fourth step.