CN109040027B

CN109040027B - Active prediction method of network vulnerability node based on gray model

Info

Publication number: CN109040027B
Application number: CN201810763946.1A
Authority: CN
Inventors: 胡昌振; 吕坤; 高程昕
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-07-12
Filing date: 2018-07-12
Publication date: 2020-08-18
Anticipated expiration: 2038-07-12
Also published as: CN109040027A

Abstract

The invention relates to an active prediction method of a network vulnerability node based on a gray model, and belongs to the technical field of information security. The method acquires real-time host information, topology information, vulnerability information and other characteristics of the network, determines the weight of each node in the network system by gray correlation analysis, and thereby completes the unified calculation of the observation data. The calculated state information is input into a gray prediction model, and the gray coefficients are determined by the least squares method to realize the prediction model. Finally, correlation analysis is performed between the situation increments of the network nodes that have not yet been reached and the prediction model curve, and the node with the closest situation increment is taken as the next predicted network vulnerability node.

Description

Active prediction method of network vulnerability node based on gray model

Technical Field

The invention relates to an active prediction method of a network vulnerability node based on a grey model, and belongs to the technical field of information security.

Background

With the rapid development of computer networks, security holes and hidden dangers in network information systems emerge endlessly, the types and number of network attacks multiply, and basic networks and information systems face severe security threats. Traditional information security is limited by technology and therefore mostly adopts a passive defense mode. However, with the advent of technologies such as big data analysis, SDN and security information collection, information system security monitoring has become increasingly accurate in analyzing security situations and in giving early warning of security events, and passive defense is gradually giving way to active defense. In this context, the study of active defense is attracting increasing interest.

The invention uses a CVE (Common Vulnerabilities and Exposures) compatible database. CVE is a dictionary that gives a common name to widely recognized information security vulnerabilities or weaknesses that have been exposed. It helps users share data across independent vulnerability databases and vulnerability assessment tools, which makes CVE a "key" for sharing security information. Using the CVE name of a vulnerability, the corresponding information can be quickly found in any other CVE-compatible database.

Disclosure of Invention

The invention combines network security situation awareness with a gray model and aims to provide an active prediction method of a network vulnerability node based on the gray model. The method acquires real-time characteristics such as host information, topology information and vulnerability information in the network, determines the weight of each node in the network system by gray correlation analysis, and thereby completes the unified calculation of the observation data. The calculated state information is input into the gray prediction model, and the gray coefficients are determined by the least squares method to realize the prediction model. Finally, correlation analysis is performed between the situation increments of the network nodes that have not yet been reached and the prediction model curve, and the node with the closest situation increment is taken as the next predicted network vulnerability node.

The purpose of the invention is realized by the following technical scheme.

The invention provides an active prediction method of a network vulnerability node based on a gray model, which comprises the following specific operation steps:

Step one: acquire the network security situation feature items and calculate the network state. Specifically:

Step 1.1: determine the security situation feature items of the network system. The network security situation is described in three dimensions from top to bottom: the operation situation dimension, the vulnerability situation dimension and the abnormal situation dimension. The feature items describing the operation situation dimension are: CPU utilization, memory utilization and disk read rate. The feature items describing the vulnerability situation dimension are: vulnerability type, vulnerability score, event type and identity authentication degree. The feature items describing the abnormal situation dimension are: number of attack sources, attack time, attack frequency and device online state. The network system therefore has 11 security situation feature items: CPU utilization, memory utilization, disk read rate, vulnerability type, vulnerability score, event type, identity authentication degree, number of attack sources, attack time, attack frequency and device online state.

Step 1.2: periodically acquire observation data of the security situation feature items of each single host in the network system at different moments as the research object. Calculate the mean value of the observation data of each feature item over all hosts at each moment, and use gray correlation analysis to determine the weight of each feature item's expressed value under the global action of the whole network system, thereby determining the influence weight of each network security feature item on the expression of the network global state. The specific steps are:

Step 1.2.1: calculate the mean value of the observation data of each security situation feature item over all hosts in the network system at each moment to obtain the observation matrix A, as shown in formula (1).

Here t denotes the t-th moment, t = 1, 2, 3, …; f_t(1), f_t(2), …, f_t(11) denote the mean observed values of the 11 security situation feature items at the t-th moment.

Step 1.2.2: apply the dimensionless processing of formula (2) to the observation matrix A to obtain the dimensionless observation matrix A₁, as shown in formula (3).

Here i = 1, 2, …, 11.

Step 1.2.3: take the first column vector of the dimensionless observation matrix A₁ as the observation vector and the other column vectors as comparison vectors. Calculate the correlation coefficient of each entry of each comparison vector by formula (4) to form the correlation coefficient matrix M, as shown in formula (5).

Here j = 2, 3, …, 11.

Step 1.2.4: obtain the relevance between any two network security situation feature items by formula (6).

Here k = 1, 2, …, 11; γ_(f(i),f(k)) denotes the relevance of the network security situation feature items f(i) and f(k); the value of γ_(f(i),f(1)) is calculated by formula (7); the value of γ_(f(k),f(1)) is calculated by formula (8).

Here T is the total number of time points taken.

Step 1.2.5: from the result of step 1.2.4, obtain the relevance matrix M′ among the network security situation feature items, as shown in formula (9).

The relevance matrix M′ is a non-negative symmetric matrix and, by the properties of such matrices, has a maximum-modulus eigenvalue, denoted λ, such that λC = M′C, where λ is non-negative and C is the corresponding eigenvector, C = [ω₁, ω₂, …, ω₁₁]^T; ω_i denotes the influence weight of the i-th network security feature item on the global state of the network, i = 1, 2, …, 11.

Through the operation of the step, the influence weight of each network security feature item on the network global state is obtained.
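The eigenvector extraction described above can be sketched in Python. This is a minimal illustration, assuming NumPy is available; the function name `principal_weights` and the 3×3 matrix are hypothetical and not taken from the patent, which works with 11 feature items.

```python
import numpy as np

def principal_weights(relevance):
    """Return (largest eigenvalue, matching eigenvector) of a symmetric
    non-negative relevance matrix."""
    m = np.asarray(relevance, dtype=float)
    # eigh is intended for symmetric matrices; eigenvalues are ascending.
    eigvals, eigvecs = np.linalg.eigh(m)
    lam = eigvals[-1]
    c = eigvecs[:, -1]
    # An eigenvector's sign is arbitrary; flip it so the weights are non-negative.
    if c.sum() < 0:
        c = -c
    return lam, c

# Hypothetical relevance matrix (illustrative values only).
M_prime = np.array([[1.0, 0.6, 0.7],
                    [0.6, 1.0, 0.5],
                    [0.7, 0.5, 1.0]])
lam, weights = principal_weights(M_prime)
```

For a non-negative symmetric matrix the largest eigenvalue is also the one of largest modulus, which is why taking the last value returned by `eigh` matches the description.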

Step 1.3: acquire observation data of all network security feature items of all hosts in the network system at different time points, obtain the single-host situation of each host at each time point from the influence weights obtained in step 1.2, and obtain the importance ratio between the hosts in the network system by gray correlation analysis.

Step 1.3.1: the host situation values of each host in the network system at different time points form the host situation calculation matrix B, as shown in formula (10).

Here h denotes the h-th host, h = 1, 2, 3, …; s_t(1), s_t(2), …, s_t(h) denote the host situation values of the 1st, 2nd, …, h-th hosts in the network system at the t-th moment, calculated by formula (11).

Here s_t(h) denotes the host situation value of the h-th host at the t-th moment; f_th(i) denotes the observed value of the i-th network security situation feature item f(i) on the h-th host at moment t.

Step 1.3.2: apply the dimensionless processing of formula (12) to the single-host situation matrix B to obtain the dimensionless single-host situation calculation matrix B₁, as shown in formula (13).

Step 1.3.3: take the first column vector of the dimensionless single-host situation calculation matrix B₁ as the observation vector and the other column vectors as comparison vectors. Calculate the correlation coefficient of each entry of each comparison vector by formula (14) to form the correlation coefficient matrix H, as shown in formula (15).

Here m = 1, 2, 3, ….

Step 1.3.4: obtain the relevance between any two hosts by formula (16).

Here q = 1, 2, 3, …; γ_(h(m),h(q)) denotes the relevance of the network hosts h(m) and h(q); the value of γ_(h(m),h(1)) is calculated by formula (17); the value of γ_(h(q),h(1)) is calculated by formula (18).

Step 1.3.5: from the result of step 1.3.4, obtain the relevance matrix H′ between the network hosts, as shown in formula (19).

Because the inter-host relevance matrix H′ is a non-negative symmetric matrix, by the properties of such matrices it has a maximum-modulus eigenvalue, denoted λ′, such that λ′C′ = H′C′, where λ′ is non-negative and C′ is an eigenvector. The eigenvalues and eigenvectors of the relevance matrix H′ are computed with MATLAB, and the eigenvector corresponding to the maximum-modulus eigenvalue λ′ is denoted E, E ∈ C′, E = [e₁, e₂, …, e_h]^T; e_h denotes the importance of the h-th host in the network, h = 1, 2, 3, ….

Through the operation of this step, the importance weight of each network host in the network as a whole is obtained.

Step 1.4: from the influence weight of each network security feature item on the global state of the network obtained in step 1.2 and the importance weight of each host in the network system obtained in step 1.3, obtain by formula (20) the overall situation history data of the network system at different moments, denoted S, S = (S₁, S₂, …, S_t).

S_t = ∑_h e_h × s_t(h)    (20)
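Formula (20) is a weighted sum over hosts and can be illustrated with a short Python sketch. The array layout (rows are moments, columns are hosts), the function name `overall_situation` and all numeric values are assumptions made for the example.

```python
import numpy as np

def overall_situation(host_situations, host_weights):
    """Compute S_t = sum over h of e_h * s_t(h) for every moment t."""
    s = np.asarray(host_situations, dtype=float)  # shape (T, H): one row per moment
    e = np.asarray(host_weights, dtype=float)     # shape (H,): importance weights e_h
    return s @ e

# Illustrative values only: 2 moments, 3 hosts.
situations = [[0.2, 0.4, 0.1],
              [0.3, 0.5, 0.2]]
importance = [0.5, 0.3, 0.2]
S = overall_situation(situations, importance)
```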

Step two: establish a gray prediction model of the network system from its overall situation history data and predict the situation of the network system at the next moment. Specifically:

Step 2.1: let X⁽⁰⁾ denote the initial sequence of the gray model, X⁽⁰⁾ = (x⁽⁰⁾(1), x⁽⁰⁾(2), …, x⁽⁰⁾(t)), where x⁽⁰⁾(1), x⁽⁰⁾(2), …, x⁽⁰⁾(t) denote the overall network situation values at moments 1, 2, …, t, respectively. The overall situation history data S of the network system obtained in step one is taken as the initial sequence X⁽⁰⁾: x⁽⁰⁾(1) = S₁, x⁽⁰⁾(2) = S₂, …, x⁽⁰⁾(t) = S_t.

Step 2.2: calculate by formula (21) the first-order accumulated generating sequence of the initial sequence X⁽⁰⁾, denoted X⁽¹⁾. X⁽¹⁾ = {x⁽¹⁾(1), x⁽¹⁾(2), …, x⁽¹⁾(t)}, where x⁽¹⁾(1), x⁽¹⁾(2), …, x⁽¹⁾(t) denote the sums of the overall network situation from moment 1 to moment 1, from moment 1 to moment 2, …, and from moment 1 to moment t, respectively.

x⁽¹⁾(t) = ∑_(k=1..t) x⁽⁰⁾(k)    (21)

The purpose of calculating the first-order accumulation generation sequence is to weaken the randomness and relevance of the original data items, model the overall trend from the global perspective, and facilitate the formation of a trend model, understanding and predicting situation changes.
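The first-order accumulation is a running sum, as the following Python sketch shows. It uses the overall situation history reported later in the embodiment as input; the variable names are chosen for the example.

```python
import numpy as np

# Overall situation history from the embodiment, used as the initial sequence.
x0 = np.array([0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334])

# First-order accumulation: x1[t] is the sum of x0[0..t].
x1 = np.cumsum(x0)

# Differencing adjacent accumulated values recovers the initial sequence
# from its second term onward.
recovered = np.diff(x1)
```

The result matches the generating sequence listed in the embodiment, (0.1541, 0.3443, 0.5562, 0.7789, 0.9829, 1.2163).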

Step 2.3: the first-order accumulated generating sequence X⁽¹⁾ accumulates the irregular historical data sequence so that it becomes an ascending sequence with an exponential growth law. The process of calculating the first-order accumulated generating sequence resembles the form of the gray differential equation of a first-order model. A first-order differential equation is therefore established for the first-order accumulated generating sequence X⁽¹⁾, as shown in formula (22).

Here a and b are parameters to be determined, and both range over the real numbers.

Integrating formula (22) and discretizing gives formula (23).

Here t′ = 1, 2, …, t-1.

From formula (21), the relationship shown in formula (24) is obtained.

x⁽¹⁾(t′+1) - x⁽¹⁾(t′) = x⁽⁰⁾(t′+1)    (24)

The general formula of the gray prediction model is obtained from formulas (23) and (24), as shown in formula (25).

Solving the integral term in formula (25) gives formula (26).

Denoting the solution of the integral term of formula (25) by z⁽¹⁾(t′+1) gives formula (27).

Substituting formula (27) into formula (25) yields formula (28).

x⁽⁰⁾(t′+1) = -a z⁽¹⁾(t′+1) + b    (28)

Substituting the terms of the first-order accumulated generating sequence X⁽¹⁾ and the initial sequence X⁽⁰⁾ into formula (28) and rearranging gives expression (29).

Step 2.4: determine the values of the parameters a and b in formula (29) by the least squares method and establish the prediction model of the network security situation. Specifically:

Step 2.4.1: set 3 substitute parameter vectors, among them those denoted by the symbols Y_t and G; their values are shown in formula (30).

Step 2.4.2: substituting formula (30) into formula (29) yields formula (31).

Step 2.4.3: solve formula (31) by the least squares method to obtain the parameter estimates of the gray prediction model.
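The least squares step can be sketched in Python for the relation in formula (28). Since formula (30) is not reproduced in this text, the layout of the design matrix below is an assumption, as is the use of the mean of adjacent accumulated values for z⁽¹⁾; the function name `gm11_fit` is hypothetical.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate (a, b) in x0(k+1) = -a * z1(k+1) + b by ordinary least squares."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                      # first-order accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])           # assumed form of z1: mean of neighbours
    # One row per equation: [-z1(k+1), 1] @ [a, b] = x0(k+1)  (assumed layout)
    design = np.column_stack([-z1, np.ones_like(z1)])
    targets = x0[1:]
    (a, b), *_ = np.linalg.lstsq(design, targets, rcond=None)
    return float(a), float(b)

# Fit the overall situation history reported in the embodiment.
a_hat, b_hat = gm11_fit([0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334])
```

`np.linalg.lstsq` is used here instead of forming the normal equations explicitly; both give the same least squares estimate when the design matrix has full column rank.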

Step 2.5: use the gray prediction model to predict the security situation of the network system at the next moment, that is, obtain the predicted security situation value of the network system. Specifically:

Step 2.5.1: a symbol denotes the predicted value of the gray prediction model for the first-order accumulated generating sequence at the t-th moment.

When t = 1, this predicted value is as shown in formula (32).

When t > 1, the parameter estimates of the gray prediction model obtained in step 2.4.3 are substituted into formula (22) and the gray differential equation is solved to obtain formula (33).

Formulas (32) and (33) together give the predicted values of the gray prediction model for the first-order accumulated generating sequence at the t-th moment, as shown in equation set (34).

According to formula (24), taking the difference of the predicted values at adjacent moments in equation set (34) gives equation set (35), which expresses the predicted value of the gray prediction model for the initial sequence X⁽⁰⁾ at the t-th moment.

Through the operation of step 2.5, the gray prediction model shown in equation set (35) is obtained.
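The prediction step can be sketched in Python under stated assumptions. Formulas (22) and (33) are not reproduced in this text, so the sketch assumes the standard first-order form dx⁽¹⁾/dt + a·x⁽¹⁾ = b with x⁽¹⁾(1) = x⁽⁰⁾(1), which is consistent with formula (28). The function name `gm11_predict` is hypothetical, and a must be non-zero.

```python
import numpy as np

def gm11_predict(x0_first, a, b, n):
    """Predicted initial-sequence values for moments 1..n.

    Assumes the accumulated sequence follows dx1/dt + a*x1 = b
    with x1(1) = x0(1); requires a != 0.
    """
    k = np.arange(n)                                   # k = t - 1
    x1_hat = (x0_first - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.empty(n)
    x0_hat[0] = x1_hat[0]                              # at t = 1 this equals x0(1)
    x0_hat[1:] = np.diff(x1_hat)                       # differencing, as in formula (24)
    return x0_hat

# Illustrative parameters (not from the patent).
pred = gm11_predict(0.15, a=-0.05, b=0.2, n=4)
```

Predicting the next moment amounts to calling the function with n one larger than the number of observed moments and reading the last element.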

Step 2.6: check the accuracy of the gray prediction model. Specifically:

The predicted sequence of the initial sequence X⁽⁰⁾ is obtained from equation set (35).

To evaluate the accuracy of the gray prediction model, the predicted sequence is compared with the initial sequence X⁽⁰⁾ using formula (36) to obtain the accuracy of the prediction model, denoted rel.

If rel > 0.9, the prediction result of the gray prediction model is considered credible.

Step three: determine the vulnerability node in the network system.

The gray prediction model obtained in step two is used to predict the situation of the network system, giving predicted situation values at different time points. The vulnerability nodes in the network system are then analyzed in real time according to the predicted situation values, finally giving the vulnerability node of the network system. The specific steps are:

Step 3.1: obtain the set of hosts that have been attacked at the t-th moment (the current moment) and the set of hosts that may be attacked at the (t+1)-th moment (the next moment) but have not yet been attacked. Specifically:

Step 3.1.1: let O denote the set of all hosts in the network system, O = (o₁, o₂, …, o_h); o_h is the h-th host in the network system.

Step 3.1.2: obtain the reachability relations between the hosts in the network system from the topology of the network system and the network routing table, and establish an inter-host reachability information table. Each entry of the table includes a source host and a destination host.

Step 3.1.3: let D denote the set of attacked hosts in the network system at the t-th moment, D = (d₁, d₂, …, d_y), 1 ≤ y ≤ h; d_y is the y-th attacked host in the network system.

Step 3.1.4: let P denote the set of hosts in the network system that may be attacked at the (t+1)-th moment but have not yet been attacked, P = (p₁, p₂, …, p_z), z = 1, 2, …, z′, z′ ≤ h - y.
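The text does not spell out how the set P is derived from the reachability table. The Python sketch below assumes one plausible reading: a host belongs to P if it is the destination of a table entry whose source is an attacked host and it has not itself been attacked. The function name `candidate_hosts` and the host names are hypothetical.

```python
def candidate_hosts(reachable, attacked):
    """Hosts reachable in one hop from an attacked host and not yet attacked.

    `reachable` is a list of (source host, destination host) pairs,
    mirroring the two columns of the reachability information table.
    """
    attacked = set(attacked)
    candidates = set()
    for src, dst in reachable:
        if src in attacked and dst not in attacked:
            candidates.add(dst)
    return sorted(candidates)

# Hypothetical reachability table and attacked set D.
table = [("proxy", "web1"), ("proxy", "web2"),
         ("web1", "db1"), ("web2", "db1")]
P = candidate_hosts(table, attacked={"proxy", "web1"})
```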

Step 3.1.5: obtain from the network system log the request access rate of each host relative to the whole network system, denoted P(h). Then calculate by formula (37), for the (t+1)-th moment, the conditional probability that host p_z is attacked given that host d_y has been attacked, denoted P(p_z|d_y).

Here P(d_y|p_z) denotes the conditional probability that host d_y is attacked given that host p_z is attacked; P(p_z) denotes the prior probability that host p_z is attacked; P(d_y|~p_z) denotes the conditional probability that host d_y is attacked given that host p_z is not attacked; P(~p_z) denotes the prior probability that host p_z is not attacked.
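The four quantities listed above are the ingredients of Bayes' rule with a total-probability denominator, so a calculation of this kind can be sketched in Python as follows. Formula (37) itself is not reproduced in this text, so treating it as this form is an assumption; the function name and the numeric inputs are illustrative.

```python
def posterior_attack_probability(p_d_given_p, p_prior, p_d_given_not_p):
    """P(p_z | d_y) by Bayes' rule.

    p_d_given_p     : P(d_y | p_z)
    p_prior         : P(p_z)
    p_d_given_not_p : P(d_y | ~p_z)
    """
    p_not = 1.0 - p_prior                     # P(~p_z)
    evidence = p_d_given_p * p_prior + p_d_given_not_p * p_not
    if evidence == 0:
        raise ValueError("P(d_y) is zero; the conditional probability is undefined")
    return p_d_given_p * p_prior / evidence

# Illustrative probabilities only.
post = posterior_attack_probability(0.8, 0.3, 0.2)
```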

Step 3.2: obtain by formula (38) the increment of the network system situation at the (t+1)-th moment relative to the t-th moment caused by host p_z, denoted Δ.

Δ = S_(t+1) - S_t    (38)

Here S_(t+1) denotes the network security situation at the (t+1)-th moment and S_t denotes the network security situation at the t-th moment.

Step 3.3: perform correlation analysis between the predicted increment and the host situation of each host p_z to obtain the relevance of each p_z. Specifically:

Step 3.3.1: obtain the security situation matrix of the hosts that may be attacked at the (t+1)-th moment, denoted SP, as shown in formula (39).

Here s_t(1), s_t(2), …, s_t(z) denote the situation values of the 1st, 2nd, …, z-th hosts, respectively.

Step 3.3.2: calculate the relevance between formula (38) and formula (39).

Take Δ obtained by formula (38) as the reference sequence and the terms s_t(z) of SP obtained by formula (39) as the comparison sequences. The relevance of the comparison sequence SP with respect to the reference sequence Δ is obtained by formula (40).

Here r(z) denotes the relevance of host p_z to Δ.

Let R = [r₁, r₂, …, r_z]^T denote the relevance to Δ of the hosts p_z not yet marked as attacked. R indicates the possibility that each such host causes the network situation change Δ at the (t+1)-th moment: the magnitude of r_z indicates how likely the host is to cause Δ, and a larger r_z indicates a greater probability. The host corresponding to the largest r_z is taken as the most vulnerable node at the (t+1)-th moment, which realizes the prediction from the overall, continuous-time network situation to a discrete, space-based vulnerable host node.
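Once the relevance vector R is available, selecting the predicted node is a simple maximum search. The Python sketch below shows only this final selection; the relevance values are illustrative, the host names and the function name `most_vulnerable_host` are hypothetical, and the computation of R by formula (40) is not covered.

```python
def most_vulnerable_host(hosts, relevance):
    """Return the host with the largest relevance r_z, together with that value."""
    if len(hosts) != len(relevance) or not hosts:
        raise ValueError("hosts and relevance must be non-empty and of equal length")
    best = max(range(len(hosts)), key=lambda i: relevance[i])
    return hosts[best], relevance[best]

# Illustrative candidate set P and relevance vector R.
node, score = most_vulnerable_host(["web2", "db1", "db2"], [0.61, 0.83, 0.47])
```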

Step 3.4: calculate the conditional probability of each host in the set P by formula (37); the conditional probability ranking result is denoted U, as shown in formula (41).

U = (u₁, u₂, …, u_z)    (41)

The ranking result of formula (41) is compared with the vulnerability node prediction result R = [r₁, r₂, …, r_z]^T for the set P, and the symbol l denotes the number of rankings in R that agree with the conditional probability order.

The accuracy of the mapping method, denoted ul, can be verified by formula (42).

Advantageous effects

Compared with the prior art, the active prediction method of the network vulnerability node based on the gray model has the advantage that the network vulnerability node can be more accurately predicted.

Drawings

FIG. 1 is a diagram of a network system architecture in accordance with an embodiment of the present invention;

FIG. 2 is a diagram of eigenvalues and eigenvectors of matrix B in an embodiment of the present invention;

FIG. 3 is a diagram of the results of a network security situation gray model determined using the least squares method in an embodiment of the present invention;

FIG. 4 is a block diagram of an overall system of network security situation impact characteristic indicators in accordance with an embodiment of the present invention;

FIG. 5 is a diagram of eigenvalues and eigenvectors for matrix B in an embodiment of the present invention.

Detailed Description

The following embodiments are described in detail with reference to the above technical solutions.

In this embodiment, the network system has 6 hosts, and the network structure is shown in FIG. 1. FIG. 1 depicts the topology of the simulated network system environment, which consists mainly of two parts: the network main body and the backup nodes. The part connected by solid lines is the device reachability network from the external network to the internal network of the network system. The environment sets a proxy server as the system boundary to isolate the internal and external networks, so the proxy server is the first barrier controlling external access. Two web servers and database servers are then provided, respectively, to serve simple web requests and their data support. The dashed frames in the figure mark the hot backup nodes of the corresponding devices, and a dashed connection indicates that a node serves as the hot backup of the corresponding device for security policy transfer. Each host node in the figure is described in Table 1.

Table 1 network system node description table

The method provided by the invention is used to predict the vulnerability node in the network. The specific implementation steps are as follows:

Step 1.2.1: calculate the mean value of the observation data of each security situation feature item over all hosts in the network system at each moment to obtain the observation matrix A, as shown in formula (43).

Here i = 1, 2, …, 11 and t = 6; j = 2, 3, …, 11; T is the total number of time points taken.

Step 1.3.1: the host situation values of each host in the network system at different time points form the host situation calculation matrix B, as shown in formula (44).

The eigenvalues and eigenvectors of matrix B are shown in FIG. 2.


The inter-host relevance matrix H′ is a non-negative symmetric matrix and, by the properties of such matrices, has a maximum-modulus eigenvalue, denoted λ′, such that λ′C′ = H′C′, where λ′ is non-negative and C′ is an eigenvector. The eigenvalues and eigenvectors of the relevance matrix H′ are computed with MATLAB, and the eigenvector corresponding to the maximum-modulus eigenvalue λ′ is denoted E, E ∈ C′, E = [e₁, e₂, …, e_h]^T; e_h denotes the importance of the h-th host in the network, h = 1, 2, 3, …, 6.


In this embodiment, an expression shown in formula (45) is obtained.


Step 2.4.3: solve formula (31) by the least squares method to obtain the parameter estimates of the gray prediction model. The result of determining the gray model of the network security situation by the least squares method is shown in FIG. 3.


Through the operations of the above steps, the present embodiment is completed.

Step one: generate a network model and obtain the filtering of the network model and a service list. Specifically:

Step 1.1: define the security situation features of the known network system. The simulated network structure of the network system is shown in FIG. 1.


Define a three-dimensional vector S = <W, V, R>, where W is the operation dimension index of the running network and represents the condition of system operation within a certain time; V is the vulnerability dimension index of the network and represents the vulnerability condition of the system as found by a scanning tool; R is the abnormal dimension index of the network and represents the abnormal behaviors, such as network attacks and misoperation, that occur in the network within a certain time. A block diagram of the overall system of network security situation impact characteristic indicators is shown in FIG. 4.

Step 1.2: the observation matrix A of feature mean values of the network system is obtained, as shown in expression (3.1).

Processing matrix A gives the relevance of each feature with respect to feature 1, as shown in expression (3.2):

ω_f = [1 0.451 0.516 0.445 0.759 0.446 0.446 0.748 0.746 0.631 0.685]^T    (3.2)

The reference is first converted from system security feature 1 to the system as a whole; the eigenvalues and eigenvectors then extracted for each index are shown in FIG. 2.

The relevance of the system security features in the global scope is therefore:

ω_f = [0.245 0.288 0.311 0.304 0.307 0.304 0.304 0.309 0.309 0.315 0.314]^T

Step 1.3: similarly, the single-host situation matrix B, formed by the single-host situations of the system hosts at the different observation times, is obtained as shown in expression (3.3).

The eigenvalues and eigenvectors extracted from matrix B are shown in FIG. 5. Selecting the eigenvector corresponding to the maximum eigenvalue gives the vector ω_k, as shown in expression (3.4).

ω_k = [0.346 0.424 0.390 0.425 0.428 0.430]^T    (3.4)

Step 1.4: Using the weights among the network security features determined in step 1.2 and the weights among the host nodes of the network system determined in step 1.3, fuse the observed values of the sub-situation features, which are scattered across the dimensions, at the multiple observation times.

Combining the historical observation data of the network system with the results calculated above gives the historical situation data of the network at the observation times: S_j = (0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334).

Step two: establish a gray prediction model of the system from the historical situation values of the network system, and obtain the situation trend of the system at the next, unknown moment.

Step 2.1: Define the initial sequence X⁽⁰⁾ of the gray model. Entering the historical results of step one in time order gives X⁽⁰⁾ = (0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334), where n is the number of historical observations (here n = 6).

Step 2.2: Accumulate the initial sequence X⁽⁰⁾ = (0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334) item by item to form the generated sequence X⁽¹⁾ = (0.1541, 0.3443, 0.5562, 0.7789, 0.9829, 1.2163).
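The item-by-item accumulation of step 2.2 can be reproduced with a short sketch. The function name `first_order_accumulation` is illustrative only and does not come from the patent; values are rounded to four decimals to match the precision of the text.

```python
from itertools import accumulate

# Historical situation values from step one (the initial sequence X(0)).
x0 = [0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334]


def first_order_accumulation(sequence):
    """Return the running sums of `sequence`, i.e. the generated sequence X(1)."""
    return list(accumulate(sequence))


# Round to four decimals, the precision used in the text.
x1 = [round(value, 4) for value in first_order_accumulation(x0)]
print(x1)
```

The last item of the accumulated sequence equals the sum of all historical situation values, which gives a quick sanity check on the input data.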

Step 2.3: Establish a first-order gray differential equation for the generated sequence X⁽¹⁾:

x⁽⁰⁾(k+1) = -az⁽¹⁾(k+1) + b, k = 1, 2, …, n-1

Substituting the values of the initial sequence X⁽⁰⁾ = (0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334) and the values z⁽¹⁾ = (0.24920, 0.45025, 0.66755, 0.88090, 1.0996) derived from the generated sequence (each the mean of two adjacent items of X⁽¹⁾) into the gray differential equation gives a system of gray differential equations, written as a matrix equation in expression (3.15).
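A minimal sketch of how the z⁽¹⁾ values can be derived from X⁽¹⁾, assuming the usual GM(1,1) background value z⁽¹⁾(k+1) = (x⁽¹⁾(k) + x⁽¹⁾(k+1))/2. The helper name `background_values` is hypothetical.

```python
# Generated (accumulated) sequence X(1) from step 2.2.
x1 = [0.1541, 0.3443, 0.5562, 0.7789, 0.9829, 1.2163]


def background_values(accumulated):
    """Mean of each pair of adjacent accumulated values (assumed GM(1,1) form)."""
    return [0.5 * (prev + cur) for prev, cur in zip(accumulated, accumulated[1:])]


z1 = background_values(x1)
print([round(value, 5) for value in z1])
```

Because X⁽¹⁾ is a running sum of positive situation values, every background value computed this way is positive and the sequence increases.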

Step 2.4: Determine the parameters of the model by the least squares method and establish the prediction model of the network security situation. The gray model of the network security situation determined by least squares is shown in Fig. 3.

Let

which gives

The predicted accumulated situation value at the 7th moment is

Restoring this to the original data gives the predicted value:

and the predicted sequence is obtained:
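The least squares step can be sketched as an ordinary linear regression of x⁽⁰⁾(k+1) on z⁽¹⁾(k+1), whose slope is -a and whose intercept is b. This is a sketch under that assumption, not the patent's own matrix computation. The function name `fit_gm11` is hypothetical, and the background values are assumed to be means of adjacent accumulated values.

```python
from itertools import accumulate


def fit_gm11(x0):
    """Least-squares estimate of (a, b) in x0(k+1) = -a * z1(k+1) + b."""
    x1 = list(accumulate(x0))
    # Assumed background values: mean of adjacent accumulated items.
    z1 = [0.5 * (prev + cur) for prev, cur in zip(x1, x1[1:])]
    y = x0[1:]
    n = len(y)
    z_mean = sum(z1) / n
    y_mean = sum(y) / n
    s_zz = sum((z - z_mean) ** 2 for z in z1)
    s_zy = sum((z - z_mean) * (v - y_mean) for z, v in zip(z1, y))
    slope = s_zy / s_zz  # the regression slope equals -a
    return -slope, y_mean - slope * z_mean


x0 = [0.1541, 0.1902, 0.2119, 0.2227, 0.204, 0.2334]
a, b = fit_gm11(x0)
print(f"a = {a:.4f}, b = {b:.4f}")
```

For a two-parameter model the closed-form regression gives the same estimate as solving the normal equations of the matrix form, so no linear algebra library is needed.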

Step 2.5: Gray prediction accuracy check. Once the fitting parameters have been determined by least squares, the gray prediction model of the network security situation can also compute results for the historical period. The prediction model gives the predicted sequence:

To evaluate the fitting accuracy of the prediction model, the predicted sequence X̂⁽⁰⁾ is compared with the original sequence X⁽⁰⁾ using the Euclidean formula, that is, using the squared differences between corresponding items of X̂⁽⁰⁾ and X⁽⁰⁾:

The accuracy of the model was found to be 94.6%.

Step three: spatial mapping of the vulnerability nodes in network space. The prediction produced by the time-based prediction model is mapped onto the network device nodes in space.

Step 3.1: Obtain the reachability extension of the network devices. Define P_m = (p₁, p₂, …, p_b). In the simulation environment, assume that the simulated attack request has already marked all services ahead of the DB servers. According to the network topology, the hosts corresponding to P_m, namely host p₁ (10.1.112.124) and host p₂ (10.1.112.125) and the corresponding backup nodes p₃ (10.1.112.126) and p₄ (10.1.112.127), have not yet been accessed and are candidates for the next access. The system security feature observations of hosts p₁ to p₄ at this time are shown in Table 2.

Table 2. System security feature observations of the hosts to be accessed

Step 3.2: Situation comparison. Using correlation analysis, the situation change that each reachable device node would produce in the next period is compared one by one with the predicted situation value, giving the correlation analysis results between the predicted nodes and the current network situation shown in Table 3.

Table 3. Correlation analysis of the predicted nodes and the current network situation

The situation calculation

gives

R_i = (r₁, r₂, r₃, r₄) = (0.6854, 0.6617, 0.5226, 0.5773)

The result reflects the association between the change in network security situation that each host node would produce at the next moment and the change in the overall network security situation. The association degrees of the hosts corresponding to p₁, p₂, p₃ and p₄ with the predicted change in network security situation are R_i = (0.6854, 0.6617, 0.5226, 0.5773). According to the association rule, host p₁ (10.1.112.124) has the highest association degree (0.6854) and therefore the greatest influence on the network security situation, so this node is predicted to be the network device accessed next.

The result can be verified against the probability values inferred by the Bayesian network, shown in Table 4.

Table 4. Probabilities inferred by the Bayesian network for the attack request

The conditional probabilities in the table are calculated by the Bayesian network from the predecessor probabilities of the unexpanded nodes. The ordering of the Bayesian conditional probabilities (p1 > p2 > p4 > p3) is consistent with the predicted ranking of vulnerability-node likelihood (r1 > r2 > r4 > r3), so the effectiveness r of the method is 100%.
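The ranking comparison can be checked with a short sketch. It uses the association degrees R_i and the Bayesian ordering stated in the text. The agreement ratio (matching rank positions divided by the number of hosts) is an assumed reading of the effectiveness measure, since the formula itself is not reproduced here.

```python
# Association degrees from step 3.2 and the Bayesian ordering from Table 4.
association = {"p1": 0.6854, "p2": 0.6617, "p3": 0.5226, "p4": 0.5773}
bayes_order = ["p1", "p2", "p4", "p3"]

# Rank hosts by association degree, highest first.
predicted_order = sorted(association, key=association.get, reverse=True)

# Count rank positions where both orderings name the same host.
matches = sum(a == b for a, b in zip(predicted_order, bayes_order))
agreement = matches / len(bayes_order)  # assumed form of the effectiveness ratio

print(predicted_order, agreement)
```

The first entry of `predicted_order` is the host predicted to be accessed next.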

Claims

1. An active prediction method for network vulnerability nodes based on a gray model, characterized in that the specific operation steps are as follows:

step one, acquiring the network security situation feature items and calculating the network state; specifically:

step 1.1: determining the security situation feature items of the network system; the network security situation is described from top to bottom in three dimensions: the operation situation dimension, the vulnerability situation dimension and the abnormal situation dimension; the security situation feature items describing the operation situation dimension comprise: CPU utilization rate, memory utilization rate and disk reading rate; the security situation feature items describing the vulnerability situation dimension comprise: vulnerability type, vulnerability score, event type and identity authentication degree; the security situation feature items describing the abnormal situation dimension comprise: attack source number, attack time, attack frequency and equipment online state; the security situation feature items of the network system therefore comprise 11 items: CPU utilization rate, memory utilization rate, disk reading rate, vulnerability type, vulnerability score, event type, identity authentication degree, attack source number, attack time, attack frequency and equipment online state;

step 1.2: periodically acquiring, with a single host in the network system as the research object, observation data of its security situation feature items at different moments; calculating, for each moment, the mean value of the observation data of each security situation feature item over all hosts in the network system; and determining, by gray correlation analysis, the weight with which each security situation feature item is expressed under the global action of the whole network system, so as to determine the influence weight of each network security feature item on the global state of the network; the specific steps are as follows:

step 1.2.1: calculating, for each moment, the mean value of the observation data of each security situation feature item over all hosts in the network system to obtain an observation matrix A, as shown in formula (1);

wherein t denotes the t-th moment, t = 1, 2, 3, …; f_t(1), f_t(2), …, f_t(11) respectively denote the mean observed values of the 11 security situation feature items at the t-th moment;

step 1.2.2: performing dimensionless processing on the observation matrix A by formula (2) to obtain the dimensionless observation matrix A₁, as shown in formula (3);

wherein i = 1, 2, …, 11;

step 1.2.3: taking the first column vector of the dimensionless observation matrix A₁ as the observation vector and the other column vectors as comparison vectors; calculating the correlation coefficient of each item of each comparison vector by formula (4) to form a correlation coefficient matrix M, as shown in formula (5);

wherein j = 2, 3, …, 11;

step 1.2.4: obtaining the association degree between any two network security situation feature items by formula (6);

wherein k = 1, 2, …, 11; γ_(f(i),f(k)) denotes the association degree of the network security situation feature items f(i) and f(k); the value of γ_(f(i),f(1)) is calculated by formula (7); the value of γ_(f(k),f(1)) is calculated by formula (8);

wherein T is the total number of moments taken;
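Steps 1.2.2 to 1.2.4 follow the general pattern of gray relational analysis. The sketch below uses the common Deng formulation: mean-value normalization, a distinguishing coefficient rho = 0.5, and the association degree taken as the mean coefficient over the T moments. These are assumptions, because formulas (2), (4), (7) and (8) are not reproduced in this text. The function names and the sample matrix are illustrative.

```python
def normalize_columns(matrix):
    """Divide each column by its mean (one common dimensionless scheme)."""
    rows, cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / rows for j in range(cols)]
    return [[row[j] / means[j] for j in range(cols)] for row in matrix]


def relational_degrees(matrix, rho=0.5):
    """Association degree of every column with respect to column 0."""
    norm = normalize_columns(matrix)
    cols = len(norm[0])
    # Absolute difference between each column and the reference column.
    diffs = [[abs(row[j] - row[0]) for j in range(cols)] for row in norm]
    flat = [d for row in diffs for d in row[1:]]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * cols  # every column coincides with the reference
    degrees = [1.0]  # the reference column is fully associated with itself
    for j in range(1, cols):
        coeffs = [(d_min + rho * d_max) / (row[j] + rho * d_max) for row in diffs]
        degrees.append(sum(coeffs) / len(coeffs))  # mean over the T moments
    return degrees


# Rows are moments, columns are feature items (illustrative data only).
sample = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 6.0, 2.0]]
print(relational_degrees(sample))
```

In the sample, the second column is proportional to the first and therefore receives degree 1, while the third column, which moves differently, receives a lower degree.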

step 1.2.5: obtaining the association degree matrix M' among all network security situation feature items from the result of step 1.2.4, as shown in formula (9);

the association degree matrix M' is a nonnegative symmetric matrix and therefore, by the properties of nonnegative symmetric matrices, has an eigenvalue of maximum modulus, denoted by the symbol λ, such that λC = M'C, wherein λ is nonnegative and C is an eigenvector; the eigenvalues and eigenvectors of the association degree matrix M' are obtained with the nonnegative symmetric matrix eigenvalue and eigenvector extraction tool of matlab; the eigenvector corresponding to the maximum-modulus eigenvalue λ is denoted by the symbol W, W ∈ C, W = [ω₁, ω₂, …, ω₁₁]^T, wherein ω_i denotes the influence weight of the i-th network security feature item on the global state of the network, i = 1, 2, …, 11;

through the operation of this step, the influence weight of each network security feature item on the global state of the network is obtained;
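The text obtains the dominant eigenvector with matlab. As an illustration only, the same vector can be approximated by power iteration, a different technique swapped in here so the sketch stays dependency-free. It assumes the maximum-modulus eigenvalue is strictly larger in modulus than all others. The function name is hypothetical.

```python
import math


def dominant_eigenpair(matrix, iterations=1000, tol=1e-12):
    """Approximate the maximum-modulus eigenvalue and its unit eigenvector."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # positive start vector suits nonnegative matrices
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            raise ValueError("matrix maps the start vector to zero")
        new_vec = [v / norm for v in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        eigenvalue = sum(
            new_vec[i] * sum(matrix[i][j] * new_vec[j] for j in range(n))
            for i in range(n)
        )
        converged = max(abs(p - q) for p, q in zip(new_vec, vec)) < tol
        vec = new_vec
        if converged:
            break
    return eigenvalue, vec


lam, weights = dominant_eigenpair([[2.0, 1.0], [1.0, 3.0]])
print(lam, weights)
```

The returned vector has unit Euclidean length, which matches the normalization of the weight vectors ω_f and ω_k reported in the embodiment.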

step 1.3: acquiring observation data of all network security feature items of all hosts in the network system at different moments; obtaining the single-host situation exhibited by each host at each moment from the influence weights of the network security feature items on the global state of the network obtained in step 1.2; and obtaining the importance proportions among the hosts in the network system by gray correlation analysis;

step 1.3.1: forming a host situation calculation matrix B from the host situation values of all hosts in the network system at different moments, as shown in formula (10);

wherein h denotes the h-th host, h = 1, 2, 3, …; s_t(1), s_t(2), …, s_t(h) respectively denote the host situation values of the 1st, 2nd, …, h-th hosts in the network system at the t-th moment, calculated by formula (11);

wherein s_t(h) denotes the host situation value of the h-th host at the t-th moment; f_th(i) denotes the observed value of the i-th network security situation feature item f(i) of the h-th host at moment t;

step 1.3.2: performing dimensionless processing on the single-host situation matrix B as shown in formula (12) to obtain the dimensionless single-host situation calculation matrix B₁, as shown in formula (13);

step 1.3.3: taking the first column vector of the dimensionless single-host situation calculation matrix B₁ as the observation vector and the other column vectors as comparison vectors; calculating the correlation coefficient of each item of each comparison vector by formula (14) to form a correlation coefficient matrix H, as shown in formula (15);

wherein m = 1, 2, 3, …;

step 1.3.4: obtaining the association degree between any two hosts by formula (16);

wherein q = 1, 2, 3, …; γ_(h(m),h(q)) denotes the association degree of the network hosts h(m) and h(q); the value of γ_(h(m),h(1)) is calculated by formula (17); the value of γ_(h(q),h(1)) is calculated by formula (18);

step 1.3.5: obtaining the association degree matrix H' among the network hosts from the result of step 1.3.4, as shown in formula (19);

the association degree matrix H' among the hosts is a nonnegative symmetric matrix and therefore, by the properties of nonnegative symmetric matrices, has an eigenvalue of maximum modulus, denoted by the symbol λ', such that λ'C' = H'C', wherein λ' is nonnegative and C' is an eigenvector; the eigenvalues and eigenvectors of the association degree matrix H' are extracted with matlab; the eigenvector corresponding to the maximum-modulus eigenvalue λ' is denoted by the symbol E, E ∈ C', E = [e₁, e₂, …, e_h]^T, wherein e_h denotes the importance of the h-th host in the network, h = 1, 2, 3, …;

through the operation of this step, the importance weight of each network host in the overall network situation is obtained;

step 1.4: obtaining, by formula (20), the historical data of the overall situation of the network system from the influence weights of the network security feature items on the global state of the network obtained in step 1.2 and the importance weights of the hosts in the network system obtained in step 1.3; the historical data are denoted by the symbol S, S = (S₁, S₂, …, S_t);

S_t = ∑_h e_h × s_t(h) (20)
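The fusion of formula (20) is a weighted sum over hosts. The sketch below also assumes that the host situation value of formula (11) is a weighted sum of the feature observations, s_t(h) = ∑_i ω_i × f_th(i); that form is an assumption, because formula (11) is not reproduced in this text. All names and numbers are illustrative.

```python
def host_situation(feature_weights, observations):
    """Assumed form of formula (11): weighted sum of one host's feature values."""
    return sum(w * f for w, f in zip(feature_weights, observations))


def overall_situation(host_weights, host_values):
    """Formula (20): S_t is the sum over hosts h of e_h * s_t(h)."""
    return sum(e * s for e, s in zip(host_weights, host_values))


# Two feature items and two hosts at a single moment (illustrative data only).
feature_weights = [0.5, 0.5]
observations = [[0.2, 0.4], [0.6, 0.8]]
host_weights = [0.6, 0.8]

host_values = [host_situation(feature_weights, obs) for obs in observations]
S_t = overall_situation(host_weights, host_values)
print(host_values, S_t)
```

Repeating the calculation for each observation moment yields the historical sequence S = (S₁, S₂, …, S_t) used as input to the gray model.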

step two, establishing a gray prediction model of the network system from the historical data of its overall situation, the gray prediction model being used to predict the situation of the network system at the next moment; specifically:

step 2.1: denoting the initial sequence of the gray model by the symbol X⁽⁰⁾, X⁽⁰⁾ = (x⁽⁰⁾(1), x⁽⁰⁾(2), ..., x⁽⁰⁾(t)); wherein x⁽⁰⁾(1), x⁽⁰⁾(2), ..., x⁽⁰⁾(t) respectively denote the overall situation values of the network at the 1st, 2nd, … and t-th moments; taking the historical data S of the overall situation of the network system obtained in step one as the initial sequence X⁽⁰⁾: x⁽⁰⁾(1) = S₁, x⁽⁰⁾(2) = S₂, ..., x⁽⁰⁾(t) = S_t;

step 2.2: calculating the first-order accumulated generating sequence of the initial sequence X⁽⁰⁾ by formula (21), denoted by the symbol X⁽¹⁾; X⁽¹⁾ = {x⁽¹⁾(1), x⁽¹⁾(2), ..., x⁽¹⁾(t)}, wherein x⁽¹⁾(1), x⁽¹⁾(2), ..., x⁽¹⁾(t) respectively denote the sums of the overall situation of the network from the 1st moment to the 1st moment, from the 1st moment to the 2nd moment, …, and from the 1st moment to the t-th moment;

x⁽¹⁾(t) = ∑_(i=1)^(t) x⁽⁰⁾(i) (21)

the first-order accumulated generating sequence is calculated in order to weaken the randomness and relevance of the original data items and to model the overall trend from a global perspective, forming a trend model with which situation changes can be understood and predicted;

step 2.3: the first-order accumulated generating sequence X⁽¹⁾ results from accumulating an irregular historical data sequence, which turns it into a rising sequence with an exponential growth law; the process of calculating the first-order accumulated generating sequence is similar in form to the gray differential process of a first-order model; a first-order differential equation is therefore established for the first-order accumulated generating sequence X⁽¹⁾, as shown in formula (22);

wherein a and b are the parameters to be determined by the system, both taking real values;

integrating formula (22) as shown in formula (23) and discretizing;

wherein t′ = 1, 2, …, t-1;

the relationship shown in formula (24) is obtained from formula (21);

x⁽¹⁾(t′+1)-x⁽¹⁾(t′)＝x⁽⁰⁾(t′+1)； (24)

obtaining a general formula of the gray prediction model from formula (23) to formula (24), as shown in formula (25);

solving the integral term in the formula (25) to obtain a formula (26);

by the symbol z⁽¹⁾(t' +1) represents the solution of the integral term of equation (25) to obtain equation (27);

substituting the formula (27) into the formula (25) to obtain a formula (28);

x⁽⁰⁾(t′+1)＝-az⁽¹⁾(t′+1)+b (28)
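The chain from the first-order differential equation to formula (28) can be written out as follows. This is the standard GM(1,1) derivation and is assumed, not confirmed, to correspond to formulas (22) to (27), which are not reproduced in this text. The integral term is approximated by the trapezoidal rule.

```latex
\begin{align}
\frac{dx^{(1)}}{dt} + a\,x^{(1)} &= b \\
x^{(1)}(t'+1) - x^{(1)}(t') + a\int_{t'}^{t'+1} x^{(1)}\,dt &= b \\
x^{(0)}(t'+1) + a\int_{t'}^{t'+1} x^{(1)}\,dt &= b \\
\int_{t'}^{t'+1} x^{(1)}\,dt \approx \tfrac{1}{2}\left[x^{(1)}(t') + x^{(1)}(t'+1)\right] &= z^{(1)}(t'+1) \\
x^{(0)}(t'+1) &= -a\,z^{(1)}(t'+1) + b
\end{align}
```

The third line uses the relationship of formula (24), and the last line is formula (28).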

substituting the terms of the first-order accumulated generating sequence X⁽¹⁾ and the initial sequence X⁽⁰⁾ into formula (28) and rearranging gives expression (29);

step 2.4: confirming the values of the parameters a and b in the formula (29) by using a least square method, and establishing a prediction model of the network security situation; the method specifically comprises the following steps:

step 2.4.1: setting 3 alternative parameter vectors, denoted respectively by the symbols

Y_t and G; their values are shown in formula (30);

step 2.4.2: substituting formula (30) into formula (29) to obtain formula (31);

step 2.4.3: solving a formula (31) through a least square method to obtain parameter estimation of a gray prediction model;

step 2.5: predicting the security situation of the network system at the next moment with the gray prediction model, that is, obtaining the predicted security situation value of the network system; specifically:

step 2.5.1: the symbol x̂⁽¹⁾(t) denotes the predicted value of the first-order accumulated generating sequence given by the gray prediction model at the t-th moment;

when t = 1, the result of x̂⁽¹⁾(1) is shown in formula (32);

when t > 1, substituting the parameter estimates of the gray prediction model obtained in step 2.4.3 into formula (22) and solving the gray differential equation gives formula (33);

formula (32) and formula (33) together give the predicted value of the first-order accumulated generating sequence at moment t, as shown in equation set (34);

taking the difference of x̂⁽¹⁾(t) and x̂⁽¹⁾(t-1) in equation set (34) according to formula (24) gives equation set (35);

wherein x̂⁽⁰⁾(t) denotes the value of the initial sequence X⁽⁰⁾ predicted by the gray prediction model at moment t;

through the operation of step 2.5, the gray prediction model shown in formula (35) is obtained;
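A sketch of the prediction step, assuming the standard GM(1,1) time response x̂⁽¹⁾(t) = (x⁽⁰⁾(1) - b/a)·e^(-a(t-1)) + b/a for t > 1 and restoration by differencing. Formulas (32) to (35) are not reproduced in this text, so this form is an assumption. The function names and the parameter values in the example are illustrative only.

```python
import math


def predict_accumulated(first_value, a, b, t):
    """Assumed time response of the accumulated sequence at moment t (a != 0)."""
    if t == 1:
        return first_value
    return (first_value - b / a) * math.exp(-a * (t - 1)) + b / a


def predict_original(first_value, a, b, t):
    """Restore the original-sequence prediction by differencing adjacent moments."""
    if t == 1:
        return first_value
    return predict_accumulated(first_value, a, b, t) - predict_accumulated(
        first_value, a, b, t - 1
    )


# Illustrative parameters only; in practice a and b come from least squares.
print(predict_original(2.0, -0.1, 1.0, 3))
```

Summing the restored predictions from moment 1 to moment t reproduces the accumulated prediction at moment t, which is a useful consistency check on an implementation.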

step 2.6: checking the precision of the gray prediction model; specifically:

the predicted sequence given by the gray prediction model is denoted by the symbol X̂⁽⁰⁾;

X̂⁽⁰⁾ is compared with the initial sequence X⁽⁰⁾ to obtain the accuracy of the prediction model, denoted by the symbol rel;

if rel is greater than 0.9, the prediction result of the gray prediction model is considered credible;
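The expression for rel is not reproduced in this text. The sketch below therefore assumes one plausible form, one minus the Euclidean distance between the two sequences relative to the Euclidean norm of the original sequence, and applies the 0.9 threshold. Both function names are hypothetical, and the formula should not be read as the patent's own definition.

```python
import math


def relative_accuracy(actual, predicted):
    """Assumed accuracy: 1 - ||predicted - actual|| / ||actual|| (Euclidean norms)."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    error = math.sqrt(sum((p - x) ** 2 for x, p in zip(actual, predicted)))
    scale = math.sqrt(sum(x * x for x in actual))
    if scale == 0.0:
        raise ValueError("actual sequence must not be all zeros")
    return 1.0 - error / scale


def is_credible(rel, threshold=0.9):
    """Apply the credibility rule: rel must be strictly greater than 0.9."""
    return rel > threshold


print(is_credible(relative_accuracy([3.0, 4.0], [3.0, 4.0])))
```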

step three, determining a vulnerability node in the network system;

predicting the situation of the network system with the gray prediction model obtained in step two to obtain predicted situation values of the network system at different moments; analyzing the vulnerability nodes in the network system in real time from the predicted situation values to finally obtain the vulnerability nodes of the network system; the specific steps are as follows:

step 3.1: obtaining the set of hosts attacked at the t-th moment, that is, the current moment, and the set of hosts that may be attacked at the (t+1)-th moment but have not yet been attacked; specifically:

step 3.1.1: denoting the set of all hosts in the network system by the symbol O, O = (o₁, o₂, …, o_h); o_h is the h-th host in the network system;

step 3.1.2: obtaining the reachability relations among all hosts in the network system from the topological structure of the network system and the network routing table, and establishing an inter-host reachability information table; the inter-host reachability information table comprises a source host and a destination host;

step 3.1.3: denoting by the symbol D the set of hosts in the network system attacked at moment t,

d_y is the y-th attacked host in the network system;

step 3.1.4: denoting by the symbol P the set of hosts in the network system that may be attacked at moment t+1 but have not yet been attacked,
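One way to read steps 3.1.2 to 3.1.4 is that P contains every destination host reachable from an attacked host that is not itself in D. The sketch below encodes that reading as an assumption, since the defining expression for P is not reproduced in this text. The host names and the function name are hypothetical.

```python
def candidate_hosts(reachable_pairs, attacked):
    """Hosts reachable from an attacked host that are not yet attacked (assumed P)."""
    attacked = set(attacked)
    return sorted(
        {dst for src, dst in reachable_pairs if src in attacked and dst not in attacked}
    )


# Inter-host reachability table as (source host, destination host) pairs.
reachability = [
    ("proxy", "web1"),
    ("proxy", "web2"),
    ("web1", "db1"),
    ("web2", "db2"),
    ("db1", "db1_backup"),
]
D = ["proxy", "web1", "web2"]

P = candidate_hosts(reachability, D)
print(P)
```

In this example the backup host is not a candidate yet, because its only source host has not been attacked.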

step 3.1.5: obtaining from the network system log the request access rate of each host relative to the whole network system, denoted by the symbol P(h); then calculating by formula (37) the conditional probability that host p_z is attacked at the (t+1)-th moment given that host d_y has been attacked, denoted by the symbol P(p_z|d_y);

wherein P(d_y|p_z) denotes the conditional probability that host d_y is attacked given that host p_z is attacked; P(p_z) denotes the prior probability that host p_z is attacked; P(d_y|~p_z) denotes the conditional probability that host d_y is attacked given that host p_z is not attacked; P(~p_z) denotes the prior probability that host p_z is not attacked;
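The four quantities listed above are exactly the inputs of Bayes' theorem with the evidence expanded by the law of total probability, so formula (37) presumably has that form. The sketch below is written under that assumption. The function name and the sample probabilities are illustrative.

```python
def posterior_attack_probability(p_d_given_p, prior_p, p_d_given_not_p):
    """Bayes' theorem: P(p_z | d_y) from P(d_y | p_z), P(p_z) and P(d_y | ~p_z)."""
    numerator = p_d_given_p * prior_p
    # Law of total probability; P(~p_z) is 1 - P(p_z).
    evidence = numerator + p_d_given_not_p * (1.0 - prior_p)
    if evidence == 0.0:
        raise ValueError("the evidence probability is zero")
    return numerator / evidence


print(posterior_attack_probability(0.8, 0.5, 0.2))
```

Sorting the hosts in P by this posterior, highest first, gives an ordering that can be compared with the ranking by association degree.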

step 3.2: obtaining by formula (38) the increment of the network system situation at moment t+1 relative to moment t caused by host p_z, denoted by the symbol Δ;

Δ = S_(t+1) - S_t (38)

wherein S_(t+1) denotes the network security situation at moment t+1; S_t denotes the network security situation at moment t;

step 3.3.1: obtaining the security situation matrix of the hosts that may be attacked at moment t+1, denoted by the symbol SP, as shown in formula (39);

wherein s_t(1), s_t(2), …, s_t(z) are the state values of the 1st, 2nd, …, z-th hosts respectively;

step 3.3.2: calculating the association degree between formula (38) and formula (39);

taking Δ obtained by formula (38) as the reference sequence and the items s_t(z) of SP obtained by formula (39) as the comparison sequences; obtaining the association degree of the comparison sequence SP with the reference sequence Δ by formula (40);

wherein r(z) denotes the association degree of the situation of host p_z with Δ;

the symbol R = [r₁, r₂, ..., r_z]^T denotes the set of association degrees between Δ and the situations of the hosts p_z not yet marked by the attack; R indicates how likely each host not marked by the attack at moment t+1 is to cause the network situation change Δ; the magnitude of r_z indicates the strength of that likelihood, a larger r_z indicating a greater likelihood that the host causes Δ; the host corresponding to the largest r_z is taken as the most vulnerable node at moment t+1, which realizes the prediction from the overall, continuous, time-based network situation to the discrete, space-based vulnerable host node;

step 3.3: calculating the conditional probabilities of the hosts in the set P by formula (37), and denoting the ordering of the conditional probabilities by the symbol U, as shown in formula (41);

U = (u₁, u₂, ..., u_z) (41)

comparing the ordering of formula (41) with the vulnerability node prediction result R = [r₁, r₂, ..., r_z]^T for the set P, and denoting by the symbol l the number of positions at which the association degree ranking agrees with the conditional probability ordering;

the accuracy of the vulnerability node determination can be verified by formula (42) and is denoted by the symbol ul;