CN111193742A

CN111193742A - D-S evidence theory-based power communication network anomaly detection method

Info

Publication number: CN111193742A
Application number: CN201911410137.3A
Authority: CN
Inventors: 莫穗江; 高国华; 李瑞德; 王�锋; 张欣欣; 温志坤; 黄定威; 杨玺; 张欣; 汤铭华; 梁英杰; 廖振朝; 陈嘉俊; 李伟雄; 童捷; 张天乙
Original assignee: Guangdong Power Grid Co Ltd; Jiangmen Power Supply Bureau of Guangdong Power Grid Co Ltd
Current assignee: Guangdong Power Grid Co Ltd; Jiangmen Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-05-22

Abstract

The invention relates to the field of power communication networks, and in particular to a power communication network anomaly detection method based on D-S evidence theory. The method effectively improves the accuracy with which network anomalies and their types are identified in a power communication network, and, because real-time data are fused as they arrive, it also enables real-time detection of network anomalies.

Description

D-S evidence theory-based power communication network anomaly detection method

Technical Field

The invention relates to the field of power communication networks, in particular to a power communication network anomaly detection method based on a D-S evidence theory.

Background

As smart-grid research and practice advance, the traditional power grid is gradually merging with information communication systems and monitoring and control systems. The security of the power communication network is therefore closely tied to the operational safety of the grid, and is central to grid security.

A power communication network system is complex and dynamic, and is therefore vulnerable to some degree. Security incidents such as denial-of-service attacks, network scanning, network spoofing, viruses and Trojans, and information leakage occur constantly, and these internal and external security risks place great pressure on network security work. A power communication network anomaly detection technique is therefore needed that accurately identifies network security anomalies and determines in real time whether the network is in an abnormal state, so that staff can handle problems promptly.

Among existing network anomaly detection methods, one approach evaluates network traffic jointly over several features on the basis of D-S evidence theory and introduces an adaptive mechanism to maintain detection accuracy. Its drawback is that it cannot take host data characteristics into account, which can reduce the accuracy of the judgment.

Disclosure of Invention

To overcome the low accuracy of network anomaly detection in the prior art, the invention provides a power communication network anomaly detection method based on D-S evidence theory. The method effectively improves the accuracy with which network anomalies and their types are identified in the power communication network, and achieves real-time anomaly detection by fusing real-time data as they arrive.

To solve the above technical problem, the invention adopts the following technical solution: a power communication network anomaly detection method based on D-S evidence theory, comprising the following steps:

step one: selecting the features that influence network anomalies from the network connection state data collected in the power communication network, and preprocessing the data;

step two: determining the BBA (basic probability assignment, also called the basic probability distribution: the probability mass assigned to each possible proposition in the identification frame) in the identification frame of D-S evidence theory by a clustering method based on the K-means algorithm;

step three: determining the identification frame by using an expert system;

step four: performing fusion and decision-making using the Dempster combination rule of D-S evidence theory.

Preferably, in step one, network key information data covering a fixed length of time are selected from the original records of network key information collected in the power communication network; the selected data are then cleaned by removing records that contain missing values.

Preferably, the key information data comprise three kinds of information: traffic information, operation information, and alarm information from network protection devices. The traffic information comprises the inbound and outbound traffic volume of each network node. The operation information comprises the total number of services running on each host, and the average access volume and access frequency of each service running on the host. The alarm information from network protection devices comprises the alarm identifier, attack frequency, source address, destination address, source port, and destination port. In normal operation the traffic of the power grid is stable (except during specific time periods), so abnormal traffic fluctuations can be used to judge whether a DDoS attack is under way. Likewise, the total number of services on a host is stable in the normal state, so fluctuations in it can be used to detect whether a backdoor or malicious program has been implanted on the host. The collected alarm information makes it possible to determine the attack type and attack source. Together these provide data support for anomaly detection.
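The window selection and cleaning described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout (a dict with a timestamp key `t`), the function names, and the choice to treat both `None` and NaN as missing are assumptions; the indicator abbreviations NI, NO, SN, VN, VR and AF are the ones used in the embodiment.

```python
import math

# Indicator abbreviations follow the embodiment; the record layout is hypothetical.
FIELDS = ("NI", "NO", "SN", "VN", "VR", "AF")


def is_missing(value):
    """Treat None and NaN as missing values (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def select_and_clean(records, start, duration):
    """Keep records with start <= t < start + duration and no missing indicator."""
    end = start + duration
    return [
        rec for rec in records
        if start <= rec["t"] < end
        and not any(is_missing(rec.get(name)) for name in FIELDS)
    ]
```

Dropping incomplete records, rather than imputing values, matches the text's "removing records that contain missing values".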

Preferably, in step two, the clustering intervals and the clustering feature similarity are calculated as follows:

S1: based on the K-means algorithm, the basic model of a clustering feature interval is [c, r], where c is the cluster center and r is the cluster radius;

S2: the clustering feature similarity is:

let F₁: [c₁, r₁] and F₂: [c₂, r₂] be two focal elements in the identification frame of D-S evidence theory; the distance between them is:

where c₁ is the first cluster center; c₂ is the second cluster center; r₁ is the cluster radius of the first cluster center; r₂ is the cluster radius of the second cluster center;

S3: the similarity of the two clustering feature interval models is:

where λ > 0 is the support coefficient and d(F₁, F₂) is the distance between the two focal elements.
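The distance and similarity expressions are not reproduced in this text, so the sketch below fills them with one plausible choice, stated here as an assumption: an interval-number distance d = sqrt((c₁ − c₂)² + (r₁ − r₂)²/3) and a similarity 1/(1 + λ·d). Both are consistent with the symbols defined above (centers, radii, a positive support coefficient λ), but they are not confirmed to be the formulas of the filing. The function names are hypothetical.

```python
import math


def interval_distance(f1, f2):
    """Assumed distance between two [center, radius] interval models."""
    (c1, r1), (c2, r2) = f1, f2
    return math.sqrt((c1 - c2) ** 2 + (r1 - r2) ** 2 / 3.0)


def interval_similarity(f1, f2, lam=1.0):
    """Assumed similarity in (0, 1]; lam > 0 plays the role of the support coefficient."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return 1.0 / (1.0 + lam * interval_distance(f1, f2))
```

With this choice, identical intervals have similarity 1 and the similarity decays toward 0 as the intervals move apart; a larger λ makes the decay steeper.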

Preferably, in step two, the BBA is generated as follows:

S2.1: establish a clustering interval model of the sample data attribute values.

S2.2: calculate the distance between the attribute value of the data to be judged and the model interval.

S2.3: calculate the similarity between the attribute value of the data to be judged and the attribute values of the sample data.

S2.4: normalize the similarities to generate the BBA.
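Steps S2.1 to S2.4 can be sketched for a single attribute as follows. Several details are assumptions made only for illustration: the interval center is taken as the sample mean (the centroid K-means would return for a single cluster), the radius as the largest deviation from that center, the distance and similarity take the assumed forms sqrt((x − c)² + r²/3) and 1/(1 + λ·d), and the event names and sample values are invented toy data.

```python
import math
from statistics import mean


def build_interval(values):
    """S2.1: [center, radius] model for one attribute of one event type (assumed form)."""
    c = mean(values)
    r = max(abs(v - c) for v in values)
    return c, r


def distance_to_interval(x, model):
    """S2.2: assumed interval distance; the value under test is given radius 0."""
    c, r = model
    return math.sqrt((x - c) ** 2 + r ** 2 / 3.0)


def generate_bba(x, samples_by_event, lam=1.0):
    """S2.3 and S2.4: similarity to each event's interval, normalized to sum to 1."""
    sims = {
        event: 1.0 / (1.0 + lam * distance_to_interval(x, build_interval(values)))
        for event, values in samples_by_event.items()
    }
    total = sum(sims.values())
    return {event: s / total for event, s in sims.items()}


# Toy data, invented for illustration only.
samples = {"normal": [10.0, 12.0, 11.0], "ddos": [90.0, 110.0, 100.0]}
bba = generate_bba(95.0, samples)
```

In this toy case the value 95.0 lies inside the "ddos" interval, so that event receives most of the mass.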

Preferably, in step three, the identification frame is established by an expert system. The identification frame is denoted Z = {a₁, a₂, …, a_n}, where Z is the identification frame, n is the number of objects in the frame, and each a_i is an event type; any two objects in the frame are mutually exclusive.

Preferably, in step four, the Dempster combination rule is defined as:

where A is a focal element in the identification frame; B is a proposition in the identification frame; m₁ and m₂ are the basic probability assignments from two different information sources over the same frame; and K is the conflict coefficient, which reflects the degree of conflict between the evidence.

Finally, the network anomaly and its type are judged from the probability values obtained after fusion.
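For reference, Dempster's rule for two sources is usually written as below. The symbols A, B, m₁, m₂ and K are those defined above; C is an additional dummy proposition introduced here for the second source. This is the standard textbook form, given under the assumption that the filing uses it unchanged, and it requires K < 1.

```latex
K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C),
\qquad
m(A) = (m_1 \oplus m_2)(A)
     = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C)
     \quad (A \neq \emptyset),
\qquad
m(\emptyset) = 0 .
```

The factor 1/(1 − K) redistributes the mass lost to conflicting pairs, so the fused masses again sum to 1.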

Compared with the prior art, the beneficial effects are:

1. The invention determines the BBA in the identification frame of D-S evidence theory with a clustering method based on K-means. By reducing reliance on expert-experience scoring, it effectively mitigates the strong subjectivity and high evidence conflict that such scoring introduces into the BBA.

2. The invention takes the network's transmission data fully into account and extracts from them the feature attributes that influence network anomalies, so that the running state of the network is reflected more faithfully.

3. The invention takes full account of the real-time nature of the power communication network. It uses the Dempster combination rule for fusion and decision-making, and builds the clustering interval model and the identification frame from the large volume of historical operating data already in the database, so that real-time operating data can be fused as they arrive and network anomalies are judged more quickly.

Drawings

FIG. 1 is a schematic flow chart of the power communication network anomaly detection method based on D-S evidence theory according to the invention;

FIG. 2 is a power communication hierarchy diagram for the power communication network anomaly detection method based on D-S evidence theory according to the invention.

Detailed Description

The drawings are for illustrative purposes only and are not to be construed as limiting the patent.

The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:

Example 1

FIGS. 1 and 2 show an embodiment of the power communication network anomaly detection method based on D-S evidence theory, which comprises the following steps:

Step one: let the identification frame established from an existing knowledge base or expert experience be Z = {a₁, a₂, …, a_n}, where Z is the identification frame, a_i (1 ≤ i ≤ n) denotes an event type, any two objects in the frame are mutually exclusive, and n is the number of objects in the frame.

The indicators selected for the power communication network system are as follows. Traffic information: the inbound traffic volume of each network node, NETWORK_IN (NI), and the outbound traffic volume of each network node, NETWORK_OUT (NO). Operation information: the total number of services running on each host (SN), and the average access volume VISIT_NUM (VN) and access frequency VISIT_FREQ (VR) of each service running on the host. Alarm information from network protection devices: the attack frequency ATTACK_FREQ (AF).

Step two: BBA generation for the sample indicator data.

Selecting test data: for each object (event) in the identification frame, 20 groups of corresponding data are selected (from the power communication network system and the related database). Each group contains the indicators {NI, NO, SN, VN, VR, AF}; the NI data, for example, can be written NI₁, NI₂, …, NI₂₀, and the other indicators are treated likewise.

Generating the clustering feature interval model, as shown in the following table:

Table 1: Feature interval model

Focal element   NI              NO              SN              VN              VR              AF
a₁              [c₁₁, r₁₁]      [c₁₂, r₁₂]      [c₁₃, r₁₃]      [c₁₄, r₁₄]      [c₁₅, r₁₅]      [c₁₆, r₁₆]
a₂              [c₂₁, r₂₁]      [c₂₂, r₂₂]      [c₂₃, r₂₃]      [c₂₄, r₂₄]      [c₂₅, r₂₅]      [c₂₆, r₂₆]
…
a_n             [c_n1, r_n1]    [c_n2, r_n2]    [c_n3, r_n3]    [c_n4, r_n4]    [c_n5, r_n5]    [c_n6, r_n6]

Each entry in the table above is computed directly from the 20 groups of data; [c₁₁, r₁₁], for example, is generated as follows:

Computing the distance between the data under test and the feature interval models, and the similarity: select test data C = {NI_t, NO_t, SN_t, VN_t, VR_t, AF_t}, compute its distance to each interval model, compute the similarity, and then normalize to obtain the BBA values, as shown in the following tables:

Table 2: Similarity between the test data and the interval models

Focal element   NI      NO      SN      VN      VR      AF
a₁              s₁₁     s₁₂     s₁₃     s₁₄     s₁₅     s₁₆
a₂              s₂₁     s₂₂     s₂₃     s₂₄     s₂₅     s₂₆
…
a_n             s_n1    s_n2    s_n3    s_n4    s_n5    s_n6

Each entry in the table above is computed directly from the Table 1 data (the cluster radius of the test data can be taken as 0 and the similarity coefficient as 1); s₁₁, for example, is generated as follows:
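As an illustration only: suppose the distance between two [c, r] intervals is taken to be sqrt((c₁ − c₂)² + (r₁ − r₂)²/3) and the similarity to be 1/(1 + λ·d). Both forms are assumptions, since the corresponding formulas are not reproduced in this text. Setting the test-data radius to 0 and reading "similarity coefficient 1" as λ = 1 then gives the following, where d₁₁ is a symbol introduced here for the distance between NI_t and the interval [c₁₁, r₁₁]:

```latex
d_{11} = \sqrt{\left(NI_t - c_{11}\right)^2 + \tfrac{1}{3}\, r_{11}^{2}},
\qquad
s_{11} = \frac{1}{1 + d_{11}} .
```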

Normalization then yields the BBA data, as follows:

Table 3: BBA assignment

Focal element   NI      NO      SN      VN      VR      AF
a₁              P₁₁     P₁₂     P₁₃     P₁₄     P₁₅     P₁₆
a₂              P₂₁     P₂₂     P₂₃     P₂₄     P₂₅     P₂₆
…
a_n             P_n1    P_n2    P_n3    P_n4    P_n5    P_n6

Each entry in the table above is obtained by normalizing the Table 2 data directly, as follows (taking P₁₁ as an example):
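One reading of this normalization, assuming each indicator column is treated as a separate evidence source whose masses over a₁, …, a_n must sum to 1, is:

```latex
P_{11} = \frac{s_{11}}{\sum_{i=1}^{n} s_{i1}},
\qquad \text{and in general} \qquad
P_{ij} = \frac{s_{ij}}{\sum_{k=1}^{n} s_{kj}},
\quad 1 \le i \le n,\ 1 \le j \le 6 .
```

Under this reading each column of Table 3 sums to 1, which is what Dempster's rule expects of a basic probability assignment.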

D-S data fusion and decision analysis: first the conflict coefficient K is obtained, and then the fused probability of each object in the frame, as follows.

According to the formula for K in step four:

According to the fusion formula in step four (taking P(a₁) as an example):

The steps above yield the fused probabilities P(a₁), P(a₂), …, P(a_n) of the events in the identification frame, from which the event type is judged and subsequent decisions are made.
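The fusion and decision step can be sketched as below, assuming each indicator contributes one basic probability assignment over the same mutually exclusive event types. Because every focal element here is a single event, only pairs that name the same event have a non-empty intersection, so Dempster's rule reduces to a normalized product. The function names and the toy masses are hypothetical.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule for two BBAs over the same mutually exclusive singletons."""
    agree = {event: m1[event] * m2[event] for event in m1}
    total = sum(agree.values())  # equals 1 - K, where K is the conflict coefficient
    if total == 0:
        raise ValueError("total conflict (K = 1): the sources cannot be combined")
    return {event: mass / total for event, mass in agree.items()}


def fuse(bbas):
    """Fold Dempster's rule over all evidence sources (one per indicator)."""
    return reduce(combine, bbas)


def decide(fused):
    """Pick the event type with the largest fused probability."""
    return max(fused, key=fused.get)


# Toy masses for three sources, invented for illustration only.
sources = [
    {"a1": 0.6, "a2": 0.4},
    {"a1": 0.7, "a2": 0.3},
    {"a1": 0.5, "a2": 0.5},
]
fused = fuse(sources)
winner = decide(fused)
```

In the embodiment the list of sources would hold the six columns of Table 3; a decision threshold on the winning probability could be added, but the text does not specify one.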

The beneficial effects of this embodiment:

It should be understood that the above embodiments are merely examples given to illustrate the invention clearly, and do not limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the invention shall fall within the protection scope of the claims of the invention.

Claims

1. A method for detecting network abnormality of a power communication network based on a D-S evidence theory is characterized by comprising the following steps:

step two: determining basic probability distribution in an identification frame in D-S evidence theory based on a K-means clustering method;

step three: determining the identification frame by using an expert system;

step four: performing fusion and decision-making using the D-S evidence theory combination rule.

2. The method for detecting the network abnormality of the electric power communication network based on the D-S evidence theory as claimed in claim 1, wherein in the step one, network key information data with a fixed time length is selected from original records of network key information collected in the electric power communication network; and cleaning the selected key information data, and removing the data records containing the missing values.

3. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory as claimed in claim 2, wherein the key information data includes three pieces of information, which are flow information, operation information and alarm information of network protection equipment.

4. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory is characterized in that the traffic information comprises the traffic inflow size of each network node and the network traffic outflow size.

5. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory is characterized in that the operation information comprises the total number of the services operated on each host, the average access amount and the access frequency of each service operated on each host.

6. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory as claimed in claim 3, wherein the alarm information of the network protection device includes an alarm identifier, an attack frequency, a source address, a destination address, a source port and a destination port.

7. The method for detecting the abnormality of the power communication network based on the D-S evidence theory as claimed in claim 2, wherein in the second step, the calculation process of the clustering intervals and the clustering feature similarity comprises:

S2: the clustering feature similarity is:

where c₁ is the first cluster center; c₂ is the second cluster center; r₁ is the cluster radius of the first cluster center; r₂ is the cluster radius of the second cluster center;

S3: the similarity of the two clustering feature interval models is:

8. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory as claimed in claim 7, wherein in the second step, the basic probability distribution is generated as follows:

S2.1: establishing a clustering interval model of sample data attribute values;

S2.2: calculating the distance between the attribute value of the data to be judged and the model interval;

S2.3: calculating the similarity between the attribute value of the data to be judged and the attribute value of the sample data;

S2.4: normalizing the similarity to generate basic probability distribution.

9. The method for detecting the abnormality of the power communication network based on the D-S evidence theory as claimed in claim 8, wherein in the third step, an identification frame is established according to an expert system; the identification frame is denoted as Z = {a₁, a₂, …, a_n}, where Z is the identification frame, n is the number of objects in the frame, and each a_i represents an event type; any two objects in the frame are mutually exclusive.

10. The method for detecting the network abnormality of the power communication network based on the D-S evidence theory as claimed in claim 9, wherein in the fourth step, the D-S evidence theory combination rule is defined as:

and then the network abnormality and its type are judged according to the probability values obtained after fusion.