CN108270785B - Knowledge graph-based distributed security event correlation analysis method - Google Patents

Publication number
CN108270785B
CN108270785B CN201810036765.9A CN201810036765A CN108270785B CN 108270785 B CN108270785 B CN 108270785B CN 201810036765 A CN201810036765 A CN 201810036765A CN 108270785 B CN108270785 B CN 108270785B
Authority
CN
China
Prior art keywords
event
knowledge
scene
alarm
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810036765.9A
Other languages
Chinese (zh)
Other versions
CN108270785A (en)
Inventor
王伟
江荣
贾焰
周斌
李爱平
杨树强
韩伟红
李润恒
徐镜湖
安伦
亓玉璐
杨行
马凯
林佳
尚怀军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yilan Situation Technology Co ltd
National University of Defense Technology
Original Assignee
Sichuan Yilan Situation Technology Co ltd
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yilan Situation Technology Co ltd and National University of Defense Technology
Priority to CN201810036765.9A
Publication of CN108270785A
Application granted
Publication of CN108270785B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design

Abstract

The invention discloses a knowledge graph-based distributed network security event correlation analysis method, which specifically comprises the following steps: step 1) constructing a network security knowledge graph comprising five dimensions, namely a basic dimension, a vulnerability dimension, a threat dimension, an alarm event dimension and an attack rule dimension; step 2) designing and implementing a security event correlation analysis algorithm on the basis of the knowledge graph constructed in step 1); and step 3) constructing a real-time big data analysis platform and applying the correlation analysis algorithm designed in step 2) to that platform to realize a distributed correlation analysis system. The invention makes full use of current big data processing and analysis technology to cope with large data volumes, parallelizes the correlation analysis algorithm, and realizes the design of a distributed correlation analysis algorithm based on the knowledge graph.

Description

Knowledge graph-based distributed security event correlation analysis method
Technical Field
The invention belongs to the field of network security situation awareness, and mainly relates to a knowledge graph-based distributed security event correlation analysis method.
Background
As computer networks are applied ever more widely and grow ever larger, security threats and risks at multiple layers keep increasing, and the threats and losses caused by network viruses, DoS/DDoS attacks and the like keep growing. Network attacks are becoming more distributed, larger in scale and more complex, and the requirements of network security can no longer be met by relying on single protection technologies such as firewalls, intrusion detection, virus prevention and access control. To deal with increasingly complex and well-hidden network security threats and to keep systems running safely, the heterogeneous information generated by multi-source security equipment, such as firewalls, IDSs, vulnerability scanning systems and security auditing systems, needs to be fused by means of technologies from the field of network security situation awareness, so that macroscopic situation awareness of the whole network is achieved. When the overall network situation is perceived by integrating multi-source security equipment, however, problems such as a large volume of alarm information and a high false alarm rate arise and place a serious burden on system administrators.
Alarm correlation offers a solution to these problems: it analyzes the alarms generated by one or more intrusion detection systems to provide a more efficient, higher-level view of attacks. In practice, most security events do not occur in isolation; they are linked by temporal or causal relationships. Security event correlation analysis takes an originally isolated, low-level set of network security events, correlates and integrates it in light of the environment in which the events occurred, and uses filtering, aggregation and similar means to discard false alarms and reveal the real relationships between events hidden in the data. Alarm correlation reduces redundant alarms, reconstructs the attack scene and reveals the attacker's true purpose, so that the relevant defects can be repaired in a targeted manner and normal network operation maintained. It essentially comprises the following sub-steps. Alarm aggregation merges alarms from different IDSs that concern the same security event, which speeds up processing and helps meet the real-time requirement of situation awareness. Alarm verification identifies false alarms received from the detectors, since false alarms passed on to later steps are likely to produce false correlation results. Finally, attack scene reconstruction: an attack scene generally comprises a series of attack activities carried out to achieve a final attack purpose, and this step builds the attack route by analyzing the generated alarms and predicts the attacker's real target. Security event correlation analysis in traditional situation awareness systems jointly considers information from several dimensions, such as the basic dimension, the vulnerability dimension and the threat dimension. However, such systems have several problems.
First, a traditional situation awareness system stores each dimension separately in a relational database such as MySQL. Real-time correlation analysis then requires fast join operations across many tables, the dimensions cooperate poorly, and both the timeliness and the accuracy of the analysis suffer considerably, which is a serious problem for a situation awareness system with strict real-time requirements. Second, other information, such as attack rule knowledge, is generally unstructured and cannot be stored flexibly in a conventional relational database. Third, traditional rule-based correlation analysis relies on expert knowledge to construct attack scenarios, so novel attacks that are not in the attack template knowledge base cannot be analyzed and inferred automatically. The most important problem is not a lack of available information but how to integrate separate pieces of information into an overall perception of the situation.
Another prominent problem is that the correlation analysis methods proposed by earlier researchers are algorithms designed for a single ordinary machine. In the current era of big data, with the rapid development of the Internet and the continuous expansion of network scale, such algorithms cannot adequately meet the requirements of large-scale data analysis.
Disclosure of Invention
The invention aims to provide a knowledge graph-based distributed network security event correlation analysis method.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a distributed network security event correlation analysis method based on a knowledge graph specifically comprises the following steps:
step 1) constructing a network security knowledge graph comprising five dimensions, namely a basic dimension, a vulnerability dimension, a threat dimension, an alarm event dimension and an attack rule dimension;
step 2) designing and realizing a security event correlation analysis algorithm on the basis of the knowledge graph constructed in the step 1);
step 3) constructing a real-time big data analysis platform, and applying the correlation analysis algorithm designed in step 2) to the constructed big data platform to realize a distributed correlation analysis system.
Preferably, the step 1) specifically includes:
collecting knowledge information of five dimensions, wherein the knowledge information comprises CVE vulnerability knowledge, CAPEC attack classification knowledge, CWE host software knowledge, Snort alarm event knowledge and attack rule expert knowledge;
extracting entity attribute information by writing an XML processing program and regular expressions;
using the graph database Neo4j as the knowledge graph construction tool, and inserting the knowledge information into the knowledge graph by writing Cypher statements;
and connecting all subgraphs into a complete, usable graph according to the edges that need to be constructed between the different dimensions.
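The construction step above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the XML layout, the node labels (`Vulnerability`, `Weakness`, `Software`) and the relationship types are all hypothetical, and real CVE, CWE and CAPEC feeds use different schemas. The sketch only builds parameterized Cypher statements; sending them to Neo4j through a driver is left out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified vulnerability feed; real CVE/CWE/CAPEC schemas differ.
SAMPLE_XML = """
<entries>
  <entry id="CVE-2017-0144" cwe="CWE-20">
    <summary>Remote code execution in SMBv1 server.</summary>
    <product>windows_7</product>
  </entry>
  <entry id="not-a-cve" cwe="CWE-79">
    <summary>Malformed record.</summary>
    <product>unknown</product>
  </entry>
</entries>
"""

# Regular expression used to validate extracted identifiers.
CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")


def extract_entities(xml_text):
    """Parse the XML and return one attribute dict per well-formed CVE entry."""
    entities = []
    for entry in ET.fromstring(xml_text).iter("entry"):
        cve_id = entry.get("id", "")
        if not CVE_ID.match(cve_id):
            continue  # skip entries whose identifier is malformed
        entities.append({
            "cve": cve_id,
            "cwe": entry.get("cwe"),
            "summary": (entry.findtext("summary") or "").strip(),
            "product": (entry.findtext("product") or "").strip(),
        })
    return entities


def to_cypher(entity):
    """Return a parameterized Cypher statement and its parameter map.

    The labels and relationship types are assumed names. The final two
    MERGE clauses create the cross-dimension edges that join the subgraphs.
    """
    statement = (
        "MERGE (v:Vulnerability {id: $cve}) SET v.summary = $summary "
        "MERGE (w:Weakness {id: $cwe}) "
        "MERGE (s:Software {name: $product}) "
        "MERGE (v)-[:INSTANCE_OF]->(w) "
        "MERGE (s)-[:HAS_VULNERABILITY]->(v)"
    )
    return statement, entity
```

Using `MERGE` rather than `CREATE` keeps the import idempotent, so reloading the same feed does not duplicate nodes or edges.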
Preferably, step 2) includes:
step 2-1) a security event preprocessing sub-step, comprising:
preprocessing alarms in various formats generated by security products provided by different manufacturers while analyzing alarm events;
step 2-2) a security event verification sub-step comprising: judging whether the alarm belongs to a real alarm or not through the incidence relation between the alarm and the base dimension and the vulnerability dimension;
step 2-3) a scene reconstruction substep comprising: the actual attack purpose of the attacker is discovered through the relationship between the correlation events.
Preferably, step 2-1) specifically comprises: preprocessing the information reported by different devices into a unified data format.
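As an illustration of this preprocessing, the following Python sketch maps two differently formatted reports onto one common record. The Snort-style alert line, the firewall JSON fields and the unified field names are all assumptions made for the example; the unified record here is a plain dictionary, not a standard exchange format.

```python
import json
import re

# Assumed Snort "fast"-style alert line; the real layout depends on sensor configuration.
SNORT_LINE = re.compile(
    r"\[\*\*\] \[(?P<sig>[\d:]+)\] (?P<msg>.+?) \[\*\*\].*?"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)(?::\d+)? -> (?P<dst>\d+\.\d+\.\d+\.\d+)"
)

SAMPLE_SNORT = (
    "01/15-10:22:31.123456 [**] [1:2049:4] SQL injection attempt [**] "
    "[Priority: 1] {TCP} 10.0.0.5:4431 -> 192.168.1.20:80"
)
# Hypothetical firewall log record.
SAMPLE_FIREWALL = (
    '{"rule_id": 77, "action": "deny", '
    '"source": "10.0.0.5", "destination": "192.168.1.20"}'
)


def from_snort(line):
    """Convert one Snort-style alert line into the unified record."""
    match = SNORT_LINE.search(line)
    if match is None:
        raise ValueError("unrecognized Snort alert line")
    return {
        "sensor": "snort",
        "signature": match["sig"],
        "message": match["msg"],
        "src_ip": match["src"],
        "dst_ip": match["dst"],
    }


def from_firewall_json(text):
    """Convert one JSON firewall record into the same unified record."""
    raw = json.loads(text)
    return {
        "sensor": "firewall",
        "signature": str(raw["rule_id"]),
        "message": raw["action"],
        "src_ip": raw["source"],
        "dst_ip": raw["destination"],
    }
```

Once every source is reduced to the same keys, later steps can read the target IP address from `dst_ip` without knowing which device reported the event.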
Preferably, in the step 2-2), the method further comprises:
normalizing the security event and acquiring its target IP address;
quickly locating the host according to the target IP address, acquiring state information of the host through in-band or out-of-band collection, the state information including CPU utilization, vulnerabilities and network bandwidth utilization, and deciding whether to retain or filter the alarm event by comparing the characteristics of the security event with the acquired host information.
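A minimal Python sketch of this verification step is shown below. The `HostState` record, the `exploit_map` from alarm signature to required vulnerabilities, and the retain/filter policy are all assumptions introduced for illustration; the text does not specify the exact comparison rule.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    """Host information gathered in-band or out-of-band (hypothetical fields)."""
    ip: str
    vulnerabilities: set = field(default_factory=set)
    cpu_utilization: float = 0.0
    bandwidth_utilization: float = 0.0


def verify_event(event, hosts, exploit_map):
    """Return True to retain the alarm event, False to filter it out.

    Assumed policy:
      * events aimed at an unknown host are filtered;
      * events with no known vulnerability precondition are retained;
      * otherwise the event is retained only if the target host actually
        has at least one vulnerability that the attack requires.
    """
    host = hosts.get(event["dst_ip"])
    if host is None:
        return False
    required = exploit_map.get(event["signature"])
    if required is None:
        return True
    return bool(required & host.vulnerabilities)
```

For example, an alarm for an exploit of `CVE-2017-0144` would be retained for a host whose scan results list that vulnerability and filtered for a patched host.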
Preferably, in the step 2-3), the method further comprises:
classifying events by target IP, then quickly searching the knowledge graph with Cypher for the scenes related to the event, and buffering those scenes in memory if they are not already there;
and then constructing a scene instance from the static scene knowledge and the target IP, updating the corresponding event fields in the scene instance when subsequent events for the same IP arrive, and outputting the scene once all conditions contained in the scene are met.
Preferably, step 3) specifically includes: implementing the correlation analysis process of step 2) on the real-time big data processing platform Storm, with the assistance of several non-relational databases.
Preferably, in step 2-1), the alarm events include attack events and false noise events; in step 2-2), an event verification function is performed on the alarm events and the real events are retained;
an alarm layer is arranged above the event layer, and a scene layer is arranged above the alarm layer, so that the event layer, the alarm layer and the scene layer form a tree-structured organization.
Preferably, the step 2-3) specifically comprises the following steps:
for each incoming event, first looking up in the buffer whether there are scenes associated with the event, the buffer being implemented with the MongoDB non-relational database, and MongoDB's field search function being used to determine whether the event being looked up is present in the buffer;
if it is not, first searching the knowledge graph for all scenes related to the event and buffering the result in MongoDB;
searching the scene instances awaiting matching for an instance related to the event's target IP, and, if none exists, which indicates that no attack against that IP has occurred so far, creating a scene instance related to the IP;
each scene instance contains several alarms, and each alarm contains several security events. The correlation analysis is therefore a process of continually matching alarm attributes and event attributes in the scene and setting the corresponding fields from 0 to 1. Once all fields are set to 1, the scene is satisfied and the attacker's true purpose is the one described by that attack scene; the whole correlation process then ends and the correlation result is output.
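The 0-to-1 matching described above can be sketched in Python as follows. The scene definition, the alarm and event names, and the choice to key instances by scene name and target IP are hypothetical; the sketch only shows the flag-flipping logic and the "all fields are 1" completion test.

```python
# Hypothetical static scene knowledge: scene -> alarm -> list of event types.
SCENES = {
    "web_compromise": {
        "recon": ["port_scan"],
        "exploit": ["sql_injection", "webshell_upload"],
    }
}


def new_instance(scene_name):
    """Create a scene instance with every event field initialised to 0."""
    return {
        alarm: {event: 0 for event in events}
        for alarm, events in SCENES[scene_name].items()
    }


def match_event(instances, scene_name, target_ip, event_type):
    """Record one event and report whether the scene is now fully satisfied.

    `instances` maps (scene name, target IP) to a scene instance. An instance
    is created on the first event seen for that IP, then its matching field
    is flipped from 0 to 1.
    """
    key = (scene_name, target_ip)
    if key not in instances:
        instances[key] = new_instance(scene_name)
    instance = instances[key]
    for events in instance.values():
        if event_type in events:
            events[event_type] = 1
    return all(flag == 1 for events in instance.values() for flag in events.values())
```

Feeding `port_scan`, `sql_injection` and `webshell_upload` for the same target IP returns `False`, `False` and then `True`, at which point the scene would be output as the correlation result.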
Preferably, the knowledge graph is implemented with the Neo4j non-relational database, and the database is queried with Cypher, the graph query language provided by Neo4j;
using Cypher to find the scenes associated with an event is a second-order relational search, because an alarm layer lies between the event and the scene.
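To make the second-order search concrete, the Python sketch below walks the same two hops (event to alarm, alarm to scene) over small in-memory dictionaries. The Cypher text in `CYPHER_TWO_HOP` shows what an equivalent query might look like; its labels and relationship types (`Event`, `Alarm`, `Scene`, `TRIGGERS`, `PART_OF`) are assumed names, not taken from the patent, and the string is not executed here.

```python
# Illustrative Cypher for the two-hop search; schema names are assumptions.
CYPHER_TWO_HOP = (
    "MATCH (e:Event {name: $event})-[:TRIGGERS]->(:Alarm)-[:PART_OF]->(s:Scene) "
    "RETURN DISTINCT s.name"
)

# Hypothetical in-memory stand-in for the graph's two edge sets.
EVENT_TO_ALARMS = {
    "port_scan": ["recon"],
    "sql_injection": ["exploit"],
    "webshell_upload": ["exploit", "persistence"],
}
ALARM_TO_SCENES = {
    "recon": ["web_compromise", "worm_spread"],
    "exploit": ["web_compromise"],
    "persistence": ["apt_intrusion"],
}


def scenes_for_event(event_type):
    """Follow event -> alarm -> scene and return the distinct scene names, sorted."""
    scenes = set()
    for alarm in EVENT_TO_ALARMS.get(event_type, []):      # first hop
        scenes.update(ALARM_TO_SCENES.get(alarm, []))      # second hop
    return sorted(scenes)
```

In a relational layout the same lookup would typically need joins across an event table, an alarm table and a scene table, whereas the graph query follows the two edges directly.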
The invention accelerates the speed of multi-dimensional data combined retrieval on one hand, and improves the flexibility of knowledge representation and updating on the other hand. In addition, the knowledge graph can provide path search between two entity nodes and can be used for knowledge reasoning, knowledge verification, self-learning of new knowledge and the like. The correlation analysis based on the knowledge graph has higher analysis speed and better expandability compared with the traditional method. The invention also makes full use of the current big data processing and analyzing related technology to deal with large-scale data quantity, parallelizes the association analysis algorithm and realizes the design of the knowledge-graph-based distributed association analysis algorithm.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The present invention will be described in detail below with reference to the accompanying drawings so that the above advantages of the present invention will be more apparent. In the drawings:
FIG. 1 is a flow diagram of a network security event based correlation analysis;
FIG. 2 is a framework diagram of a knowledge-graph based distributed network security event management analysis.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
Additionally, the steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions and, although a logical order is illustrated in the flow charts, in some cases, the steps illustrated or described may be performed in an order different than here.
As shown in fig. 1 and 2, to achieve the purpose of the present invention, the specific technical solution of the present invention is as follows:
a distributed network security event correlation analysis method based on a knowledge graph specifically comprises the following steps:
step 1) constructing a network security knowledge graph comprising five dimensions, namely a basic dimension, a vulnerability dimension, a threat dimension, an alarm event dimension and an attack rule dimension;
step 2) designing and realizing a security event correlation analysis algorithm on the basis of the knowledge graph constructed in the step 1);
step 3) constructing a real-time big data analysis platform, and applying the association analysis algorithm designed in the step 2) to the constructed big data platform to realize a distributed association analysis system;
Preferably, step 1) specifically includes collecting knowledge information of the five dimensions, namely CVE vulnerability knowledge, CAPEC attack classification knowledge, CWE host software knowledge, Snort alarm event knowledge and attack rule expert knowledge, and extracting entity attribute information by writing an XML processing program and regular expressions. Using the graph database Neo4j as the knowledge graph construction tool, the knowledge information is inserted into the knowledge graph by writing Cypher statements. All subgraphs are then connected into a complete, usable graph according to the edges that need to be constructed between the different dimensions.
Preferably, the step 2) specifically comprises the following steps:
step 2-1) is first a security event pre-processing. The first problem faced when analyzing alarm events is the generation of alarms in various formats by security products offered by different vendors. Therefore, in order to resolve alarm correlation, it is necessary to preprocess the information reported by different devices into a unified data format, such as IDMEF, which is defined by the Intrusion Detection Working Group (IDWG).
Step 2-2) is security event verification. An IDS generates massive numbers of alarms every day, but most of them are usually false alarms. Alarm verification judges whether an alarm is real through the association between the alarm and the basic dimension and the vulnerability dimension. Because the target IP address of each security event is easily obtained after normalization, the host can be located quickly from that address, and its state information, including CPU utilization, vulnerabilities, network bandwidth utilization and the like, can be obtained through in-band or out-of-band collection. Whether to retain or filter the alarm event is then decided by comparing the characteristics of the security event with the obtained host information.
Step 2-3) is scene reconstruction. As the core step of correlation analysis, scene reconstruction discovers the attacker's true purpose through the relationships between correlated events. Events are first classified by target IP; Cypher is then used to quickly search the knowledge graph for the scenes related to the event, and those scenes are buffered in memory if they are not already there. A scene instance is then constructed from the static scene knowledge and the target IP, the corresponding event fields in the scene instance are updated when subsequent events for the same IP arrive, and the scene is output once all conditions contained in the scene are met.
Preferably, in step 3), the correlation analysis process of step 2) is implemented on the real-time big data processing platform Storm, with the assistance of several non-relational databases.
Compared with the prior art, the invention has the following advantages:
the invention provides a knowledge graph-based distributed network security event correlation analysis method, which uses a knowledge graph form to replace a traditional relational database to store network knowledge information, so that on one hand, the multi-dimensional data joint retrieval speed is accelerated, and on the other hand, the flexibility of knowledge representation and updating is improved. In addition, the knowledge graph can provide path search between two entity nodes and can be used for knowledge reasoning, knowledge verification, self-learning of new knowledge and the like. The correlation analysis based on the knowledge graph has higher analysis speed and better expandability compared with the traditional method. The invention also makes full use of the current big data processing and analyzing related technology to deal with large-scale data quantity, parallelizes the association analysis algorithm and realizes the design of the knowledge-graph-based distributed association analysis algorithm.
The technical key points and points to be protected of the patent are as follows:
1. constructing a Chinese network security knowledge graph which comprises a basic asset dimension, a vulnerability dimension, an attack threat dimension and a static alarm information dimension;
2. constructing a scene rule on the basis of the network security knowledge graph, and designing a security event correlation analysis method based on the knowledge graph;
3. and designing a distributed correlation analysis framework and realizing the correlation analysis method in a distributed manner.
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
The detailed description is made below with reference to the accompanying drawings to describe embodiments according to the present invention.
Fig. 1 is a flow chart of the correlation analysis based on network security events. As shown in Fig. 1, in the embodiment of the present invention the input security events include real attack events and false noise events, so an event verification function must first be performed to retain the real events. In this embodiment, to facilitate the construction of subsequent scenes, an alarm layer is placed above the event layer and a scene layer above the alarm layer, so that the three layers are organized like a tree. This organization supports knowledge graph queries at various granularities, with better expandability and higher query efficiency. For each incoming event, the buffer is first searched for scenes associated with the event. The buffer is implemented with the MongoDB non-relational database, and MongoDB's field search function is used to determine whether the event being looked up is present in the buffer. If it is not, all scenes associated with the event are first looked up in the knowledge graph and the result is buffered in MongoDB. The knowledge graph is implemented with the Neo4j non-relational database and is queried with Cypher, the graph query language provided by Neo4j. Using Cypher to find the scenes associated with an event is a second-order relational search, because an alarm layer lies between the event and the scene; here the graph database has a great advantage over table join operations in a relational database. Next, the scene instances awaiting matching are searched for an instance related to the event's target IP. If none exists, no attack against that IP has occurred so far, and a scene instance related to the IP is created.
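The buffer lookup in this flow follows a cache-aside pattern, sketched below in Python. A plain dictionary stands in for the MongoDB collection, and `graph_lookup` is a hypothetical callable standing in for the Cypher query against the knowledge graph; neither is the embodiment's actual interface.

```python
class SceneCache:
    """Cache-aside lookup of the scenes associated with an event type."""

    def __init__(self, graph_lookup):
        self._graph_lookup = graph_lookup  # stand-in for the knowledge graph query
        self._store = {}                   # stand-in for the MongoDB buffer
        self.misses = 0                    # number of times the graph was queried

    def scenes_for(self, event_type):
        """Return cached scenes, querying the graph only on a cache miss."""
        if event_type not in self._store:
            self.misses += 1
            self._store[event_type] = self._graph_lookup(event_type)
        return self._store[event_type]


def demo_graph_lookup(event_type):
    """Hypothetical graph query result used only for demonstration."""
    table = {"port_scan": ["web_compromise"], "sql_injection": ["web_compromise"]}
    return table.get(event_type, [])
```

Repeated events of the same type, which are common during an ongoing attack, then cost one dictionary lookup instead of a graph traversal.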
Each scene instance contains several alarms, and each alarm contains several security events. The correlation analysis is therefore a process of continually matching alarm attributes and event attributes in the scene and setting the corresponding fields from 0 to 1. Once all fields are set to 1, the scene is satisfied and the attacker's true purpose is the one described by that attack scene. The whole correlation process then ends and the correlation result is output.
Fig. 2 is a diagram of a knowledge-graph-based distributed network security event management analysis framework, and as shown in fig. 2, in the embodiment of the present invention, the distributed framework mainly includes a data acquisition module, a data collection and merging/distribution module, a real-time data analysis module, and a data storage module.
The data acquisition module in this embodiment uses Flume, the open-source distributed log collection system from the Apache organization, which collects data from various servers and sends it to a designated log buffer server. One benefit of using Flume as the data collection tool is that the acquisition mode can be configured per data source, including the data source location, acquisition frequency, transmission mode and so on. The data acquisition module sends the collected data to the data collection, merging and distribution module.
The core function of the data collection, merging and distribution module in this embodiment is to buffer the collected mass data and classify it according to topic information. It is implemented with Kafka, the open-source message middleware system from Apache. Kafka is a distributed message publish-subscribe system. Like other publish-subscribe systems, Kafka maintains messages within topics: producers write data to a topic and consumers read data from it. Because Kafka is built on and supports distribution, a topic can also be partitioned and replicated across multiple nodes. The main purposes for which we use the Kafka middleware are: 1. Reduced coupling. The acquisition and the processing of security experiment data are completely separated by placing Kafka message middleware between them to transmit and manage the data, so that both parts are programmed against the Kafka interface, which reduces coupling and improves the expandability of the system. 2. Guaranteed accuracy. Since the system generates a large amount of experimental data at once in a production environment, managing that data with code written by the developers themselves would be a huge undertaking, whereas Kafka can guarantee the reliability of data management. 3. Buffering. Messages are usually produced and consumed at different speeds. If a series of attacks occurs at a certain moment, alarm data is generated at peak rate; if the correlation analysis component cannot process it quickly enough, data accumulates, and as event information keeps arriving, data may be lost. With Kafka, data can be buffered on local disk to await consumption.
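The decoupling and buffering roles described above can be illustrated with a small in-memory model in Python. This is a conceptual sketch only: it does not use Kafka or its client API, the `Topic` and `Consumer` classes are hypothetical, and the CRC32-based partition choice is an assumption made for the example rather than Kafka's own partitioning scheme.

```python
import zlib


class Topic:
    """A topic made of append-only partition logs (in memory, for illustration)."""

    def __init__(self, name, partitions):
        self.name = name
        self.logs = [[] for _ in range(partitions)]

    def publish(self, key, value):
        """Append a message; messages with the same key land in the same partition."""
        partition = zlib.crc32(key.encode("utf-8")) % len(self.logs)
        self.logs[partition].append(value)
        return partition


class Consumer:
    """Reads from a topic at its own pace by tracking one offset per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.logs)

    def poll(self, max_records):
        """Return up to max_records unread messages and advance the offsets."""
        records = []
        for partition, log in enumerate(self.topic.logs):
            while self.offsets[partition] < len(log) and len(records) < max_records:
                records.append(log[self.offsets[partition]])
                self.offsets[partition] += 1
        return records
```

Keying messages by target IP keeps all events for one host in a single partition and therefore in arrival order, while a burst of alarms simply lengthens the log until the consumer catches up.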
The real-time correlation analysis module in this embodiment is the core module of the system and is implemented with Storm, the open-source real-time data analysis platform from the Apache organization. This module pulls the specified data from the Kafka messaging system and implements the knowledge graph-based security event correlation analysis function. Storm is an event processor that performs distributed processing on streaming data. A Storm application is composed of "spouts" and "bolts", representing information sources and data handlers respectively, configured as a directed acyclic graph. Storm's main feature is its ability to process data in real time, unlike Hadoop, which performs batch processing. The main reasons we use Storm as the real-time processing tool are: 1. A simple programming model. Just as MapReduce reduces the complexity of parallel batch processing, Storm reduces the complexity of real-time processing; a topology can be built simply by implementing the spout data-receiving interface and the bolt data-processing interface. 2. Horizontal scaling. Computation runs in parallel across multiple threads, processes and servers, which guarantees the running speed of the system. 3. A local mode. Storm has a "local mode" that can fully simulate a Storm cluster within a single process. 4. Distributed processing logic. Storm uses a layered model in which the bolts of each layer complete one function and hand the result to the next layer, so the whole implementation can be divided into several steps, each with a clear function and easier to implement.
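The layered spout-and-bolt structure can be mimicked in plain Python with chained generators, as sketched below. This is not Storm code and uses none of its API; the function names, the event fields and the three-layer split are hypothetical, and everything runs in a single process without Storm's parallelism.

```python
def event_spout(raw_events):
    """Source layer: emit raw events one by one (stand-in for a spout)."""
    for event in raw_events:
        yield event


def normalize_bolt(stream):
    """First processing layer: map raw events to a unified record."""
    for event in stream:
        yield {"type": event["type"].lower(), "dst_ip": event["dst"]}


def verify_bolt(stream, known_hosts):
    """Second layer: keep only events aimed at known hosts (assumed policy)."""
    for event in stream:
        if event["dst_ip"] in known_hosts:
            yield event


def correlate_bolt(stream):
    """Final layer: group the surviving event types by target IP."""
    by_target = {}
    for event in stream:
        by_target.setdefault(event["dst_ip"], []).append(event["type"])
    return by_target


def run_topology(raw_events, known_hosts):
    """Wire the layers together in order, like a linear topology."""
    stream = event_spout(raw_events)
    stream = normalize_bolt(stream)
    stream = verify_bolt(stream, known_hosts)
    return correlate_bolt(stream)
```

Because each layer does one job and only passes records downstream, a layer can be replaced or scaled without touching the others, which is the property the layered design is meant to provide.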
The data storage module in this embodiment stores the original event data and the scene information output when the correlation analysis finishes. The system can also deploy several statistics threads to compute statistics over short and long time periods, and these statistics likewise need the support of the storage module. The module stores structured information in a MySQL relational database and unstructured information in a MongoDB database.
In the embodiment of the invention, the modules are data-driven. Because they use several relational and non-relational databases, database experts are needed to design and optimize the databases. In addition, each module relies on distributed system tools at the bottom layer, so a reasonable configuration, including master-slave configuration, node bandwidth, memory configuration and the like, is required according to the actual data processing scale, and operation and maintenance personnel with distributed deployment experience are needed to manage it.
It should be noted that for simplicity of description, the above method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A distributed network security event correlation analysis method based on a knowledge graph is characterized by specifically comprising the following steps:
step 1) constructing a network security knowledge graph comprising five dimensions, namely a basic dimension, a vulnerability dimension, a threat dimension, an alarm event dimension and an attack rule dimension;
step 2) designing and realizing a security event correlation analysis algorithm on the basis of the knowledge graph constructed in the step 1);
step 3) constructing a real-time big data analysis platform, and applying the association analysis algorithm designed in the step 2) to the constructed big data platform to realize a distributed association analysis system;
in the step 2), the method comprises the following steps:
step 2-1) a security event preprocessing sub-step, comprising:
preprocessing alarms in various formats generated by security products provided by different manufacturers while analyzing alarm events;
step 2-2) a security event verification sub-step comprising: judging whether the alarm is a real alarm through the association relation between the alarm and the basic dimension and the vulnerability dimension;
step 2-3) a scene reconstruction substep comprising: discovering the real attack purpose of an attacker through the relationship between the associated events;
in the step 2-2), the method further comprises the following steps:
normalizing the security event and acquiring a target IP address;
quickly locating the host according to the target IP address, acquiring state information of the host, including CPU utilization, vulnerabilities and network bandwidth utilization, through in-band or out-of-band collection, and deciding whether to retain or filter the alarm event by comparing the characteristics of the security event with the acquired host information;
classifying events by target IP, then quickly searching the knowledge graph for scenes related to the event by using Cypher, and buffering the scenes into memory if they are not already in memory;
constructing a scene instance from the static scene knowledge and the target IP, updating the corresponding event field information in the scene instance when subsequent events for the same IP arrive, and outputting the scene if all the instances contained in the scene are satisfied;
in the step 2-3), the method specifically comprises the following steps:
for each incoming event, first searching a buffer for scenes related to the event, wherein the buffer is implemented using a MongoDB non-relational database;
judging, through the field search function of MongoDB, whether the sought event exists in the buffer;
if not, first searching the knowledge graph for all scenes related to the event, and buffering the result into MongoDB;
searching the scene instances to be matched for an instance related to the target IP of the event; if none exists, indicating that no attack against the IP has yet occurred, creating a scene instance related to the IP at that moment;
each scene instance will contain several alarms, and each alarm will contain several security events;
therefore, the correlation analysis is a process of continuously setting the alarm attributes and event attributes in the scene from 0 to 1 as they are matched; if all the fields are set to 1, the scene is satisfied, the true purpose of the attack is that attack scene, the whole correlation process ends, and the correlation result is output.
2. The knowledge-graph-based distributed network security event correlation analysis method according to claim 1, wherein the step 1) specifically comprises:
collecting knowledge information of five dimensions, wherein the knowledge information comprises CVE vulnerability knowledge, CAPEC attack classification knowledge, CWE host software knowledge, Snort alarm event knowledge and attack rule expert knowledge;
extracting entity attribute information by writing an XML processing program and regular expressions;
and connecting all subgraphs into a complete and usable graph according to the edge relations to be constructed among the different dimensions.
3. The knowledge-graph-based distributed network security event correlation analysis method according to claim 1 or 2, wherein the step 2-1) specifically comprises: and preprocessing the information reported by different devices into a unified data format.
4. The knowledge-graph-based distributed network security event correlation analysis method according to claim 1, wherein in step 3), the method specifically comprises: the association analysis process of the step 2) is implemented on the real-time big data processing platform Storm, and the implementation is assisted by a plurality of non-relational databases.
5. The knowledge-graph-based distributed network security event correlation analysis method according to claim 1, wherein in the step 2-1), the alarm event comprises: attack events and spurious noise events; in the step 2-2), performing an event verification function on the alarm event, and reserving a real event;
an alarm layer is arranged above the event layer, a scene layer is arranged above the alarm layer, and the event layer, the alarm layer and the scene layer are organized as a tree structure.
6. The knowledge-graph-based distributed network security event correlation analysis method of claim 1, wherein the knowledge graph is implemented by using a Neo4j non-relational database, and searches of the database are performed by using the graph query language Cypher provided by Neo4j;
because there is also an alarm layer between the event and the scene, finding the scene associated with an event using Cypher is a second-order relationship search.
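The scene reconstruction described in the claims above (a scene → alarm → event tree whose flags are set from 0 to 1 until the scene is satisfied) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the scene knowledge, the node labels and relationship type in the Cypher string (`Event`, `Alarm`, `Scene`, `CONTAINS`), and all function and class names are hypothetical, and an in-memory dictionary stands in for both the Neo4j knowledge graph and the MongoDB buffer.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# Hypothetical Cypher for the second-order (event -> alarm -> scene) search.
# Labels and the relationship type are assumptions, not taken from the patent.
SCENES_FOR_EVENT = (
    "MATCH (e:Event {name: $name})<-[:CONTAINS]-(:Alarm)"
    "<-[:CONTAINS]-(s:Scene) RETURN DISTINCT s.name"
)

# Static scene knowledge (assumed example): scene -> alarm -> events.
SCENE_KNOWLEDGE: Dict[str, Dict[str, List[str]]] = {
    "web_intrusion": {
        "recon": ["port_scan"],
        "exploit": ["sql_injection", "webshell_upload"],
    }
}


@dataclass
class SceneInstance:
    """One scene being matched against events aimed at one target IP."""
    scene: str
    target_ip: str
    flags: Dict[str, Dict[str, int]] = field(init=False)

    def __post_init__(self) -> None:
        # Every event flag starts at 0 and is set to 1 when the event is seen.
        self.flags = {
            alarm: {event: 0 for event in events}
            for alarm, events in SCENE_KNOWLEDGE[self.scene].items()
        }

    def mark(self, event_name: str) -> None:
        for events in self.flags.values():
            if event_name in events:
                events[event_name] = 1

    def satisfied(self) -> bool:
        return all(f == 1 for events in self.flags.values() for f in events.values())


def correlate(stream: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Consume (target_ip, event_name) pairs; return satisfied (scene, ip) pairs."""
    instances: Dict[Tuple[str, str], SceneInstance] = {}
    completed: List[Tuple[str, str]] = []
    for ip, event_name in stream:
        for scene, alarms in SCENE_KNOWLEDGE.items():
            if not any(event_name in events for events in alarms.values()):
                continue  # this scene does not involve the event
            key = (scene, ip)
            if key not in instances:
                instances[key] = SceneInstance(scene, ip)
            instances[key].mark(event_name)
            if instances[key].satisfied():
                completed.append(key)
                del instances[key]
    return completed


events = [
    ("10.0.0.5", "port_scan"),
    ("10.0.0.9", "sql_injection"),
    ("10.0.0.5", "sql_injection"),
    ("10.0.0.5", "webshell_upload"),
]
result = correlate(events)
```

In this sketch only the instance for 10.0.0.5 has every flag set to 1, so only that scene is output; the instance for 10.0.0.9 remains pending. The Cypher string is not executed here and is shown only to indicate what a two-hop lookup from an event to its scenes might look like.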
CN201810036765.9A 2018-01-15 2018-01-15 Knowledge graph-based distributed security event correlation analysis method Active CN108270785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810036765.9A CN108270785B (en) 2018-01-15 2018-01-15 Knowledge graph-based distributed security event correlation analysis method

Publications (2)

Publication Number Publication Date
CN108270785A CN108270785A (en) 2018-07-10
CN108270785B true CN108270785B (en) 2020-06-30

Family

ID=62775447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810036765.9A Active CN108270785B (en) 2018-01-15 2018-01-15 Knowledge graph-based distributed security event correlation analysis method

Country Status (1)

Country Link
CN (1) CN108270785B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108933793B (en) * 2018-07-24 2020-09-29 中国人民解放军战略支援部队信息工程大学 Attack graph generation method and device based on knowledge graph
CN109005069B (en) * 2018-08-29 2021-07-09 中国人民解放军国防科技大学 Network security knowledge graph association analysis method based on heaven-earth integrated network
CN108881316B (en) * 2018-08-30 2020-12-22 中国人民解放军国防科技大学 Attack backtracking method under heaven and earth integrated information network
CN109347801B (en) * 2018-09-17 2021-03-16 武汉大学 Vulnerability exploitation risk assessment method based on multi-source word embedding and knowledge graph
CN109726293B (en) * 2018-11-14 2020-12-01 数据地平线(广州)科技有限公司 Causal event map construction method, system, device and storage medium
CN109639670B (en) * 2018-12-10 2021-04-16 北京威努特技术有限公司 Knowledge graph-based industrial control network security situation quantitative evaluation method
CN109413109B (en) * 2018-12-18 2021-03-05 中国人民解放军国防科技大学 Heaven and earth integrated network oriented security state analysis method based on finite-state machine
CN110162976B (en) * 2019-02-20 2023-04-18 腾讯科技(深圳)有限公司 Risk assessment method and device and terminal
CN109948911B (en) * 2019-02-27 2021-03-19 北京邮电大学 Evaluation method for calculating network product information security risk
CN111651591B (en) * 2019-03-04 2023-03-21 腾讯科技(深圳)有限公司 Network security analysis method and device
CN110378126B (en) * 2019-07-26 2021-03-26 北京中科微澜科技有限公司 Vulnerability detection method and system
CN110704837A (en) * 2019-09-25 2020-01-17 南京源堡科技研究院有限公司 Network security event statistical analysis method
CN110704743B (en) * 2019-09-30 2022-02-18 北京科技大学 Semantic search method and device based on knowledge graph
CN110807104B (en) * 2019-11-08 2023-04-14 上海明胜品智人工智能科技有限公司 Method and device for determining abnormal information, storage medium and electronic device
CN111010311B (en) * 2019-11-25 2022-07-08 江苏艾佳家居用品有限公司 Intelligent network fault diagnosis method based on knowledge graph
CN110933101B (en) * 2019-12-10 2022-11-04 腾讯科技(深圳)有限公司 Security event log processing method, device and storage medium
CN111611409B (en) * 2020-06-17 2023-06-02 中国人民解放军国防科技大学 Case analysis method integrated with scene knowledge and related equipment
CN111813953B (en) * 2020-06-23 2023-07-07 广州大学 Knowledge body-based distributed knowledge graph construction system and method
CN111797406A (en) * 2020-07-15 2020-10-20 智博云信息科技(广州)有限公司 Medical fund data analysis processing method and device and readable storage medium
CN111988339B (en) * 2020-09-07 2022-03-11 珠海市一知安全科技有限公司 Network attack path discovery, extraction and association method based on DIKW model
CN112291261A (en) * 2020-11-13 2021-01-29 福建奇点时空数字科技有限公司 Network security log audit analysis method driven by knowledge graph
CN112613038B (en) * 2020-11-27 2023-12-08 中山大学 Knowledge graph-based security vulnerability analysis method
CN113259364B (en) * 2021-05-27 2021-10-22 长扬科技(北京)有限公司 Network event correlation analysis method and device and computer equipment
CN113312499B (en) * 2021-06-15 2022-10-04 合肥工业大学 Power safety early warning method and system based on knowledge graph
CN113596025A (en) * 2021-07-28 2021-11-02 中国南方电网有限责任公司 Power grid security event management method
CN113507486B (en) * 2021-09-06 2021-11-19 中国人民解放军国防科技大学 Method and device for constructing knowledge graph of important infrastructure of internet
CN114039765A (en) * 2021-11-04 2022-02-11 全球能源互联网研究院有限公司 Safety management and control method and device for power distribution Internet of things and electronic equipment
CN114844707B (en) * 2022-05-07 2024-04-02 南京南瑞信息通信科技有限公司 Power grid network security analysis method and system based on graph database
CN116401898B (en) * 2023-06-08 2023-08-22 中国人民解放军国防科技大学 Simulation application combination system and method based on publishing subscription

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103124223A (en) * 2011-12-21 2013-05-29 中国科学院软件研究所 Method for automatically judging security situation of IT (information technology) system in real time
CN104539626A (en) * 2015-01-14 2015-04-22 中国人民解放军信息工程大学 Network attack scene generating method based on multi-source alarm logs
CN105119945A (en) * 2015-09-24 2015-12-02 西安未来国际信息股份有限公司 Log association analysis method for safety management center
CN105681303A (en) * 2016-01-15 2016-06-15 中国科学院计算机网络信息中心 Big data driven network security situation monitoring and visualization method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170293698A1 (en) * 2016-04-12 2017-10-12 International Business Machines Corporation Exploring a topic for discussion through controlled navigation of a knowledge graph

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CyGraph: Graph-Based Analytics and Visualization for Cybersecurity;Noel S et al;;《Handbook of Statistics》;20161231;第117-167页 *
KGBIAC:Knowledge Graph Based Intelligent Alert Correlation Framework;Wei Wang et al;;《International Symposium on Cyberspace Safety and Security》;20171231;第523-532页 *
基于网络安全态势感知的网络系统自防御体系;章学妙 等;《计算机应用与软件》;20171206;第159-165页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant