CN115296924A - Network attack prediction method and device based on knowledge graph - Google Patents

Info

Publication number
CN115296924A
Authority
CN
China
Prior art keywords
attack
data
knowledge
network
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211156094.2A
Other languages
Chinese (zh)
Other versions
CN115296924B (en
Inventor
饶志宏
刘方
徐锐
聂大成
陈剑锋
许卡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 30 Research Institute
Original Assignee
CETC 30 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 30 Research Institute
Priority to CN202211156094.2A
Publication of CN115296924A
Application granted
Publication of CN115296924B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic

Abstract

The invention discloses a network attack prediction method and device based on a knowledge graph, belonging to the field of network security. The method comprises the following steps: S101, acquiring data; S102, preprocessing the acquired data; S103, constructing a network security ontology oriented to network attacks; S104, extracting data according to the defined knowledge expression model; S105, fusing and correcting the extracted data to construct a network security knowledge graph; and S106, predicting attack events by using the constructed network security knowledge graph. The invention improves the prediction accuracy of attack behavior.

Description

Network attack prediction method and device based on knowledge graph
Technical Field
The invention relates to the field of network security, in particular to a network attack prediction method and device based on a knowledge graph.
Background
Networks now reach into every corner of daily life, and new attack strategies keep emerging and evolving. Malicious network intrusion has developed from single, simple operations in the early days (password cracking, file damage, web page tampering and the like) into complex, multi-pronged campaigns (vulnerability exploitation, virus propagation, domain name hijacking, denial of service, APT attacks and the like). A single-step attack is very unlikely to threaten the target on its own; most attackers carry out an action plan with a specific goal through a series of steps and coordinated, combined attacks. As a result, network security problems are increasingly serious, and networks are easy to attack and hard to defend. Network attack prediction is a key link in achieving active defense of network security. It studies how to use massive network security data to discover the behaviors and patterns of hacker intrusions, and to predict the multi-step attacks a network system may suffer in the future, the final target of the intrusion, and the facilities and equipment that may be threatened, so that effective and targeted defensive and preventive measures can be taken.
There are many methods for predicting network attacks. Classified by approach, the current mainstream methods are those based on neural networks, game theory, attack graphs, and data mining, along with other methods.
Prediction methods based on neural networks rely on artificial neural network algorithms. They are particularly strong at learning the nonlinear characteristics of network attack event sequences, offer good fitting, self-learning, and memory of target samples, and can capture the feature patterns of complex nonlinear data in intelligent attack events. Representative work includes Tiresias, BRNN-LSTM, and ALEAP. Because they are trained on large-scale samples, these methods mine the logical relationships and patterns among network attack events with high accuracy. However, they depend heavily on the quality of the data samples, take a long time and a high cost to train, easily fall into local minima, and are prone to overfitting, which weakens generalization.
Prediction methods based on game theory generally target adversarial environments with an attack-defense game. Different game models are established according to how completely the attacker and the defender know each other's information. Representative work includes the NashSVM algorithm, two-player zero-sum static games, stochastic prediction games, and dynamic Bayesian games. Because game-theoretic methods reason strategically about payoffs, they can understand the attacker's intention more deeply, including the attack target, the attack source, and the relationships among attack behaviors, and can describe the logical relationships between behaviors, so that the defender can play against the attacker and make more targeted decisions.
Prediction methods based on attack graphs build a model with a graph network structure, such as a directed attack graph, a Markov chain, or a Bayesian network graph. Representative work includes botnet dependency graphs, uncertainty-aware attack graphs, and two-layer attack-defense models that combine attack graphs with game theory. These algorithms generally take entities as the nodes and attack means as the edges of the graph network to represent the different relationships among entities. They perform better in small-scale data scenarios and require a certain amount of prior knowledge as a basis.
Compared with the three preceding kinds of methods, data mining is better at characterizing the hidden features and internal patterns deep in the data, but it is generally used as one technical means within a larger process. Representative work includes sentiment analysis, similarity-based sequence alignment, and recommendation system construction. Prediction methods based on data mining apply statistical analysis, rule association, classification, induction and the like to large amounts of prior knowledge, such as attack alarms and detection results, in order to mine the patterns among attack information and to classify and predict future attacks. They can also be combined with modeling approaches such as attack graphs and game theory, and they perform well in predicting phishing websites and social network attacks.
The existing network attack prediction methods have the following problems: (1) For some compound attacks, the individual attack behaviors may have no direct association with one another, or their behavioral features may be hard to extract, for example encrypted inbound and outbound router traffic or deep packet content. For such attacks, existing prediction methods cannot associate the attack events initiated by the same attacker, which leads to prediction errors. (2) The hidden features and internal patterns deep in the data often characterize the logical associations between attack behaviors and the complex intentions of attackers, but existing methods cannot reason over many of these hidden features and implicit relationships, so prediction accuracy is low. (3) The alarm information of an intrusion detection system, which is an important data source for network attack prediction, contains false alarms and missed alarms. Wrong alarm information leads to wrong attack path predictions, and because existing attack prediction methods have low fault tolerance, their prediction accuracy in practical applications is very low.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, provides a network attack prediction method and device based on a knowledge graph, and improves the prediction accuracy rate of attack behaviors and the like.
The purpose of the invention is realized by the following scheme:
A network attack prediction method based on a knowledge graph comprises the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data to construct a network security knowledge graph;
S106, predicting attack events by using the constructed network security knowledge graph.
Further, in step S101, the acquired data includes network asset detection data, vulnerability information data, threat intelligence data, and security device log data.
Further, in step S102, the preprocessing includes data normalization processing, data deduplication and merging processing, data classification processing, and data spatio-temporal registration processing.
Further, step S103 comprises the sub-step of defining a knowledge expression model and performing knowledge expression using triples.
Further, in step S104, various types of data triples are extracted according to the defined knowledge expression model.
Further, in step S105, the correcting includes aggregating and merging security events, correcting event credibility, correcting mutually exclusive events, removing false alarm events, and completing missed alarm events; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph, and a network asset graph; the step specifically comprises the following sub-steps:
1) Merging the data of the same equipment;
2) Merging the consistent event data;
3) Labeling events based on threat intelligence data, analyzing credibility, correcting mutually exclusive events, and removing false alarm events;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed alarm events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if one attack step of a related multi-step attack is found to be missing, judging whether the missing step is a necessary precondition of the next step of the multi-step attack; if yes, inferring that the security equipment failed to report the attack event, and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph, and the network asset graph, and using a basic database to store and access the graphs.
Further, the basic database is a Neo4J database.
Further, in step S106, the following sub-steps are included:
1) Representing the structured knowledge in the network security knowledge graph as an undirected graph $G = (V, E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relational edges between entities; each triple in the network security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes, and $r$ represents the relationship between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of vectorization, adding constraint rule conditions, converting the constraint conditions into query statements of the basic database to obtain candidate subgraphs, and performing similarity calculation on the candidate subgraphs: a similarity calculation algorithm measures the similarity between the attack event sequence detected by the security equipment and the attack pattern graph in the constructed knowledge graph, the hidden relationships and paths of the attack event sequence are mined, and the attack path and target are predicted; the constraint rule conditions comprise the vulnerability that must be exploited for the attack event sequence to occur, the asset attributes of the attack target, and the precondition that a given attack event can occur only after a certain earlier attack has been successfully executed;
3) Calculating, based on the DTW algorithm, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraint conditions;
4) Correcting the obtained attack pattern subgraph by depending on a domain expert;
5) And predicting the attack path and the attack target according to the obtained attack mode subgraph.
Further, in step 3), the method comprises the sub-steps of:
step (1): vectorizing the attack event sequence and the attack mode subgraph filtered by the constraint condition;
step (2): calculating a distance matrix between the vectorized attack event sequence and each attack mode sequence in the attack mode subgraph;
step (3): finding a path from the upper-left corner to the lower-right corner of the matrix whose sum of elements is minimal; the attack pattern sequence that yields this minimal path is the attack pattern subgraph matched with the attack event sequence.
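The three sub-steps above can be illustrated with a minimal sketch of dynamic time warping in plain Python. It is not the patented implementation: the Euclidean distance, the function names `dtw_distance` and `best_matching_pattern`, and the toy two-dimensional vectors are all assumptions chosen for illustration. The accumulated-cost table plays the role of the distance matrix, and its bottom-right cell holds the minimal path sum from the upper-left corner to the lower-right corner.

```python
import math


def euclidean(a, b):
    """Distance between two embedded events (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Minimal accumulated cost of aligning seq_a with seq_b."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = cheapest alignment of the first i and first j elements.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            # A warping path may advance in either sequence or in both.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def best_matching_pattern(event_seq, patterns):
    """Name of the candidate pattern with the smallest DTW cost."""
    return min(patterns, key=lambda name: dtw_distance(event_seq, patterns[name]))


# Hypothetical embedded sequences: the observed alarms skip a repeated step.
observed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = {
    "pattern_a": [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "pattern_b": [(5.0, 5.0), (6.0, 5.0)],
}
match = best_matching_pattern(observed, candidates)
```

Because a warping path may repeat an element of either sequence, sequences of different lengths can still align at low cost, which is why this kind of measure tolerates a missing or duplicated alarm.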
A network attack prediction device based on a knowledge graph comprises a program storage unit and a program running unit, and when a program in the program storage unit is loaded by the program running unit, the network attack prediction device based on the knowledge graph executes the network attack prediction method based on the knowledge graph.
The beneficial effects of the invention include:
(1) The invention collects network asset detection data, vulnerability data, open source threat intelligence data, security equipment log data and the like, and applies normalization, deduplication, cleaning, classification, and spatio-temporal matching to this multi-source heterogeneous data to form data in a standardized format. A network security ontology is constructed from knowledge of the network security field. Knowledge is then extracted from the network asset detection data, vulnerability data and the like in combination with the network security knowledge expression model; network security knowledge graphs such as an attack pattern graph, a threat intelligence graph, and a network asset graph are constructed; and network attacks are predicted based on the constructed knowledge graphs.
(2) The invention uses the network security domain ontology to extract knowledge from the network asset data, vulnerability data, threat intelligence data, and security equipment log data. The constructed network security ontology follows network security knowledge graph description languages used in China and abroad, which improves the extensibility and compatibility of the security knowledge graph.
(3) The embodiment of the invention uses the constructed network security knowledge graph and, based on the TransE translation model, embeds the heterogeneous network of the knowledge graph into a continuous low-dimensional vector space. Constraint conditions are introduced when calculating subgraph similarity to make the similarity calculation more efficient. Similarity is calculated with the Dynamic Time Warping (DTW) algorithm, which improves matching accuracy when the alarm event sequence contains false alarms and missed alarms, and thereby improves the accuracy of attack path and attack target prediction.
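As a brief illustration of the TransE idea mentioned in effect (3), the sketch below scores a triple by how closely the head embedding plus the relation embedding lands on the tail embedding. Training of the embeddings is not shown; the function names `transe_score` and `rank_tails` and the toy two-dimensional vectors are assumptions, not part of the patent.

```python
import math


def transe_score(h, r, t):
    """L2 norm of h + r - t; a lower score means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(h, r, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates, key=lambda name: transe_score(h, r, candidates[name]))


# Hypothetical embeddings for one head entity, one relation, two tails.
head = [1.0, 0.0]
relation = [0.0, 1.0]
tails = {"likely_tail": [1.0, 1.0], "unlikely_tail": [5.0, 5.0]}
ranking = rank_tails(head, relation, tails)
```

In a trained model the vectors would be learned so that true triples score low and corrupted ones score high; here they are fixed by hand only to show the scoring rule.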
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a method for predicting cyber-attack based on a knowledge-graph according to an embodiment of the present invention;
FIG. 2 is a flowchart of attack prediction with constraints according to an embodiment of the present invention;
FIG. 3 is an example of attack event prediction.
Detailed Description
All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
The invention provides a network attack prediction method and device based on a knowledge graph, aiming to solve the technical problems described in the background. The technical concept is as follows: network asset detection data, vulnerability data, open source threat intelligence data, security equipment log data and the like are collected, and the multi-source heterogeneous data is normalized, deduplicated, cleaned, classified, and matched in space and time. A network security ontology oriented to network attack behaviors is constructed from knowledge of the network security field. Based on the network security ontology expression model, network security knowledge graphs such as an attack pattern graph, a threat intelligence graph, and a network asset graph are constructed, and implicit target features are mined and implicit relationships are inferred from the constructed knowledge graphs to predict network attack behaviors. By constructing these knowledge graphs and using similar-target discovery and implicit relationship reasoning to mine the logical associations between attack behaviors and the attacker's intention, the method improves the accuracy of attack behavior prediction.
In order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described in detail below with reference to the accompanying drawings and technical solutions.
Fig. 1 is a flowchart of a method for predicting a cyber attack based on a knowledge graph according to an embodiment of the present invention, including the following steps:
Step S101: detecting the target network assets with a network port scanning tool to obtain network port scanning data, certificate data, DNS data, web site framework data and the like; acquiring vulnerability information from vulnerability databases, vulnerability forums, personal blogs, Twitter, GitHub and other information sources using web crawler technology; acquiring security event information, IOC information and the like from security company bulletins, hacker forums, security websites, third-party threat intelligence and the like using web crawler technology; and obtaining log data of security equipment, such as firewall logs, intrusion detection system logs, and sandbox logs, through cooperation arrangements.
Step S102: preprocessing the network asset detection data, the vulnerability data, the threat information data and the log data of the security equipment, wherein the processing flow is as follows:
1) Data normalization: homogenizing the multi-source heterogeneous data and unifying the field structures, then converting data formats, including data type conversion, date and time format conversion, Chinese character encoding conversion, and conversion from codes to names;
2) Data deduplication and merging: comparing the cleaned data with the data in the database on the key fields to judge whether redundant data exists. If no redundant data exists, the data is stored directly in the database. If redundant data exists, it is judged whether every field of the new data is identical to the existing data; if so, the new data is discarded. If not, it is judged whether the fields with different values conflict. If there is no conflict, the new data is merged with the existing data; if the fields of the new data and the existing data conflict, the data is merged after the conflict is resolved.
3) Data classification: classifying the multi-source data with a decision tree, dividing the data layer by layer into multiple levels of categories.
4) Data spatio-temporal registration: associating and matching the basic data with time and space coordinates (organization and geographical location information), and attaching a spatio-temporal coordinate label to each piece of basic data.
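The deduplication and merging flow in item 2) can be sketched as follows. This is a minimal illustration under stated assumptions: records are plain dictionaries, the database is an in-memory dictionary keyed by the key fields, a "conflict" is assumed to mean two different non-empty values for the same field, and the default conflict policy (keep the existing value) is hypothetical. The name `merge_record` is not from the source.

```python
EMPTY = (None, "")


def merge_record(store, new, key_fields, resolve=lambda field, old, new_value: old):
    """Insert, discard, or merge one record; returns the action taken."""
    key = tuple(new[f] for f in key_fields)
    existing = store.get(key)
    if existing is None:
        store[key] = dict(new)          # no redundant data: store directly
        return "stored"
    if existing == new:
        return "discarded"              # every field identical: drop new data
    merged, conflict = dict(existing), False
    for field, value in new.items():
        old = merged.get(field)
        if value in EMPTY or value == old:
            continue                    # nothing new in this field
        if old in EMPTY:
            merged[field] = value       # fills a gap, so no conflict
        else:
            conflict = True             # two different non-empty values
            merged[field] = resolve(field, old, value)
    store[key] = merged
    return "merged_after_conflict" if conflict else "merged"


# Hypothetical asset records keyed on (ip, port).
db = {}
keys = ("ip", "port")
r1 = merge_record(db, {"ip": "10.0.0.5", "port": 80, "os": None}, keys)
r2 = merge_record(db, {"ip": "10.0.0.5", "port": 80, "os": None}, keys)
r3 = merge_record(db, {"ip": "10.0.0.5", "port": 80, "os": "Linux"}, keys)
r4 = merge_record(db, {"ip": "10.0.0.5", "port": 80, "os": "Windows"}, keys,
                  resolve=lambda field, old, new_value: new_value)
```

Passing the conflict policy as a function keeps the merge logic separate from the decision about which source to trust, which the text leaves open.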
Step S103: for knowledge representation, embodiments of the present invention use triples (entity, relationship, attribute). The ontology model of the network asset data follows CybOX 2.0: the entities of the network asset data include IP, port, protocol, equipment, operating system, certificate, domain name, AS number and the like, and the relationships include the has relationship, the belong_to relationship, the ower relationship and the like. The naming rules for entities and relationships follow those of CybOX 2.0, which facilitates compatibility with network security knowledge graph description languages used in China and abroad. The triple definitions for vulnerability data, threat intelligence data, and security equipment log data follow the STIX standard. For vulnerability data, operating system, hardware equipment, software, protocol, vulnerability, exploit code and the like are defined as entity labels, with the has relationship, the cause relationship and the like. For threat intelligence data, entities such as equipment, software, protocol, vulnerability, attack event, and attack tool are defined, with the has relationship, the cause relationship, the belong_to relationship and the like. For security equipment log data, entities such as IP, port, and event are defined, with the has relationship, the cause relationship, the belong_to relationship and the like.
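To make the triple representation concrete, the following sketch stores a few head-relation-tail triples and looks up an entity's neighbors. The relation names `has`, `belong_to`, and `cause` come from the text; the `Triple` type, the `neighbors` helper, and every entity identifier are hypothetical placeholders rather than real assets or vulnerability records.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Hypothetical triples mixing asset and vulnerability knowledge.
triples = [
    Triple("ip:10.0.0.5", "has", "port:445"),
    Triple("port:445", "has", "protocol:SMB"),
    Triple("ip:10.0.0.5", "belong_to", "device:fileserver-01"),
    Triple("vuln:EXAMPLE-0001", "cause", "event:remote-code-execution"),
]


def neighbors(all_triples, head, relation=None):
    """Tails linked from `head`, optionally restricted to one relation."""
    return [
        t.tail
        for t in all_triples
        if t.head == head and (relation is None or t.relation == relation)
    ]


ports = neighbors(triples, "ip:10.0.0.5", "has")
```

A graph database stores the same information as nodes and typed edges; the flat list here is only meant to show the shape of the data.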
Step S104: extracting various data triples according to a defined knowledge expression model, and the specific steps are as follows:
1) Network asset data is structured data, so its knowledge is extracted directly with a D2R tool according to the knowledge expression model defined in step S103.
2) Vulnerability information obtained by crawling web pages is semi-structured data, so entities, relationships, and attributes are extracted with a rule-based entity recognition algorithm; vulnerability information obtained from a vulnerability database is structured data, so it can be extracted directly according to the knowledge expression model defined in step S103.
3) For threat intelligence data, knowledge is extracted from the web page data acquired by the web crawler using rule-based entity recognition, while structured threat intelligence data from third parties can be extracted directly according to the knowledge expression model defined in step S103.
4) For security equipment log data, an attack pattern library is established with expert knowledge by reference to the attack mechanism pattern classification, malicious code pattern classification, and hidden danger pattern systems of the mainstream foreign frameworks CAPEC and ATT&CK, and the security equipment log data is classified based on the attack pattern library. Because the attack pattern library is structured data, it can be extracted directly according to the knowledge expression model defined in step S103.
Step S105: performing fusion and correction on the extracted data, including aggregating and merging security events, correcting event credibility, correcting mutually exclusive events, removing false alarm events, completing missed alarm events and the like, and constructing network security knowledge graphs such as an attack pattern graph, a threat intelligence graph, and a network asset graph. The specific steps are as follows:
1) Merging the data of the same equipment;
2) Merging event data whose source IP, source port, destination IP, and destination port are the same;
3) Labeling events based on threat intelligence data, analyzing credibility, correcting mutually exclusive events, and removing false alarm events. For example, if threat intelligence shows that an attack event exploits a vulnerability of the Windows operating system, while asset detection data shows that the attacked target runs a Linux operating system, it is inferred that the attack cannot take effect on that asset, so the attack event can be marked as a false alarm;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed alarm events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if one attack step of a related multi-step attack is found to be missing, judging whether the missing step is a necessary precondition of the next step of the multi-step attack; if yes, inferring that the security equipment failed to report the attack event, and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph, and the network asset graph, and using Neo4J as the basic database to store and access the graphs.
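Sub-steps 3) and 4) above can be sketched as two small checks. Both are illustrative only: the lookup tables, the field names `dst_ip` and `vuln_id`, the step names, and the functions `is_false_alarm` and `complete_missed_steps` are assumptions, and a real attack pattern library would carry far richer descriptions than an ordered list of step names.

```python
def is_false_alarm(event, asset_os_by_ip, vuln_os_by_id):
    """True when the exploited vulnerability targets a different OS than the asset runs."""
    target_os = asset_os_by_ip.get(event["dst_ip"])
    required_os = vuln_os_by_id.get(event["vuln_id"])
    # Without both facts no exclusion can be inferred.
    return target_os is not None and required_os is not None and target_os != required_os


def complete_missed_steps(observed, chain, prerequisites):
    """Walk the attack chain and fill in unreported steps that the next step requires."""
    seen = set(observed)
    completed = []
    for i, step in enumerate(chain):
        if step in seen:
            completed.append((step, "observed"))
        elif i + 1 < len(chain):
            nxt = chain[i + 1]
            # Infer a missed alarm only if the next step was seen and needs this one.
            if nxt in seen and step in prerequisites.get(nxt, ()):
                completed.append((step, "inferred"))
    return completed


# Hypothetical data for both checks.
alarm = {"dst_ip": "10.0.0.5", "vuln_id": "V1"}
false_alarm = is_false_alarm(alarm, {"10.0.0.5": "Linux"}, {"V1": "Windows"})

chain = ["scan", "exploit", "implant", "c2"]
filled = complete_missed_steps(["scan", "implant", "c2"], chain, {"implant": {"exploit"}})
```

Marking completed steps as "inferred" rather than "observed" keeps the provenance visible, which matters when a domain expert later reviews the matched subgraph.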
Step S106: using the constructed network security knowledge graph, network anomalies and attacks can be discovered effectively, the hidden relationships and paths of security threats can be mined, and attacks can be predicted. As shown in FIG. 2, the specific steps are as follows:
1) The structured knowledge in the network security knowledge graph can be represented as an undirected graph $G = (V, E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relational edges between entities. Each triple in the security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes, and $r$ represents the relationship between the two entity nodes. The embodiment of the invention uses a TransE-based translation model to embed the heterogeneous network of the knowledge graph into a continuous low-dimensional vector space to form low-dimensional vectors.
2) On the basis of vectorization, the similarity of an attack event sequence detected by security equipment and an attack mode map in a constructed knowledge map is measured by using a similarity calculation algorithm, the hidden relation and the path of the attack event sequence are excavated, and the attack path and the target are predicted. Aiming at the problem that a constraint relation exists between attack event sequences and a certain constraint relation also exists between attack events and an attack target environment, if the distance between each node is measured directly according to the attribute and the relation of the node and a sub-graph structure, a plurality of useless attack modes can be obtained, and the similarity calculation complexity is increased along with the increase of the graph scale. For example, for a multi-step attack, a general attack flow includes target reconnaissance, tool making, tool delivery, attack penetration, installation implantation, command control and malicious activities, and when the security device detects the first several attack stages, it is possible to associate a plurality of attack modes from an attack mode map through similarity calculation, and it is difficult to determine the true attack intention of an attacker. Therefore, it is necessary to add corresponding constraint rules before calculating the similarity, such as a vulnerability to be utilized when an attack event sequence occurs, an asset attribute of an attack target, and a constraint condition that one of the attack events occurs must be converted into a Neo4J query statement Cypher after a certain attack is successfully executed, so as to obtain a candidate subgraph, and then perform similarity calculation on the candidate subgraph.
3) An attack event sequence does not necessarily execute every attack step, multiple attack means may be used in each attack stage, and the intrusion detection system may produce false alarms or missed reports. For these reasons, the similarity of attack event sequences is calculated with the DTW algorithm, which uses a warping function W(n) to describe the time correspondence between the test template and the reference template and solves for the warping function that yields the minimum accumulated distance when the two templates are matched. Through creative work, the inventors calculate, based on the DTW algorithm, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraint conditions. The specific implementation steps are as follows:
step (1): vectorizing the attack event sequence and the attack mode subgraph filtered by the constraint condition;
step (2): calculating a distance matrix between the vectorized attack event sequence and each attack mode sequence in the attack mode subgraph;
step (3): finding the path from the upper-left corner to the lower-right corner of the matrix whose elements have the minimum sum; this path identifies the attack pattern subgraph matched with the attack event sequence.
4) Correcting the obtained attack pattern subgraph by depending on a domain expert;
5) And predicting the attack path and the attack target according to the obtained attack mode subgraph.
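The translation idea behind the TransE embedding in step 1) can be illustrated with a minimal sketch. It assumes hand-picked two-dimensional vectors and hypothetical entity and relation names (a trained model would learn the vectors; none of these names come from the patent). TransE scores a triple (h, r, t) by the distance between h + r and t, so a plausible triple should score lower than a corrupted one.

```python
import math

def transe_score(h, r, t):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hypothetical embeddings chosen by hand for illustration only.
entities = {
    "attack:sql_injection": [0.1, 0.2],
    "vuln:VULN-A": [0.6, 0.9],
    "asset:web_server": [0.9, 0.1],
}
relations = {"exploits": [0.5, 0.7]}

# A triple that should hold: (sql_injection, exploits, VULN-A).
true_score = transe_score(
    entities["attack:sql_injection"], relations["exploits"], entities["vuln:VULN-A"]
)
# The same head and relation with a wrong tail entity.
corrupt_score = transe_score(
    entities["attack:sql_injection"], relations["exploits"], entities["asset:web_server"]
)
```

With vectors of this kind, attack events and attack patterns can be compared by ordinary vector distances in the later steps.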
Fig. 3 shows an example of predicting an attack path and an attack target from the acquired attack pattern subgraph. An attack alarm sequence aimed at a certain asset is received from security equipment, and attack pattern matching is performed with the knowledge graph to find a similar attack pattern subgraph. Based on the vulnerabilities and asset attributes associated with the attack pattern, the asset is found to have the same vulnerabilities and asset attributes; if the attack spreads, the corresponding attack path is recorded in combination with the time sequence. An early warning can therefore be issued so that the asset starts its defenses in advance and further attack is prevented. In addition, if other assets directly connected to the asset have the same vulnerabilities and asset attributes, they can also be warned in advance to keep the attack from spreading.
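The early-warning idea for directly connected assets can be sketched as follows. This is a simplified illustration, not the patented implementation: it checks only shared vulnerabilities (asset attributes are omitted), and the asset names, vulnerability identifiers, and the function `assets_to_warn` are all hypothetical.

```python
def assets_to_warn(attacked, neighbors, vulns, pattern_vulns):
    """Return directly connected assets that expose a vulnerability
    used by the matched attack pattern."""
    return sorted(
        n for n in neighbors.get(attacked, ())
        if vulns.get(n, set()) & pattern_vulns
    )

# Hypothetical network asset graph: adjacency list and per-asset vulnerabilities.
neighbors = {"web-01": ["db-01", "app-01"]}
vulns = {
    "web-01": {"VULN-A"},
    "db-01": {"VULN-A", "VULN-B"},
    "app-01": {"VULN-C"},
}

# The matched attack pattern is assumed to exploit VULN-A.
warned = assets_to_warn("web-01", neighbors, vulns, {"VULN-A"})
```

Here only `db-01` shares a vulnerability with the matched pattern, so it is the only neighbor that would receive an early warning.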
Example 1
A network attack prediction method based on a knowledge graph comprises the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data of various types to construct a network security knowledge graph;
S106, predicting attack events by using the constructed network security knowledge graph.
Example 2
On the basis of embodiment 1, in step S101, the acquired data includes network asset detection data, vulnerability information data, threat intelligence data, and security device log data.
Example 3
On the basis of embodiment 1, in step S102, the preprocessing includes a data normalization process, a data deduplication and merging process, a data classification process, and a data spatiotemporal registration process.
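Two of the preprocessing operations, normalization and deduplication, can be sketched in a few lines. The field names (`src_ip`, `source`, `alert`, and so on) and the common schema are assumptions made for illustration; classification and spatiotemporal registration are not shown.

```python
def normalize(record):
    """Map a raw log record onto a common schema (field names are assumptions)."""
    return {
        "src": record.get("src_ip") or record.get("source"),
        "dst": record.get("dst_ip") or record.get("destination"),
        "event": (record.get("event") or record.get("alert") or "").strip().lower(),
        "ts": int(record["ts"]),
    }

def deduplicate(records):
    """Normalize records and keep the first occurrence of each identical event."""
    seen, out = set(), []
    for rec in map(normalize, records):
        key = (rec["src"], rec["dst"], rec["event"], rec["ts"])
        if key not in seen:
            seen.add(key)
            out.append(rec)
    return out

# Two devices report the same scan with different field names and formatting.
raw = [
    {"src_ip": "192.0.2.10", "dst_ip": "192.0.2.20", "event": "Port Scan ", "ts": "1000"},
    {"source": "192.0.2.10", "destination": "192.0.2.20", "alert": "port scan", "ts": 1000},
    {"src_ip": "192.0.2.10", "dst_ip": "192.0.2.20", "event": "SQL Injection", "ts": 1060},
]
clean = deduplicate(raw)
```

After normalization the first two records become identical, so only two distinct events remain.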
Example 4
On the basis of embodiment 1, step S103 includes the sub-step of defining a knowledge expression model and expressing knowledge as triples.
Example 5
Based on embodiment 4, in step S104, various types of data triples are extracted according to the defined knowledge expression model.
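As a rough sketch of triple extraction, the function below turns one structured vulnerability record into (head, relation, tail) triples. The record layout and the relation names `affects` and `exploits` are hypothetical and stand in for whatever the defined knowledge expression model specifies.

```python
def extract_triples(vuln_record):
    """Turn one vulnerability record into (head, relation, tail) triples.

    The keys "id", "affects", and "exploited_by" are illustrative assumptions.
    """
    triples = []
    vid = vuln_record["id"]
    for product in vuln_record.get("affects", []):
        triples.append((vid, "affects", product))
    for pattern in vuln_record.get("exploited_by", []):
        triples.append((pattern, "exploits", vid))
    return triples

record = {"id": "VULN-A", "affects": ["product-x"], "exploited_by": ["pattern-sqli"]}
triples = extract_triples(record)
```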
Example 6
On the basis of embodiment 1, in step S105, the correction includes aggregation and merging of security events, event credibility correction, mutually exclusive event correction, false alarm event removal, and missed-report event completion; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph, and a network asset graph; the step specifically comprises the following substeps:
1) Merging the data of the same equipment;
2) Merging consistent event data;
3) Tagging events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and eliminating false alarm events;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed-report events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if an attack step of a related multi-step attack is found to be missing, judging whether the missing step is a necessary step for the next attack step of the multi-step attack; if so, inferring that the security equipment failed to report the attack event and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph, and the network asset graph, and using the base database to store and access the graphs.
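The missed-report completion rule in substep 4) can be sketched as below. This is one possible reading under stated assumptions: the attack chain is an ordered list of step names, and `prerequisites` (a hypothetical structure) records which steps are necessary for a later step. The chain is scanned backwards so that several consecutive missing steps can be inferred.

```python
def complete_missed_steps(chain, prerequisites, observed):
    """Infer attack steps that security equipment failed to report.

    chain: ordered step names from the attack pattern library.
    prerequisites: maps a step to the set of steps it necessarily requires.
    observed: set of steps actually reported.
    Returns observed plus inferred steps, in chain order.
    """
    completed = set(observed)
    # Walk backwards so an inferred step can in turn justify its predecessor.
    for i in range(len(chain) - 2, -1, -1):
        step, nxt = chain[i], chain[i + 1]
        if (step not in completed and nxt in completed
                and step in prerequisites.get(nxt, set())):
            completed.add(step)
    return [s for s in chain if s in completed]

chain = ["target reconnaissance", "tool delivery", "attack penetration", "installation"]
prerequisites = {"attack penetration": {"tool delivery"}}
observed = {"target reconnaissance", "attack penetration"}
completed = complete_missed_steps(chain, prerequisites, observed)
```

In this example "tool delivery" was never reported but is necessary for the observed "attack penetration", so it is added; "installation" is not inferred because nothing observed depends on it.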
Example 7
On the basis of example 6, the base database is the Neo4J database.
Example 8
On the basis of embodiment 1, in step S106, the following sub-steps are included:
1) Representing the structured knowledge in the network security knowledge graph as an undirected graph G = (V, E), where V represents the set of entity nodes in the graph and E represents the set of relation edges between entities; each triple in the network security knowledge graph is represented as (h, r, t), where h and t respectively represent the linked head and tail entity nodes, and r represents the relationship between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of vectorization, adding constraint rule conditions and converting them into query statements of the base database to obtain candidate subgraphs; performing similarity calculation on the candidate subgraphs, that is, using a similarity calculation algorithm to measure the similarity between the attack event sequence detected by security equipment and the attack pattern graph in the constructed knowledge graph, mining the hidden relations and paths of the attack event sequence, and predicting the attack path and target; the constraint rule conditions comprise the vulnerability to be exploited when the attack event sequence occurs, the asset attributes of the attack target, and the precondition that an attack event can occur only after a certain attack has been successfully executed;
3) Calculating the shortest path of the vectorized attack event sequence and the attack mode subgraph filtered by the constraint condition based on a DTW algorithm;
4) Correcting the obtained attack pattern subgraph by depending on a domain expert;
5) And predicting the attack path and the attack target according to the obtained attack mode subgraph.
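The conversion of constraint rule conditions into a base database query in substep 2) might look like the sketch below, which builds a parameterized Cypher string for Neo4J. The node labels (`AttackPattern`, `Vulnerability`, `Asset`), relationship types, and property names are assumptions made for illustration; the patent does not specify a schema, and the code does not connect to a database.

```python
def build_candidate_query():
    """Return a parameterized Cypher query that selects candidate attack patterns.

    All labels, relationship types, and property names are illustrative assumptions.
    """
    return (
        "MATCH (p:AttackPattern)-[:EXPLOITS]->(v:Vulnerability), "
        "(p)-[:TARGETS]->(a:Asset) "
        "WHERE v.vuln_id IN $vuln_ids "            # vulnerability to be exploited
        "AND a.os = $asset_os "                    # asset attribute of the target
        "AND (p.requires IS NULL OR p.requires IN $completed_attacks) "  # precondition
        "RETURN DISTINCT p.name AS pattern"
    )

query = build_candidate_query()
params = {
    "vuln_ids": ["VULN-A", "VULN-B"],
    "asset_os": "linux",
    "completed_attacks": ["target reconnaissance"],
}
```

The query string and its parameters would then be handed to whatever Neo4J client is in use, and the returned patterns form the candidate subgraph for the similarity calculation.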
Example 9
On the basis of embodiment 8, in step 3), the method comprises the sub-steps of:
step (1): vectorizing the attack event sequence and the attack mode subgraph filtered by the constraint condition;
step (2): calculating a distance matrix between the vectorized attack event sequence and each attack mode sequence in the attack mode subgraph;
step (3): finding the path from the upper-left corner to the lower-right corner of the matrix whose elements have the minimum sum; this path identifies the attack pattern subgraph matched with the attack event sequence.
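Steps (1) to (3) can be sketched with a textbook dynamic time warping routine. The sketch assumes that events and pattern steps have already been vectorized (the two-dimensional vectors and pattern names below are made up), treats each attack pattern as a sequence of vectors, and uses Euclidean distance; the patent does not state which distance is used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Minimum accumulated distance over all warping paths from the
    upper-left to the lower-right corner of the pairwise distance matrix."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = euclidean(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

def best_matching_pattern(events, patterns):
    """Name of the candidate pattern with the smallest DTW distance."""
    return min(patterns, key=lambda name: dtw_distance(events, patterns[name]))

# Hypothetical vectorized data: the observed sequence skips one repeated step.
events = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
patterns = {
    "pattern_a": [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]],
    "pattern_b": [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]],
}
best = best_matching_pattern(events, patterns)
```

Because DTW allows one observed event to align with several pattern steps, sequences of different lengths can still be compared, which is what makes it tolerant of skipped steps and missed reports.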
Example 10
A device for predicting a network attack based on a knowledge graph, comprising a program storage unit and a program execution unit, wherein the method for predicting a network attack based on a knowledge graph according to any one of embodiments 1 to 9 is performed when a program in the program storage unit is loaded by the program execution unit.
The units described in the embodiments of the present invention may be implemented in software or in hardware, and the described units may also be disposed in a processor. The names of the units do not, in some cases, limit the units themselves.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the method provided in the above-mentioned various alternative implementation modes.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may be separate and not incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.
The parts not involved in the present invention are the same as or can be implemented using the prior art.
The above-described embodiments are intended to be illustrative only. Those skilled in the art may readily make various modifications and variations to the above embodiments based on the teachings of the present invention without departing from its spirit and scope.
Other embodiments than the above examples may be devised by those skilled in the art based on the foregoing disclosure, or by adapting and using knowledge or techniques of the relevant art, and features of various embodiments may be interchanged or substituted and such modifications and variations that may be made by those skilled in the art without departing from the spirit and scope of the present invention are intended to be within the scope of the following claims.

Claims (10)

1. A network attack prediction method based on a knowledge graph is characterized by comprising the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data of various types to construct a network security knowledge graph;
S106, predicting attack events by using the constructed network security knowledge graph.
2. The method for predicting cyber attack based on the knowledge-graph according to claim 1, wherein in the step S101, the acquired data includes cyber asset detection data, vulnerability information data, threat intelligence data and security equipment log data.
3. The method of predicting cyber attacks according to claim 1, wherein in step S102, the preprocessing comprises a data normalization process, a data deduplication and merging process, a data classification process and a data spatiotemporal registration process.
4. The knowledge-graph-based network attack prediction method according to claim 1, wherein step S103 comprises the sub-step of: defining a knowledge expression model and expressing knowledge as triples.
5. The method of predicting cyber-attack based on knowledge-graph according to claim 4, wherein in step S104, each type of data triple is extracted according to a defined knowledge expression model.
6. The method according to claim 1, wherein in step S105, the correction includes aggregation and merging of security events, event credibility correction, mutually exclusive event correction, false alarm event removal, and missed-report event completion; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph, and a network asset graph; and the step specifically comprises the following substeps:
1) Merging the data of the same equipment;
2) Merging consistent event data;
3) Tagging events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and eliminating false alarm events;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed-report events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if an attack step of a related multi-step attack is found to be missing, judging whether the missing step is a necessary step for the next attack step of the multi-step attack; if so, inferring that the security equipment failed to report the attack event and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph, and the network asset graph, and using the base database to store and access the graphs.
7. The knowledge-graph-based cyber-attack prediction method according to claim 6, wherein the base database is a Neo4J database.
8. The knowledge-graph-based network attack prediction method according to claim 1, comprising the following sub-steps in step S106:
1) Representing the structured knowledge in the network security knowledge graph as an undirected graph G = (V, E), where V represents the set of entity nodes in the graph and E represents the set of relation edges between entities; each triple in the network security knowledge graph is represented as (h, r, t), where h and t respectively represent the linked head and tail entity nodes, and r represents the relationship between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of vectorization, adding constraint rule conditions and converting them into query statements of the base database to obtain candidate subgraphs; performing similarity calculation on the candidate subgraphs, that is, using a similarity calculation algorithm to measure the similarity between the attack event sequence detected by security equipment and the attack pattern graph in the constructed knowledge graph, mining the hidden relations and paths of the attack event sequence, and predicting the attack path and target; the constraint rule conditions comprise the vulnerability to be exploited when the attack event sequence occurs, the asset attributes of the attack target, and the precondition that an attack event can occur only after a certain attack has been successfully executed;
3) Calculating the shortest path of the vectorized attack event sequence and the attack mode subgraph filtered by the constraint condition based on a DTW algorithm;
4) Correcting the obtained attack mode subgraph by depending on a domain expert;
5) And predicting the attack path and the attack target according to the obtained attack mode subgraph.
9. The knowledge-graph-based cyber-attack prediction method according to claim 8, comprising, in the step 3), the sub-steps of:
step (1): vectorizing the attack event sequence and the attack mode subgraph filtered by the constraint condition;
step (2): calculating a distance matrix between the vectorized attack event sequence and each attack mode sequence in the attack mode subgraph;
step (3): finding the path from the upper-left corner to the lower-right corner of the matrix whose elements have the minimum sum; this path identifies the attack pattern subgraph matched with the attack event sequence.
10. A network attack prediction device based on a knowledge graph is characterized by comprising a program storage unit and a program running unit, wherein the network attack prediction method based on the knowledge graph according to any one of claims 1 to 9 is executed when a program in the program storage unit is loaded by the program running unit.
CN202211156094.2A 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph Active CN115296924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211156094.2A CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211156094.2A CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Publications (2)

Publication Number Publication Date
CN115296924A true CN115296924A (en) 2022-11-04
CN115296924B CN115296924B (en) 2023-01-31

Family

ID=83834393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211156094.2A Active CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Country Status (1)

Country Link
CN (1) CN115296924B (en)

Also Published As

Publication number Publication date
CN115296924B (en) 2023-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant