CN115296924B - Network attack prediction method and device based on knowledge graph - Google Patents

Network attack prediction method and device based on knowledge graph

Info

Publication number
CN115296924B
Authority
CN
China
Prior art keywords
attack
data
knowledge
network
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211156094.2A
Other languages
Chinese (zh)
Other versions
CN115296924A (en)
Inventor
饶志宏
刘方
徐锐
聂大成
陈剑锋
许卡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 30 Research Institute
Original Assignee
CETC 30 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 30 Research Institute
Priority to CN202211156094.2A
Publication of CN115296924A
Application granted
Publication of CN115296924B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network attack prediction method and device based on a knowledge graph, belonging to the field of network security. The method comprises the following steps: S101, acquiring data; S102, preprocessing the acquired data; S103, constructing a network security ontology oriented to network attacks; S104, extracting data according to the defined knowledge expression model; S105, fusing and correcting the extracted data to construct a network security knowledge graph; and S106, predicting attack events by using the constructed network security knowledge graph. The invention improves the prediction accuracy of attack behavior.

Description

Network attack prediction method and device based on knowledge graph
Technical Field
The invention relates to the field of network security, in particular to a network attack prediction method and device based on a knowledge graph.
Background
Networks now reach into every aspect of daily life, and new attack strategies keep emerging and evolving. Malicious network intrusion has developed from simple single operations in the early days (password cracking, file damage, web page tampering, and the like) into complex, multi-pronged campaigns (vulnerability exploitation, virus propagation, domain name hijacking, denial of service, APT attacks, and the like). A single-step attack is very unlikely to threaten its target; most attackers carry out an action plan with a specific goal through a series of steps and coordinated, combined attacks. As a result, network security problems are increasingly serious, and networks are easy to attack and hard to defend. Network attack prediction is a key link in achieving active defense of network security. It studies how to use massive network security data to discover the behavior and patterns of hacker intrusion, and to predict the multi-step attacks a network system may suffer in the future, the final target of the intrusion, and the facilities and equipment that may be threatened, so that effective and targeted defensive and preventive measures can be taken.
There are many methods for predicting network attacks. Classified by approach, the current mainstream methods are those based on neural networks, on game theory, on attack graphs, and on data mining, along with other methods.
Prediction methods based on neural networks rely on artificial neural network algorithms. They are particularly strong at learning the nonlinear characteristics of network attack event sequences, fit target samples well, and are self-learning and self-memorizing, so they can capture the feature patterns of complex nonlinear data in intelligent attack events. Typical work includes Tiresias, BRNN-LSTM, and ALEAP. Because they are trained on large-scale samples, these methods mine the logical relations and rules among network attack events with high accuracy. However, they depend strongly on the quality of the data samples, are slow and costly to train, easily fall into local minima, and are prone to overfitting, which results in poor generalization.
Prediction methods based on game theory generally target adversarial environments with an attack-defense game. Different game models are established according to how complete the information is that the attacker and the defender hold about each other; representative work includes the NashSVM algorithm, two-player zero-sum static games, stochastic prediction games, and dynamic Bayesian games. Because these methods consider payoff-based strategic reasoning, they can understand the attacker's intent more deeply, including the attack target, the attack source, and the relations among attack behaviors, and they describe the logical relationships among behaviors, so that the defender can play against the attacker and make more targeted decisions.
Prediction methods based on attack graphs build the model on a graph network structure, such as a directed attack graph, a Markov chain, or a Bayesian network graph. Representative work includes botnet dependency graphs, uncertainty-aware attack graphs, and two-layer attack-defense models that combine attack graphs with game theory. These algorithms usually take entities as the nodes and attack means as the edges of the graph network to represent the different relations among entities. They perform better on small-scale data and require a certain amount of prior knowledge as a basis.
Compared with the three kinds of methods above, data mining has a stronger ability to characterize the hidden features and internal patterns deep in the data, but it is generally used as a technical means within a larger process. Representative work includes sentiment analysis, similarity-based sequence alignment, and recommender system construction. Prediction methods based on data mining mine the rules among attack information by performing statistical analysis, rule association, classification, induction, and the like on large amounts of prior knowledge such as attack alarms and detection results, and then classify and predict future attacks. They can also be combined with modeling approaches such as attack graphs and game theory, and they perform well in predicting phishing websites and social network attacks.
Existing network attack prediction methods have the following problems: (1) In some compound attacks, the individual attack behaviors may have no direct association with one another, or their behavioral features may be hard to extract, for example from encrypted router ingress and egress traffic or from deep packet data. For such attacks, existing prediction methods cannot associate the attack events initiated by the same attacker, which causes prediction errors. (2) The hidden features and internal patterns deep in the data generally represent the logical association between attack behaviors and the complex intentions of attackers. Existing methods cannot reason over these hidden features and implicit relations, so prediction accuracy is low. (3) The alarm information of intrusion detection systems contains false alarms and missed alarms. Alarm information is an important data source for network attack prediction, so erroneous alarms lead to wrong attack path predictions. Existing attack prediction methods have low fault tolerance, and their prediction accuracy in practical applications is very low.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a network attack prediction method and device based on a knowledge graph, so that the prediction accuracy rate of attack behaviors is improved.
The purpose of the invention is realized by the following scheme:
A network attack prediction method based on a knowledge graph includes the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data to construct a network security knowledge graph;
S106, predicting attack events by using the constructed network security knowledge graph.
Further, in step S101, the acquired data includes network asset detection data, vulnerability information data, threat intelligence data, and security device log data.
Further, in step S102, the preprocessing includes data normalization processing, data deduplication and merging processing, data classification processing, and data spatio-temporal registration processing.
Further, step S103 includes the sub-step of defining a knowledge expression model and performing knowledge expression by using triples.
Further, in step S104, various types of data triples are extracted according to the defined knowledge expression model.
Further, in step S105, the correcting includes merging and aggregating security events, correcting event credibility, correcting mutually exclusive events, removing false alarm events, and completing missed-alarm events; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph and a network asset graph; step S105 specifically comprises the following sub-steps:
1) Merging the data of the same device;
2) Merging consistent event data;
3) Tagging events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and removing false alarm events;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed-alarm events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if an attack step is found to be missing from an associated multi-step attack, judging whether the missing step is a necessary prerequisite of the next step in the multi-step attack; if so, inferring that the security device failed to report the attack event and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph and the network asset graph, and using a basic database to access and store the graphs.
Further, the basic database is a Neo4j database.
Further, in step S106, the following sub-steps are included:
1) Representing the structured knowledge in the network security knowledge graph as an undirected graph $G=(V,E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relation edges between entities; each triple in the network security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes, and $r$ represents the relation between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of the vectorization, adding constraint rules and converting them into query statements of the basic database to obtain candidate subgraphs; performing similarity calculation on the candidate subgraphs, that is, using a similarity calculation algorithm to measure the similarity between the attack event sequence detected by the security devices and the attack pattern graph in the constructed knowledge graph; mining the hidden relations and paths of the attack event sequence; and predicting the attack path and target; the constraint rules include the vulnerability that must be exploited for the attack event sequence to occur, the asset attributes of the attack target, and the precondition that an attack event can occur only after a certain earlier attack has been executed successfully;
3) Calculating, based on the DTW algorithm, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraints;
4) Having domain experts correct the obtained attack pattern subgraph;
5) Predicting the attack path and the attack target according to the obtained attack pattern subgraph.
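Sub-step 2) converts constraint rules into query statements of the basic database. A minimal sketch of that conversion follows, assuming a Neo4j store queried with Cypher. The node labels (AttackPattern, AttackStep, Vulnerability, Asset), relationship types (HAS_STEP, EXPLOITS, TARGETS, PRECEDES) and property names are hypothetical and are not taken from the patent; the function only builds the query text and its parameters and does not contact a database.

```python
def build_candidate_query(vuln_id=None, target_os=None, prior_attack=None):
    """Build a parameterized Cypher query that selects candidate attack patterns.

    All labels, relationship types and property names here are hypothetical.
    Returns a (query_text, parameters) pair.
    """
    matches = ["MATCH (p:AttackPattern)-[:HAS_STEP]->(s:AttackStep)"]
    conditions = []
    params = {}

    # Constraint: the vulnerability that the attack step must exploit.
    if vuln_id is not None:
        matches.append("MATCH (s)-[:EXPLOITS]->(v:Vulnerability)")
        conditions.append("v.id = $vuln_id")
        params["vuln_id"] = vuln_id

    # Constraint: an asset attribute of the attack target (here, its OS).
    if target_os is not None:
        matches.append("MATCH (s)-[:TARGETS]->(a:Asset)")
        conditions.append("a.os = $target_os")
        params["target_os"] = target_os

    # Constraint: the step is only possible after an earlier attack succeeded.
    if prior_attack is not None:
        matches.append("MATCH (pre:AttackStep)-[:PRECEDES]->(s)")
        conditions.append("pre.name = $prior_attack")
        params["prior_attack"] = prior_attack

    query = "\n".join(matches)
    if conditions:
        query += "\nWHERE " + " AND ".join(conditions)
    query += "\nRETURN DISTINCT p"
    return query, params


query, params = build_candidate_query(vuln_id="VULN-1", target_os="Linux")
print(query)
print(params)
```

Passing the values as parameters rather than formatting them into the query text keeps the generated statement reusable across events.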
Further, step 3) includes the following sub-steps:
Step (1): vectorizing the attack event sequence and the attack pattern subgraph filtered by the constraints;
Step (2): calculating a distance matrix between the vectorized attack event sequence and each attack pattern sequence in the attack pattern subgraph;
Step (3): finding a path from the upper-left corner to the lower-right corner of the matrix; the path whose elements have the minimum sum identifies the attack pattern subgraph that matches the attack event sequence.
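Steps (1) to (3) correspond to classic dynamic time warping. A minimal sketch follows, assuming each event and each pattern step is already a numeric vector, that Euclidean distance is the local cost, and that the pattern with the smallest accumulated cost is the match. The function names and the toy vectors are illustrative and not taken from the patent.

```python
import math


def euclidean(a, b):
    """Local cost between two embedded events (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Minimum accumulated cost of a warping path between two sequences.

    cost[i][j] holds the cheapest path from the upper-left corner of the
    distance matrix to cell (i, j); the answer is the lower-right corner.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # A path may arrive from above, from the left, or diagonally.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_matching_pattern(event_seq, patterns):
    """Return the name of the pattern sequence with the smallest DTW cost."""
    return min(patterns, key=lambda name: dtw_distance(event_seq, patterns[name]))


# Toy data: the observed sequence repeats one step (a duplicated alarm).
events = [[0, 0], [1, 0], [1, 0], [2, 0]]
patterns = {
    "pattern_a": [[0, 0], [1, 0], [2, 0]],
    "pattern_b": [[5, 5], [6, 5], [7, 5]],
}
print(best_matching_pattern(events, patterns))
```

Because a warping path may repeat a row or a column, a duplicated or extra alarm in the observed sequence does not by itself prevent a match, which is the tolerance the method relies on.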
A network attack prediction device based on a knowledge-graph comprises a program storage unit and a program running unit, and when a program in the program storage unit is loaded by the program running unit, the network attack prediction device based on the knowledge-graph executes the network attack prediction method based on the knowledge-graph.
The beneficial effects of the invention include:
(1) The invention collects network asset detection data, vulnerability data, open-source threat intelligence data, security device log data and the like, and normalizes, deduplicates, cleans, classifies and spatio-temporally registers the multi-source heterogeneous data to form data in a standardized format. It constructs a network security ontology based on knowledge of the network security field. Combined with the network security knowledge expression model, it extracts knowledge from the network asset detection data, vulnerability data and the like, constructs network security knowledge graphs such as an attack pattern graph, a threat intelligence graph and a network asset graph, and predicts network attacks based on the constructed knowledge graphs.
(2) The invention combines the network security domain ontology to extract knowledge from the network asset data, vulnerability data, threat intelligence data and security device log data. The constructed network security ontology refers to the network security knowledge graph description languages used at home and abroad, which improves the extensibility and compatibility of the security knowledge graph.
(3) The embodiment of the invention uses the constructed network security knowledge graph and embeds the heterogeneous network of the knowledge graph into a continuous low-dimensional vector space based on the TransE translation model. When calculating subgraph similarity, it introduces constraints to improve the efficiency of the calculation. In addition, the similarity calculation uses the Dynamic Time Warping (DTW) algorithm, which can improve matching accuracy when the alarm event sequence contains false alarms and missed alarms, thereby improving the accuracy of attack path and attack target prediction.
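The TransE model mentioned in (3) treats a relation as a translation in the embedding space: for a true triple, the head vector plus the relation vector should land near the tail vector. The sketch below scores a triple under that idea. It assumes the Euclidean (L2) norm and hand-written toy vectors; the patent does not state the norm, the embedding dimension, or how the embeddings are trained, and the name transe_score is illustrative.

```python
import math


def transe_score(head, relation, tail):
    """Distance between (head + relation) and tail; lower means more plausible.

    Assumption: L2 norm. The vectors are plain lists of equal length.
    """
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Toy 2-dimensional embeddings (illustrative values only).
ip_node = [1.0, 0.0]
has_rel = [0.0, 1.0]
port_node = [1.0, 1.0]   # equals ip_node + has_rel, so the triple fits exactly
other_node = [4.0, 5.0]

print(transe_score(ip_node, has_rel, port_node))
print(transe_score(ip_node, has_rel, other_node))
```

In a trained model the vectors would be learned so that observed triples score lower than corrupted ones; here they are fixed by hand only to show the scoring rule.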
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a method for predicting cyber-attack based on a knowledge-graph according to an embodiment of the present invention;
FIG. 2 is a flowchart of attack prediction with constraints according to an embodiment of the present invention;
FIG. 3 is an example of attack event prediction.
Detailed Description
All features disclosed in all embodiments of the present specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
To solve the technical problems described in the background, the invention provides a network attack prediction method and device based on a knowledge graph. The technical concept is as follows: network asset detection data, vulnerability data, open-source threat intelligence data, security device log data and the like are collected, and the multi-source heterogeneous data is normalized, deduplicated, cleaned, classified and spatio-temporally registered; a network security ontology oriented to network attack behaviors is constructed based on network security domain knowledge. Based on the network security ontology expression model, network security knowledge graphs such as an attack pattern graph, a threat intelligence graph and a network asset graph are constructed, and implicit feature mining of targets and implicit relation reasoning are performed on the constructed knowledge graphs to predict network attack behaviors. By constructing these knowledge graphs and using similar-target discovery and implicit relation reasoning to mine the logical association between attack behaviors and the attacker's intent, the method improves the accuracy of attack behavior prediction.
In order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described in detail below with reference to the accompanying drawings and technical solutions.
Fig. 1 is a flowchart of a method for predicting cyber-attack based on a knowledge-graph according to an embodiment of the present invention, including the following steps:
Step S101: a network port scanning tool is used to probe the target network assets, obtaining network port scan data, certificate data, DNS data, web framework data and the like; vulnerability information is acquired through web crawler technology from information sources such as vulnerability databases, vulnerability forums, personal blogs, Twitter and GitHub; security event information, IOC information and the like are acquired through web crawler technology from security company bulletins, hacker forums, security websites, third-party threat intelligence and the like; and log data of security devices, such as firewall logs, intrusion detection system logs and sandbox logs, is obtained through cooperation.
Step S102: preprocessing the network asset detection data, vulnerability data, threat intelligence data and security device log data. The processing flow is as follows:
1) Data normalization: homogenizing the multi-source heterogeneous data, unifying the field structure, and then converting the data format, including data type conversion, date and time format conversion, Chinese character encoding conversion, and code-to-name conversion;
2) Data deduplication and merging: comparing the cleaned data with the data in the database according to the key fields to judge whether redundant data exists. If there is no redundant data, the data is stored directly in the database. If redundant data exists, judging whether the new data is identical to the existing data in every field; if so, the new data is discarded. If not, judging whether the fields with different values conflict. If there is no conflict, the new data is merged with the existing data; if the fields of the new data and the existing data conflict, the data is merged after the conflict is resolved.
3) Data classification: classifying the multi-source data with a decision tree, dividing the data into multiple levels layer by layer.
4) Spatio-temporal data registration: associating and matching the basic data with time and space coordinates (organization and geographic location information), and attaching a spatio-temporal coordinate label to each piece of basic data.
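The deduplication and merging flow in item 2) can be sketched as follows. The sketch assumes records are dictionaries, the store is an in-memory dictionary keyed by the key fields, an empty field is represented by None, and a conflict is a field where both records hold different non-empty values. The function name merge_record and the default rule of keeping the existing value on conflict are illustrative assumptions; the patent does not say how conflicts are resolved.

```python
def merge_record(store, record, key_fields, resolve_conflict=None):
    """Insert, discard or merge one cleaned record; return the action taken.

    store: dict mapping a key tuple to the stored record (stands in for the
    database). resolve_conflict(field, old, new) is an optional, hypothetical
    hook that returns the value to keep when two non-empty values differ.
    """
    key = tuple(record[f] for f in key_fields)
    existing = store.get(key)

    # No redundant data: store the record directly.
    if existing is None:
        store[key] = dict(record)
        return "stored"

    # Identical in every field: discard the new record.
    if existing == record:
        return "discarded"

    merged = dict(existing)
    for field, new_value in record.items():
        old_value = merged.get(field)
        if old_value is None:
            # The existing record lacks this value, so there is no conflict.
            merged[field] = new_value
        elif new_value is not None and new_value != old_value:
            # Conflict: resolve it first, then merge (default keeps the old value).
            if resolve_conflict is not None:
                merged[field] = resolve_conflict(field, old_value, new_value)
    store[key] = merged
    return "merged"


db = {}
keys = ("ip", "port")
print(merge_record(db, {"ip": "10.0.0.1", "port": 80, "service": None}, keys))
print(merge_record(db, {"ip": "10.0.0.1", "port": 80, "service": None}, keys))
print(merge_record(db, {"ip": "10.0.0.1", "port": 80, "service": "http"}, keys))
print(db)
```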
Step S103: for knowledge representation, the embodiment of the invention uses triples (entities, relationships, attributes). The ontology of the network asset data is modeled with reference to CybOX 2.0: the entities of the network asset data include IP, port, protocol, device, operating system, certificate, domain name, AS number and the like, and the relationships include the has relationship, the belong_to relationship, the ower relationship and the like. The naming rules for entities and relationships follow those of CybOX 2.0, which facilitates compatibility with the network security knowledge graph description languages used at home and abroad. The triple definitions for vulnerability data, threat intelligence data and security device log data refer to the STIX standard. For vulnerability data, the operating system, hardware device, software, protocol, vulnerability, exploit code and the like are defined as entity labels, with relationships such as has and cause. For threat intelligence data, entities such as device, software, protocol, vulnerability, attack event and attack tool are defined, with relationships such as has, cause and belong_to. For security device log data, entities such as IP, port and event are defined, with relationships such as has, cause and belong_to.
Step S104: extracting triples from each type of data according to the defined knowledge expression model. The specific steps are as follows:
1) Network asset data is structured data, so its knowledge is extracted directly with a D2R tool according to the knowledge expression model defined in step S103.
2) Vulnerability information obtained by web crawlers from web pages is semi-structured data, so entities, relationships and attributes are extracted with a rule-based entity recognition algorithm; vulnerability information obtained from vulnerability databases is structured data, so it can be extracted directly according to the knowledge expression model defined in step S103.
3) For threat intelligence data, knowledge is extracted from the web page data acquired by web crawlers with rule-based entity recognition, while structured threat intelligence data from third parties can be extracted directly according to the knowledge expression model defined in step S103.
4) For security device log data, an attack pattern library is established using expert knowledge, with reference to the attack mechanism pattern classification, malicious code pattern classification and hidden danger pattern systems of the mainstream foreign frameworks CAPEC and ATT&CK, and the security device log data is classified based on the attack pattern library. Because the attack pattern library is structured data, it can be extracted directly according to the knowledge expression model defined in step S103.
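For the structured sources above, direct extraction amounts to mapping record fields onto triples of the knowledge expression model. A minimal sketch for one network asset record follows. It is not the D2R tool named in item 1); the record layout, the "Type:value" node naming and the direction of each relationship are assumptions made for illustration, while the relationship names has and belong_to come from step S103.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head, relation, tail) statement of the knowledge graph."""
    head: str
    relation: str
    tail: str


def asset_record_to_triples(record):
    """Map one structured asset record (a dict) onto triples.

    Assumed record fields: 'ip' (required), 'ports', 'os', 'domain'.
    """
    ip_node = f"IP:{record['ip']}"
    triples = []
    for port in record.get("ports", []):
        triples.append(Triple(ip_node, "has", f"Port:{port}"))
    if "os" in record:
        triples.append(Triple(ip_node, "has", f"OS:{record['os']}"))
    if "domain" in record:
        # Assumed direction: the domain name belongs to the IP it resolves to.
        triples.append(Triple(f"Domain:{record['domain']}", "belong_to", ip_node))
    return triples


for triple in asset_record_to_triples(
    {"ip": "10.0.0.1", "ports": [22, 80], "os": "Linux"}
):
    print(triple)
```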
Step S105: fusing and correcting the extracted data, including aggregating and merging security events, correcting event credibility, correcting mutually exclusive events, removing false alarm events, completing missed-alarm events and the like, and constructing network security knowledge graphs such as an attack pattern graph, a threat intelligence graph and a network asset graph. The specific steps are as follows:
1) Merging the data of the same device;
2) Merging event data that has the same source IP, source port, destination IP and destination port;
3) Tagging events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and removing false alarm events. For example, if threat intelligence shows that an attack event exploits a vulnerability of the Windows operating system, while the asset detection data shows that the attacked target runs a Linux operating system, it can be inferred that the attack cannot occur on that asset, so the attack event can be marked as a false alarm;
4) Establishing an attack pattern library for known attacks by using expert knowledge, and completing missed-alarm events according to the attack pattern library: according to the description of the attack chain in the attack pattern library, if an attack step is found to be missing from an associated multi-step attack, judging whether the missing step is a necessary prerequisite of the next step in the multi-step attack; if so, inferring that the security device failed to report the attack event and completing the multi-step attack event; if not, proceeding directly to step 5);
5) Constructing graphs from the corrected data to form the attack pattern graph, the threat intelligence graph and the network asset graph, using Neo4j as the basic database to access and store the graphs.
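The two correction rules in items 3) and 4) can be sketched as follows. The data layout is assumed for illustration: threat intelligence is reduced to a mapping from a vulnerability identifier to the platforms it affects, asset detection to a mapping from a target address to its operating system, and the attack pattern library to an ordered chain of step names plus a mapping from each step to its necessary prerequisites. The identifiers, step names and function names are hypothetical.

```python
def is_false_alarm(event, vuln_platforms, asset_os):
    """Item 3): an event is a false alarm if the exploited vulnerability
    cannot apply to the operating system of the attacked asset."""
    affected = vuln_platforms.get(event["vuln"])
    actual = asset_os.get(event["target"])
    if affected is None or actual is None:
        return False  # not enough knowledge to reject the event
    return actual not in affected


def complete_missed_steps(observed, chain, prerequisites):
    """Item 4): walk the attack chain and infer steps the devices missed.

    A missing step is inferred only when the next step in the chain was
    observed and lists the missing step as a necessary prerequisite.
    """
    seen = set(observed)
    completed = []
    for index, step in enumerate(chain):
        if step in seen:
            completed.append((step, "observed"))
            continue
        next_step = chain[index + 1] if index + 1 < len(chain) else None
        if next_step in seen and step in prerequisites.get(next_step, set()):
            completed.append((step, "inferred"))
    return completed


event = {"vuln": "VULN-1", "target": "10.0.0.5"}
print(is_false_alarm(event, {"VULN-1": {"Windows"}}, {"10.0.0.5": "Linux"}))

chain = ["scan", "exploit", "install", "command_control"]
prereq = {"install": {"exploit"}, "command_control": {"install"}}
print(complete_missed_steps(["scan", "install"], chain, prereq))
```

In the usage lines, the first call mirrors the Windows and Linux example from item 3), and the second shows a missing exploit step being filled in because the observed install step requires it.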
Step S106: using the constructed network security knowledge graph, network anomalies and attacks can be discovered effectively, the hidden relations and paths of security threats can be mined, and attacks can be predicted. As shown in FIG. 2, the specific steps are as follows:
1) The structured knowledge in the network security knowledge graph can be represented as an undirected graph $G=(V,E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relation edges between entities. Each triple in the security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes and $r$ represents the relation between the two entity nodes. The embodiment of the invention uses the TransE translation model to embed the heterogeneous network of the knowledge graph into a continuous low-dimensional vector space, forming low-dimensional vectors.
2) On the basis of vectorization, a similarity calculation algorithm is used to measure the similarity between the attack event sequence detected by the security devices and the attack pattern graph in the constructed knowledge graph, so as to mine the hidden relations and paths of the attack event sequence and predict the attack path and target. Constraint relations exist among the events of an attack sequence, and also between attack events and the environment of the attack target. If the distance between nodes were measured directly from node attributes, relations and subgraph structure, many useless attack patterns would be returned, and the computational complexity of the similarity would grow with the scale of the graph. For example, a typical multi-step attack flow includes target reconnaissance, tool making, tool delivery, attack penetration, installation and implantation, command and control, and malicious activities; when the security devices have detected only the first several attack stages, similarity calculation may associate many attack patterns from the attack pattern graph, making it difficult to determine the true intention of the attacker. Therefore, before the similarity is calculated, corresponding constraint rules are added, such as the vulnerabilities that must be exploited for the attack event sequence to occur, the asset attributes of the attack target, and the precondition that a given attack event can occur only after a certain other attack has been executed successfully. These constraints are converted into Cypher, the query language of Neo4j, to obtain candidate subgraphs, and the similarity calculation is then performed on the candidate subgraphs.
3) In an attack event sequence, not every attack step is necessarily executed, multiple attack means may be used in each attack stage, and the intrusion detection system may produce false alarms or missed reports. To address this, the similarity of the attack event sequence is calculated using dynamic time warping (DTW), in which a warping function W(n) describes the temporal correspondence between the test template and the reference template, and the warping function corresponding to the minimum cumulative distance when the two templates are matched is solved for. Through the inventors' creative work, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraints is calculated based on the DTW algorithm. The specific implementation steps are as follows:
Step (1): vectorizing the attack event sequence and the attack pattern subgraph filtered by the constraints;
Step (2): calculating a distance matrix between the vectorized attack event sequence and each attack pattern sequence in the attack pattern subgraph;
Step (3): finding a path from the upper-left corner to the lower-right corner of the matrix such that the sum of the elements on the path is minimal; the path with the minimal sum identifies the attack pattern subgraph matched with the attack event sequence.
4) Correcting the obtained attack pattern subgraph with the help of domain experts;
5) Predicting the attack path and the attack target according to the obtained attack pattern subgraph.
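The conversion of constraint rules into a Cypher query described in step 2) might look like the following Python sketch. The node labels (`AttackPattern`, `AttackStep`, `Vulnerability`), relationship types (`HAS_STEP`, `EXPLOITS`) and property names are hypothetical, since the patent does not disclose its graph schema; treating "already executed" attacks as step names that a candidate pattern must contain is likewise an assumption.

```python
def build_candidate_query(vuln_ids, asset_os, completed_steps):
    """Return (cypher, params) selecting attack patterns that satisfy the constraints.

    All labels, relationship types and property names are hypothetical.
    Values are passed as query parameters rather than spliced into the text.
    """
    cypher = (
        "MATCH (p:AttackPattern)-[:HAS_STEP]->(s:AttackStep) "
        "OPTIONAL MATCH (s)-[:EXPLOITS]->(v:Vulnerability) "
        "WITH p, collect(DISTINCT s.name) AS steps, "
        "collect(DISTINCT v.id) AS vulns "
        # Constraint: asset attribute of the attack target.
        "WHERE p.target_os = $asset_os "
        # Constraint: the pattern exploits a vulnerability present on the asset.
        "AND any(x IN vulns WHERE x IN $vuln_ids) "
        # Constraint: attacks already executed must belong to the pattern.
        "AND all(y IN $completed_steps WHERE y IN steps) "
        "RETURN p.name AS pattern, steps"
    )
    params = {
        "asset_os": asset_os,
        "vuln_ids": list(vuln_ids),
        "completed_steps": list(completed_steps),
    }
    return cypher, params


if __name__ == "__main__":
    query, params = build_candidate_query(["VULN-1"], "linux", ["recon", "delivery"])
    print(query)
    print(params)
```

The returned pair could then be handed to a Neo4j client, for example as `session.run(query, params)` with the official Python driver, and only the returned candidate patterns would enter the similarity calculation.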
FIG. 3 gives an example of predicting the attack path and the attack target from the obtained attack pattern subgraph. An attack alarm sequence directed at a certain asset is received from the security devices, attack pattern matching is performed using the knowledge graph, and a similar attack pattern subgraph is found. If the asset is found to have the vulnerabilities and asset attributes associated with the attack pattern, and the attack is spreading, the corresponding attack path is recorded in combination with the time sequence. An early warning can thus be issued so that the asset starts its defenses in advance to prevent further attack, and other assets directly connected to the asset can also be warned in advance if they have the same vulnerabilities and asset attributes, so as to prevent the attack from spreading.
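The early-warning idea in this example can be sketched as follows: given the attacked asset and the vulnerabilities associated with the matched attack pattern, directly connected assets that share any of those vulnerabilities are selected for warning. The dictionaries standing in for the network asset graph and the asset and vulnerability names are hypothetical.

```python
def assets_to_warn(topology, asset_vulns, attacked, pattern_vulns):
    """Return (asset, shared vulnerabilities) for each direct neighbor of the
    attacked asset that has a vulnerability the matched pattern exploits."""
    warned = []
    for neighbor in sorted(topology.get(attacked, ())):
        shared = asset_vulns.get(neighbor, set()) & set(pattern_vulns)
        if shared:
            warned.append((neighbor, sorted(shared)))
    return warned


if __name__ == "__main__":
    topology = {"web01": {"db01", "app01"}}             # direct connections
    asset_vulns = {"db01": {"V1"}, "app01": {"V2"}}     # per-asset vulnerabilities
    print(assets_to_warn(topology, asset_vulns, "web01", {"V1", "V3"}))
```

Only direct neighbors are considered here, matching the "directly connected" wording of the example; a multi-hop variant would be a different design.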
Example 1
A network attack prediction method based on a knowledge graph includes the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data of various types to construct a network security knowledge graph;
S106, predicting attack events using the constructed network security knowledge graph.
Example 2
On the basis of embodiment 1, in step S101, the acquired data includes network asset detection data, vulnerability information data, threat intelligence data, and security equipment log data.
Example 3
On the basis of embodiment 1, in step S102, the preprocessing includes a data normalization process, a data deduplication and merging process, a data classification process, and a data spatiotemporal registration process.
Example 4
On the basis of embodiment 1, step S103 includes the sub-step of: defining a knowledge expression model, in which knowledge is expressed using triples.
Example 5
On the basis of embodiment 4, in step S104, triples of the various types of data are extracted according to the defined knowledge expression model.
Example 6
On the basis of embodiment 1, in step S105, the correction includes aggregation and merging of security events, event credibility correction, mutually exclusive event correction, false-alarm event removal, and missed-report event completion; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph and a network asset graph; step S105 specifically comprises the following sub-steps:
1) Merging the data of the same equipment;
2) Merging the consistent event data;
3) Labeling events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and eliminating false-alarm events;
4) Establishing an attack pattern library for known attacks using expert knowledge, and completing missed-report events according to the attack pattern library: following the description of the attack chain in the attack pattern library, if one attack step is found to be missing in an associated multi-step attack, it is judged whether the missing step is a necessary prerequisite of the next step of the multi-step attack; if so, it is inferred that the security device failed to report that attack event, and the multi-step attack event is completed; if not, the process proceeds directly to step 5);
5) Constructing graphs from the corrected data to form an attack pattern graph, a threat intelligence graph and a network asset graph; and accessing and storing the graphs using a base database.
Example 7
On the basis of embodiment 6, the base database is a Neo4j database.
Example 8
On the basis of embodiment 1, in step S106, the following sub-steps are included:
1) Representing structured knowledge in a knowledge graph for network security as an undirected graph $G=(V,E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relation edges between entities; each triple in the network security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes, and $r$ represents the relation between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of vectorization, adding constraint rule conditions, converting the constraint conditions into query statements of the base database to obtain candidate subgraphs, and performing similarity calculation on the candidate subgraphs: a similarity calculation algorithm is used to measure the similarity between the attack event sequence detected by the security devices and the attack pattern graph in the constructed knowledge graph, the hidden relations and paths of the attack event sequence are mined, and the attack path and target are predicted; the constraint rule conditions comprise the vulnerabilities that must be exploited for the attack event sequence to occur, the asset attributes of the attack target, and the precondition that a given attack event can occur only after a certain other attack has been executed successfully;
3) Calculating, based on the DTW algorithm, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraint conditions;
4) Correcting the obtained attack pattern subgraph with the help of domain experts;
5) Predicting the attack path and the attack target according to the obtained attack pattern subgraph.
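For the embedding in sub-step 1), the description of step S106 names a TransE-based translation model. The sketch below shows the standard TransE idea, in which a plausible triple satisfies h + r ≈ t, together with the usual margin-based ranking loss. It is a generic illustration rather than the patent's training procedure; the function names, the choice of the L2 norm and the margin value are assumptions.

```python
import math


def transe_distance(h, r, t):
    """L2 distance ||h + r - t||; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss for one (positive, corrupted) pair of triples.

    The loss is zero once the corrupted triple is at least `margin`
    farther from satisfying h + r = t than the positive triple.
    """
    return max(0.0, margin + transe_distance(*pos) - transe_distance(*neg))


if __name__ == "__main__":
    head, rel = [1.0, 0.0], [0.0, 1.0]
    true_tail, wrong_tail = [1.0, 1.0], [4.0, 5.0]
    print(transe_distance(head, rel, true_tail))
    print(margin_loss((head, rel, true_tail), (head, rel, wrong_tail)))
```

Training would minimize this loss over many positive and corrupted triples; the resulting entity and relation vectors are the low-dimensional vectors used by the later similarity calculation.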
Example 9
On the basis of embodiment 8, step 3) comprises the following sub-steps:
Step (1): vectorizing the attack event sequence and the attack pattern subgraph filtered by the constraint conditions;
Step (2): calculating a distance matrix between the vectorized attack event sequence and each attack pattern sequence in the attack pattern subgraph;
Step (3): finding a path from the upper-left corner to the lower-right corner of the matrix such that the sum of the elements on the path is minimal; the path with the minimal sum identifies the attack pattern subgraph matched with the attack event sequence.
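Steps (2) and (3) correspond to the classic dynamic time warping recurrence, sketched below in plain Python. The Euclidean distance between event vectors, the one-dimensional toy vectors and the pattern names are assumptions made for illustration; in the method they would be the embeddings of attack events and of the candidate attack pattern sequences.

```python
import math


def euclidean(a, b):
    """Distance between two event vectors (an assumed choice of metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Minimal cumulative cost of a warping path from the upper-left to the
    lower-right corner of the pairwise distance matrix of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def best_matching_pattern(event_seq, patterns):
    """Return the name of the pattern sequence with the smallest DTW cost."""
    return min(patterns, key=lambda name: dtw_distance(event_seq, patterns[name]))


if __name__ == "__main__":
    events = [[0.0], [1.0], [2.0]]
    patterns = {
        "pattern_a": [[0.0], [1.0], [1.0], [2.0]],  # same shape, one repeated step
        "pattern_b": [[5.0], [6.0]],
    }
    print(best_matching_pattern(events, patterns))
```

Because the warping path may repeat or skip over elements, sequences of different lengths can still match closely, which is what makes DTW tolerant of omitted steps and of missed or repeated alarms.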
Example 10
A device for predicting a network attack based on a knowledge graph, comprising a program storage unit and a program execution unit, wherein the method for predicting a network attack based on a knowledge graph according to any one of embodiments 1 to 9 is performed when a program in the program storage unit is loaded by the program execution unit.
The units described in the embodiments of the present invention may be implemented by software or hardware, and the described units may also be disposed in a processor. The names of the units do not in any way limit the units themselves.
According to an aspect of the application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs, which when executed by one of the electronic devices, cause the electronic device to implement the method described in the above embodiments.
The parts not involved in the present invention are the same as or can be implemented using the prior art.
The above-described embodiment is only one embodiment of the present invention, and it will be apparent to those skilled in the art that various modifications and variations can be easily made based on the application and principle of the present invention disclosed in the present application, and the present invention is not limited to the method described in the above-described embodiment of the present invention, so that the above-described embodiment is only preferred, and not restrictive.
Other embodiments than the above examples may be devised by those skilled in the art based on the foregoing disclosure, or by adapting and using knowledge or techniques of the relevant art, and features of various embodiments may be interchanged or substituted and such modifications and variations that may be made by those skilled in the art without departing from the spirit and scope of the present invention are intended to be within the scope of the following claims.

Claims (8)

1. A network attack prediction method based on a knowledge graph is characterized by comprising the following steps:
S101, acquiring data;
S102, preprocessing the acquired data;
S103, constructing a network security ontology oriented to network attacks;
S104, extracting data according to the defined knowledge expression model;
S105, fusing and correcting the extracted data of various types to construct a network security knowledge graph; in step S105, the correction includes aggregation and merging of security events, event credibility correction, mutually exclusive event correction, false-alarm event removal, and missed-report event completion; the network security knowledge graph comprises an attack pattern graph, a threat intelligence graph and a network asset graph; step S105 specifically comprises the following sub-steps:
1) Merging the data of the same equipment;
2) Merging the consistent event data;
3) Labeling events based on threat intelligence data, analyzing their credibility, correcting mutually exclusive events, and eliminating false-alarm events;
4) Establishing an attack pattern library for known attacks using expert knowledge, and completing missed-report events according to the attack pattern library: following the description of the attack chain in the attack pattern library, if one attack step is found to be missing in an associated multi-step attack, it is judged whether the missing step is a necessary prerequisite of the next step of the multi-step attack; if so, it is inferred that the security device failed to report that attack event, and the multi-step attack is completed; if not, the process proceeds directly to step 5);
5) Constructing graphs from the corrected data to form an attack pattern graph, a threat intelligence graph and a network asset graph; and accessing and storing the graphs using a base database;
S106, predicting attack events using the constructed network security knowledge graph, wherein step S106 comprises the following sub-steps:
1) Representing structured knowledge in a knowledge graph for network security as an undirected graph $G=(V,E)$, where $V$ represents the set of entity nodes in the graph and $E$ represents the set of relation edges between entities; each triple in the network security knowledge graph is represented as $(h, r, t)$, where $h$ and $t$ respectively represent the linked head and tail entity nodes, and $r$ represents the relation between the two entity nodes; embedding the heterogeneous network of the network security knowledge graph into a low-dimensional vector space to form low-dimensional vectors;
2) On the basis of vectorization, adding constraint rule conditions, converting the constraint conditions into query statements of the base database to obtain candidate subgraphs, and performing similarity calculation on the candidate subgraphs: a similarity calculation algorithm is used to measure the similarity between the attack event sequence detected by the security devices and the attack pattern graph in the constructed knowledge graph, the hidden relations and paths of the attack event sequence are mined, and the attack path and target are predicted; the constraint rule conditions comprise the vulnerabilities that must be exploited for the attack event sequence to occur, the asset attributes of the attack target, and the precondition that a given attack event can occur only after a certain other attack has been executed successfully;
3) Calculating, based on the DTW algorithm, the shortest path between the vectorized attack event sequence and the attack pattern subgraph filtered by the constraint conditions;
4) Correcting the obtained attack pattern subgraph with the help of domain experts;
5) Predicting the attack path and the attack target according to the obtained attack pattern subgraph.
2. The knowledge-graph-based network attack prediction method according to claim 1, wherein in step S101, the acquired data includes network asset detection data, vulnerability information data, threat intelligence data, and security equipment log data.
3. The knowledge-graph-based network attack prediction method according to claim 1, wherein in step S102, the preprocessing comprises a data normalization process, a data deduplication and merging process, a data classification process and a data spatiotemporal registration process.
4. The knowledge-graph-based network attack prediction method according to claim 1, wherein step S103 comprises the sub-step of: defining a knowledge expression model, in which knowledge is expressed using triples.
5. The knowledge-graph-based network attack prediction method according to claim 4, wherein in step S104, triples of the various types of data are extracted according to the defined knowledge expression model.
6. The knowledge-graph-based network attack prediction method according to claim 1, wherein the base database is a Neo4j database.
7. The knowledge-graph-based network attack prediction method according to claim 1, wherein step 3) comprises the following sub-steps:
Step (1): vectorizing the attack event sequence and the attack pattern subgraph filtered by the constraint conditions;
Step (2): calculating a distance matrix between the vectorized attack event sequence and each attack pattern sequence in the attack pattern subgraph;
Step (3): finding a path from the upper-left corner to the lower-right corner of the matrix such that the sum of the elements on the path is minimal; the path with the minimal sum identifies the attack pattern subgraph matched with the attack event sequence.
8. A network attack prediction device based on the knowledge graph is characterized by comprising a program storage unit and a program running unit, wherein the network attack prediction method based on the knowledge graph according to any one of claims 1 to 7 is executed when a program in the program storage unit is loaded by the program running unit.
CN202211156094.2A 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph Active CN115296924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211156094.2A CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211156094.2A CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Publications (2)

Publication Number Publication Date
CN115296924A CN115296924A (en) 2022-11-04
CN115296924B true CN115296924B (en) 2023-01-31

Family

ID=83834393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211156094.2A Active CN115296924B (en) 2022-09-22 2022-09-22 Network attack prediction method and device based on knowledge graph

Country Status (1)

Country Link
CN (1) CN115296924B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240039944A1 (en) * 2022-07-30 2024-02-01 James Whitmore Automated Modeling and Analysis of Security Attacks and Attack Surfaces for an Information System or Computing Device
CN115766258B (en) * 2022-11-23 2024-02-09 西安电子科技大学 Multi-stage attack trend prediction method, equipment and storage medium based on causal relationship graph
CN115664860B (en) * 2022-12-26 2023-03-31 广东财经大学 Network security threat assessment method and system
CN115860117B (en) * 2023-02-22 2023-05-09 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) MDTA knowledge extraction method and system based on attack and defense behaviors
CN116318929B (en) * 2023-03-07 2023-08-29 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Attack strategy extraction method based on safety alarm data
CN116319074B (en) * 2023-05-12 2023-08-15 北京安博通科技股份有限公司 Method and device for detecting collapse equipment based on multi-source log and electronic equipment
CN116796310A (en) * 2023-06-14 2023-09-22 福州超人帮网络科技有限公司 Data attack processing method and system applied to intelligent cloud
CN116777634A (en) * 2023-06-25 2023-09-19 深圳征信服务有限公司 Financial data analysis system and method based on artificial intelligence
CN117040926B (en) * 2023-10-08 2024-01-26 北京网藤科技有限公司 Industrial control network security feature analysis method and system applying knowledge graph
CN117240632B (en) * 2023-11-16 2024-02-06 中国电子科技集团公司第十五研究所 Attack detection method and system based on knowledge graph
CN117478435B (en) * 2023-12-28 2024-04-09 中汽智联技术有限公司 Whole vehicle information security attack path generation method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109922075A (en) * 2019-03-22 2019-06-21 中国南方电网有限责任公司 Network security knowledge map construction method and apparatus, computer equipment
CN111163086A (en) * 2019-12-27 2020-05-15 北京工业大学 Multi-source heterogeneous network security knowledge graph construction and application method
CN111177417A (en) * 2020-04-13 2020-05-19 中国人民解放军国防科技大学 Security event correlation method, system and medium based on network security knowledge graph
CN111741023A (en) * 2020-08-03 2020-10-02 中国人民解放军国防科技大学 Attack studying and judging method, system and medium for network attack and defense test platform
CN112422537A (en) * 2020-11-06 2021-02-26 广州锦行网络科技有限公司 Behavior prediction method of network attack knowledge graph generated based on honeypot actual combat
CN113872943A (en) * 2021-09-06 2021-12-31 深圳供电局有限公司 Network attack path prediction method and device
CN114579765A (en) * 2022-03-07 2022-06-03 四川大学 Network shooting range weapon base construction method based on open source information analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113691550B (en) * 2021-08-27 2023-02-24 西北工业大学 Behavior prediction system of network attack knowledge graph


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
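Research on Ontology-Based Cyber Threat Intelligence Analysis Technology (基于本体的网络威胁情报分析技术研究); Gao Jian et al.; Computer Engineering and Applications (《计算机工程与应用》), No. 11; full text *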

Also Published As

Publication number Publication date
CN115296924A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
CN115296924B (en) Network attack prediction method and device based on knowledge graph
US20220124108A1 (en) System and method for monitoring security attack chains
Manoharan et al. Revolutionizing Cybersecurity: Unleashing the Power of Artificial Intelligence and Machine Learning for Next-Generation Threat Detection
CN108933793B (en) Attack graph generation method and device based on knowledge graph
Dilek et al. Applications of artificial intelligence techniques to combating cyber crimes: A review
CN112738015B (en) Multi-step attack detection method based on interpretable convolutional neural network CNN and graph detection
Einy et al. The anomaly‐and signature‐based IDS for network security using hybrid inference systems
Paudel et al. Detecting dos attack in smart home iot devices using a graph-based approach
CN104539626A (en) Network attack scene generating method based on multi-source alarm logs
Mao et al. MIF: A multi-step attack scenario reconstruction and attack chains extraction method based on multi-information fusion
KR20150091775A (en) Method and System of Network Traffic Analysis for Anomalous Behavior Detection
Aljumah Detection of distributed denial of service attacks using artificial neural networks
Sen et al. On using contextual correlation to detect multi-stage cyber attacks in smart grids
Pirozmand et al. Intrusion detection into cloud-fog-based iot networks using game theory
Mohamed et al. Alert correlation using a novel clustering approach
Wang et al. An end-to-end method for advanced persistent threats reconstruction in large-scale networks based on alert and log correlation
Prayote Knowledge based anomaly detection
Neshenko Illuminating Cyber Threats for Smart Cities: A Data-Driven Approach for Cyber Attack Detection with Visual Capabilities
KR102592624B1 (en) Threat hunting system and method for against social issue-based advanced persistent threat using artificial intelligence
Almulla Cyber-attack detection in network traffic using machine learning
Umamaheswaran et al. Smart intrusion detection system with balanced data in IoMT infra
Li et al. A threat recognition solution of edge data security in industrial internet
CN115051833B (en) Intercommunication network anomaly detection method based on terminal process
Saad et al. Context-aware intrusion alerts verification approach
Njogu et al. Network specific vulnerability based alert reduction approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant