CN114900364A - High-level continuous threat detection method based on tracing graph and heterogeneous graph neural network - Google Patents

High-level continuous threat detection method based on tracing graph and heterogeneous graph neural network

Info

Publication number
CN114900364A
Authority
CN
China
Prior art keywords
graph
tracing
heterogeneous
nodes
heterogeneous graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210546970.6A
Other languages
Chinese (zh)
Other versions
CN114900364B (en)
Inventor
黄永忠
欧阳规格
高一鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202210546970.6A
Publication of CN114900364A
Application granted
Publication of CN114900364B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of network security, and in particular to an advanced persistent threat (APT) detection method based on a tracing graph and a heterogeneous graph neural network. First, a heterogeneous graph representation learning technique learns a representation of the tracing graph in preparation for the subsequent classification task. Next, the output heterogeneous graph vectors are pooled hierarchically, gradually aggregating the representation information of the heterogeneous graph, and the aggregated information is used to judge whether the input tracing graph contains attack behavior. Finally, the classification result obtained from the heterogeneous graph representation is checked against the true label of the tracing graph. The method reduces the heavy dependence of the APT detection process on expert domain knowledge and is easy to extend to other areas of network attack detection. Because host activity is modeled with a cross-operating-system tracing graph structure, the method can be applied in complex enterprise environments, and the work of designing a separate tracing graph for each operating system is reduced.

Description

High-level continuous threat detection method based on tracing graph and heterogeneous graph neural network
Technical Field
The invention relates to the technical field of network security, and in particular to an advanced persistent threat (APT) detection method based on a tracing graph and a heterogeneous graph neural network.
Background
As informatization continues to advance, cyberspace is becoming ever more deeply intertwined with industry, national defense, and social life. Advanced Persistent Threat (APT) organizations attack enterprises and institutions in different fields for economic gain, to steal confidential information, or for political purposes. How to detect APT attacks accurately and respond quickly has become an active research problem in network security.
Existing tracing-graph-based detection methods mainly rely on label propagation algorithms, graph matching, and similar techniques. These techniques depend heavily on algorithms and rules designed from expert knowledge, require a large amount of domain knowledge, and are difficult to adapt to varied network and operating system environments. With the development of deep learning, reducing human intervention in the APT detection process is becoming increasingly important.
Disclosure of Invention
The invention aims to provide an advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network, in order to solve the problem that existing tracing-graph-based detection methods are difficult to adapt to varied network and operating system environments.
To achieve this purpose, the invention provides an advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network, comprising the following steps:
defining a cross-operating-system tracing graph framework, taking the basic definition of a system-level tracing graph as the guiding principle;
under the tracing graph framework, converting the logs generated by the host system into a tracing graph capable of modeling the running state of the system;
learning a representation for the tracing graph by using a heterogeneous graph representation learning technique to obtain a new heterogeneous graph representation;
performing hierarchical pooling on the vectors of the new heterogeneous graph, aggregating the original representation information of the new heterogeneous graph, and using the aggregated information to judge whether the tracing graph contains attack behavior;
and checking the classification result obtained from the heterogeneous graph representation against the true label of the tracing graph.
The tracing graph is a directed heterogeneous attribute graph, and the attributes it provides are symbolized and vectorized.
The step of gradually aggregating the original representation information of the new heterogeneous graph by performing hierarchical pooling on its vectors comprises:
mapping different types of nodes in the tracing graph to respective specific vector spaces;
calculating the multi-head attention between nodes, and the importance of all neighborhood nodes of a node to that node, by using the edges between the nodes;
calculating the importance degree of each node to the target node;
combining all message heads to obtain a message vector;
aggregating the messages to the target node, applying a linear mapping to the target node, and then mapping it back to the specific space of the target node through a nonlinear activation function and a residual connection;
obtaining the information of the nodes in the heterogeneous graph by repeating the above steps;
compressing the information in the heterogeneous graph through hierarchical pooling to obtain, after grouping, a vector representing the information of the heterogeneous graph.
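The steps listed above (type-specific mapping, multi-head attention, message computation, aggregation, and the residual update) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the weights are fixed placeholder matrices rather than trained parameters, ReLU stands in for the unspecified activation function, and all names (`hgt_layer`, `make_params`, `toy_matrix`, the node and edge labels) are hypothetical.

```python
import math

HEADS, DIM = 2, 2                      # toy sizes: 2 heads of width 2
HIDDEN = HEADS * DIM


def matvec(matrix, vec):
    """Matrix times column vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def vecmat(vec, matrix):
    """Row vector times matrix."""
    return [sum(v * row[c] for v, row in zip(vec, matrix))
            for c in range(len(matrix[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_matrix(rows, cols, scale):
    # Deterministic placeholder weights; a trained model would learn these.
    return [[scale * (1.0 if r == c else 0.1) for c in range(cols)]
            for r in range(rows)]


def make_params(node_types, edge_types):
    params = {"K": {}, "Q": {}, "M": {}, "A": {}, "W_att": {}, "W_msg": {}}
    for n, t in enumerate(node_types, start=1):
        for name in ("K", "Q", "M"):   # one projection per node type and head
            params[name][t] = [toy_matrix(DIM, HIDDEN, 0.5 * n + 0.1 * i)
                               for i in range(HEADS)]
        params["A"][t] = toy_matrix(HIDDEN, HIDDEN, 1.0)
    for n, e in enumerate(edge_types, start=1):   # one matrix per edge type and head
        params["W_att"][e] = [toy_matrix(DIM, DIM, 1.0 / n) for _ in range(HEADS)]
        params["W_msg"][e] = [toy_matrix(DIM, DIM, 1.0) for _ in range(HEADS)]
    return params


def hgt_layer(h, node_type, edges, params, prior):
    """One round of attention, message computation and aggregation.

    h: node -> hidden vector; edges: list of (source, edge_type, target);
    prior: (source_type, edge_type, target_type) -> scaling factor mu.
    """
    incoming = {}
    for s, e, t in edges:
        incoming.setdefault(t, []).append((s, e))
    new_h = dict(h)                    # nodes without incoming edges keep their state
    for t, sources in incoming.items():
        tt = node_type[t]
        aggregated = []
        for i in range(HEADS):
            q = matvec(params["Q"][tt][i], h[t])
            scores, messages = [], []
            for s, e in sources:
                st = node_type[s]
                k = matvec(params["K"][st][i], h[s])
                mu = prior.get((st, e, tt), 1.0)
                # Attention head: K * W_att * Q^T, scaled by mu / sqrt(d).
                att = dot(vecmat(k, params["W_att"][e][i]), q)
                scores.append(att * mu / math.sqrt(DIM))
                m = matvec(params["M"][st][i], h[s])
                messages.append(vecmat(m, params["W_msg"][e][i]))
            weights = softmax(scores)  # normalize over the neighbors of t
            aggregated.extend(sum(w * msg[j] for w, msg in zip(weights, messages))
                              for j in range(DIM))   # concatenate the heads
        mapped = matvec(params["A"][tt], aggregated)
        activated = [max(0.0, x) for x in mapped]    # ReLU as the activation
        new_h[t] = [a + r for a, r in zip(activated, h[t])]   # residual connection
    return new_h


node_type = {"p1": "process", "f1": "file", "n1": "netflow"}
h = {"p1": [1.0, 0.0, 0.5, 0.2], "f1": [0.3, 0.8, 0.1, 0.0], "n1": [0.0, 0.4, 0.9, 0.6]}
edges = [("f1", "read", "p1"), ("n1", "receive", "p1")]
params = make_params(["process", "file", "netflow"], ["read", "receive"])
prior = {("file", "read", "process"): 1.2}
for _ in range(2):                     # repeating the layer widens the receptive field
    h = hgt_layer(h, node_type, edges, params, prior)
print(len(h["p1"]))
```

Keeping a separate set of projection matrices per node type and per edge type is what lets the same layer handle a heterogeneous graph; a homogeneous graph network would share one set of weights across all nodes and edges.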
Compared with the prior art, the advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network has the following beneficial effects: it reduces the heavy dependence of the APT detection process on expert domain knowledge and is easy to extend to other areas of network attack detection; it models host activity with a cross-operating-system tracing graph structure, so that it can be applied in complex enterprise environments and the work of designing a separate tracing graph for each operating system is reduced; and by using a hierarchical pooling model, it improves detection accuracy and avoids the loss of classification accuracy caused by flattening graph data.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic step diagram of an advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without inventive effort shall fall within the scope of protection of the present invention.
Referring to Fig. 1 and Fig. 2, Fig. 1 is a schematic diagram of the steps of an advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network according to an embodiment of the present invention. Specifically, as shown in Fig. 1, the method may include the following steps:
S1, defining a cross-operating-system tracing graph framework, taking the basic definition of the system-level tracing graph as the guiding principle;
S2, under the tracing graph framework, converting the logs generated by the host system into a tracing graph capable of accurately modeling the running state of the system;
S3, learning a representation for the tracing graph by using a heterogeneous graph representation learning technique to obtain a new heterogeneous graph representation;
S4, performing hierarchical pooling on the vectors of the new heterogeneous graph, aggregating the original representation information of the new heterogeneous graph, and using the aggregated information to judge whether the tracing graph contains attack behavior;
and S5, checking the classification result obtained from the heterogeneous graph representation against the true label of the tracing graph.
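Step S5 compares the predicted class of each tracing graph with its true label. The text does not say which metrics are used, so the sketch below assumes the common choice of accuracy, precision, and recall, with label 1 meaning "contains an attack"; the function names and the sample labels are hypothetical.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count outcomes, treating label 1 as 'graph contains an attack'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(true_labels, predicted_labels):
    """Return accuracy, precision and recall, guarding against empty denominators."""
    tp, fp, tn, fn = confusion_counts(true_labels, predicted_labels)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Illustrative labels for six tracing graphs (made up for this example).
true_labels = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
report = summarize(true_labels, predicted)
print(report)
```

For attack detection, recall (missed attacks) and precision (false alarms) are usually more informative than accuracy alone, because benign graphs tend to far outnumber malicious ones.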
Specifically, a tracing graph framework applicable to various operating systems is defined, and its nodes are divided into three types: subject objects, which can initiate system activities; guest objects, which are the carriers of system activities; and descriptive node objects. Subject objects may be subdivided by their type attribute into process objects, thread objects, execution unit objects, and the like. Guest objects may be classified as file objects, memory objects, network stream objects, and the like. Descriptive node objects include dependency unit objects and executing user objects, as shown in Table 1. The richness of the edge types directly affects the semantic expressiveness of the tracing graph; the edges of the tracing graph comprise 56 types, including close events, object creation events, login events, mount events, read events, and the like. Adding attributes to the nodes and edges further enhances their expressiveness, so that the tracing graph can accurately depict the running state and details of the system. The well-defined tracing graph framework then guides the subsequent conversion of host-generated information, such as logs, into a tracing graph.
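One way to picture the framework described above is as a small typed graph structure that log records are loaded into. The sketch below is an assumption-laden illustration: the class names, the three-value `NodeKind` enum, and the log record keys (`subject_id`, `object_id`, `event`, `ts`, and so on) are hypothetical and are not taken from the patent, which does not specify a log format.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    SUBJECT = "subject"          # initiates activity: process, thread, execution unit
    GUEST = "guest"              # carrier of activity: file, memory, network stream
    DESCRIPTIVE = "descriptive"  # dependency unit, executing user


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: NodeKind
    subtype: str                 # e.g. "process", "file", "user"


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    event: str                   # one of the edge types, e.g. "read", "login"
    timestamp: int


@dataclass
class TracingGraph:
    nodes: dict = field(default_factory=dict)   # node_id -> Node
    edges: list = field(default_factory=list)   # directed, typed edges

    def add_event(self, src, dst, event, timestamp):
        # Register both endpoints once, then record the directed typed edge.
        self.nodes.setdefault(src.node_id, src)
        self.nodes.setdefault(dst.node_id, dst)
        self.edges.append(Edge(src.node_id, dst.node_id, event, timestamp))


def graph_from_log(records):
    """Build a graph from dict records that use the hypothetical keys below."""
    graph = TracingGraph()
    for rec in records:
        src = Node(rec["subject_id"], NodeKind.SUBJECT, rec["subject_type"])
        dst = Node(rec["object_id"], NodeKind.GUEST, rec["object_type"])
        graph.add_event(src, dst, rec["event"], rec["ts"])
    return graph


log = [
    {"subject_id": "p1", "subject_type": "process", "object_id": "f1",
     "object_type": "file", "event": "read", "ts": 1},
    {"subject_id": "p1", "subject_type": "process", "object_id": "n1",
     "object_type": "netflow", "event": "create_object", "ts": 2},
]
g = graph_from_log(log)
print(len(g.nodes), len(g.edges))
```

Because the node and edge types are data rather than code, the same structure can hold events from different operating systems, which is the point of defining one cross-operating-system framework.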
With a well-defined system-level tracing graph framework, the running state and details of the system can be described using the rich semantic knowledge and strong abstract expressiveness of the tracing graph. The information provided by the tracing graph is symbolized and vectorized in preparation for the subsequent heterogeneous graph representation learning.
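Symbolization and vectorization can be illustrated with a vocabulary lookup followed by one-hot encoding. The patent does not state which encoding it uses, so this is only one plausible reading; the attribute names (`subtype`, `user`) and helper names are hypothetical.

```python
def build_vocab(values):
    """Symbolization: assign each distinct attribute value an integer id."""
    vocab = {}
    for value in values:
        vocab.setdefault(value, len(vocab))
    return vocab


def one_hot(index, size):
    """Vectorization: turn a symbol id into a fixed-length vector."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_node(attrs, vocabs):
    """Concatenate the one-hot codes of each attribute, in sorted key order."""
    features = []
    for key in sorted(vocabs):
        vocab = vocabs[key]
        features.extend(one_hot(vocab[attrs[key]], len(vocab)))
    return features


nodes = [
    {"subtype": "process", "user": "root"},
    {"subtype": "file", "user": "root"},
    {"subtype": "process", "user": "alice"},
]

vocabs = {}
for key in ("subtype", "user"):
    vocabs[key] = build_vocab([node[key] for node in nodes])

vectors = []
for node in nodes:
    vectors.append(encode_node(node, vocabs))
print(vectors[2])
```

One-hot vectors grow with the vocabulary, so a practical system would more likely use learned embeddings or hashing for high-cardinality attributes such as file paths.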
For nodes s and t of different types in the tracing graph, s and t are mapped to their respective type-specific vector spaces to obtain $K^i(s)$ and $Q^i(t)$. Because the designed tracing graph is a heterogeneous graph, nodes of different types are mapped into different vector spaces, rather than mapping all nodes into one space, mainly to better capture the properties of the heterogeneous graph. The attention of the h heads between nodes s and t is calculated from the edge e between them using formula 1, and the importance of all neighborhood nodes t of node s to node s is calculated using formula 2.
Figure BDA0003649884650000041
Figure BDA0003649884650000042
Here, $\langle \tau(v_t), \phi(e), \tau(v_s) \rangle$ denotes the meta-relation between the nodes; the matrix shown in Figure BDA0003649884650000043 is an edge-based matrix that lets the model capture different semantic relations between the same pair of nodes; h is the total number of attention heads; and i denotes the i-th attention head.
Because different relations differ in their importance to the target node, a prior tensor μ is added to scale the attention. The importance of each node to the target node can then be calculated by formula 1 and formula 2.
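The images for formulas 1 and 2 are not reproduced in this text. The surrounding symbols ($K^i(s)$, $Q^i(t)$, the edge-based matrix, the prior tensor $\mu$, and the meta-relation triple) match the attention step of the Heterogeneous Graph Transformer (HGT). As an assumption rather than a transcription of the patent figures, the two formulas may therefore take the following form, written with t as the target node and s ranging over its neighborhood $N(t)$; the symbols $W^{\mathrm{ATT}}_{\phi(e)}$, $H^{(l-1)}$, $d$ (the per-head vector dimension), and $N(t)$ are introduced here for illustration:

```latex
\begin{aligned}
K^{i}(s) &= \text{K-Linear}^{i}_{\tau(s)}\big(H^{(l-1)}[s]\big), \qquad
Q^{i}(t) = \text{Q-Linear}^{i}_{\tau(t)}\big(H^{(l-1)}[t]\big) \\
\text{ATT-head}^{i}(s,e,t) &= \Big(K^{i}(s)\, W^{\mathrm{ATT}}_{\phi(e)}\, Q^{i}(t)^{\top}\Big)\cdot
\frac{\mu_{\langle \tau(s),\phi(e),\tau(t)\rangle}}{\sqrt{d}} \qquad (1) \\
\text{Attention}_{\mathrm{HGT}}(s,e,t) &= \underset{\forall s\in N(t)}{\operatorname{Softmax}}
\Big(\big\Vert_{i\in[1,h]}\, \text{ATT-head}^{i}(s,e,t)\Big) \qquad (2)
\end{aligned}
```

Read this way, formula 1 scores one source node against the target within a single head, and formula 2 normalizes those scores across all neighbors so that they can serve as importance weights.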
After source nodes of different types are mapped to different spaces, the multi-head messages for nodes s and t and the edge e between them are computed using the matrix in formula 3 (Figure BDA0003649884650000044). Finally, all message heads are combined using formula 4 to obtain a message vector, completing the message aggregation. The mapping shown in Figure BDA0003649884650000045 is a linear mapping that maps the source node s into the i-th message head vector.
Figure BDA0003649884650000046
Figure BDA0003649884650000047
$\text{Message}_{HGT}(s, e, t)$ combines all the heads from formula 3, where $\Vert$ denotes concatenating the $\text{MSG-head}^i(s, e, t)$ generated by formula 3.
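Formulas 3 and 4 appear only as image placeholders. Based on the names that do survive in the text (the message heads, the combined message, and a per-type linear mapping of the source node), a plausible reconstruction in HGT notation is given below. It is an assumption, and the symbols $\text{M-Linear}^{i}_{\tau(s)}$, $W^{\mathrm{MSG}}_{\phi(e)}$, and $H^{(l-1)}[s]$ are not confirmed by the source:

```latex
\begin{aligned}
\text{MSG-head}^{i}(s,e,t) &= \text{M-Linear}^{i}_{\tau(s)}\big(H^{(l-1)}[s]\big)\, W^{\mathrm{MSG}}_{\phi(e)} \qquad (3) \\
\text{Message}_{\mathrm{HGT}}(s,e,t) &= \big\Vert_{i\in[1,h]}\, \text{MSG-head}^{i}(s,e,t) \qquad (4)
\end{aligned}
```

The edge-type matrix in formula 3 plays the same role for messages that the edge-based matrix plays for attention: it lets two nodes exchange different information depending on the relation that links them.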
After the messages are aggregated to the target node by formula 5, the linear mapping $\text{A-Linear}_{\tau(t)}$ is applied to the target node, and the nonlinear activation function and residual connection of formula 6 then map the result back to the type-specific space of the target node t.
Figure BDA0003649884650000048
Figure BDA0003649884650000049
The symbol shown in Figure BDA00036498846500000410 refers to a weighted average of the message heads under the attention weights.
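The images for formulas 5 and 6 are likewise missing. Following the order stated in the text (aggregate, apply $\text{A-Linear}_{\tau(t)}$, then the activation and the residual connection), they may be read as shown below. This is a hedged reconstruction: $\widetilde{H}^{(l)}[t]$, $\oplus$, $\sigma$, and $N(t)$ are notational assumptions.

```latex
\begin{aligned}
\widetilde{H}^{(l)}[t] &= \underset{\forall s\in N(t)}{\oplus}
\Big(\text{Attention}_{\mathrm{HGT}}(s,e,t)\cdot \text{Message}_{\mathrm{HGT}}(s,e,t)\Big) \qquad (5) \\
H^{(l)}[t] &= \sigma\Big(\text{A-Linear}_{\tau(t)}\big(\widetilde{H}^{(l)}[t]\big)\Big) + H^{(l-1)}[t] \qquad (6)
\end{aligned}
```

The residual term $H^{(l-1)}[t]$ keeps the node's previous state in the update, which helps when the layer is stacked several times to reach more distant nodes.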
At this point, each node in the graph holds a contextual representation that aggregates the information of its neighborhood nodes. Repeating the above steps extends the distance over which messages propagate, so that a node can obtain information from most nodes in the graph. Finally, all information in the graph is compressed through hierarchical pooling to obtain, after grouping, a vector that represents the information of the whole graph, and this vector is used to judge whether the graph contains APT attack behavior.
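The pooling and classification step can be sketched as repeated grouping of node vectors until a single graph vector remains. The text does not name the pooling operator, so the example assumes fixed group assignments with mean pooling and a logistic scoring function as simple stand-ins; a learned pooling layer would replace both in practice. The group labels, weights, and function names are hypothetical.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors component by component."""
    size = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(size)]


def pool_level(embeddings, assignment):
    """Merge nodes that share a group id into one coarser node."""
    groups = {}
    for node, vec in embeddings.items():
        groups.setdefault(assignment[node], []).append(vec)
    return {gid: mean_pool(vecs) for gid, vecs in groups.items()}


def hierarchical_pool(embeddings, assignments):
    """Apply each grouping in turn, then average what is left into one vector."""
    current = embeddings
    for assignment in assignments:
        current = pool_level(current, assignment)
    return mean_pool(list(current.values()))


def attack_score(graph_vector, weights, bias):
    """Logistic score in (0, 1); values above 0.5 are read as 'attack'."""
    z = sum(w * x for w, x in zip(weights, graph_vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy node embeddings and a two-level grouping (made up for this example).
embeddings = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "f1": [0.5, 0.5], "n1": [1.0, 1.0]}
level1 = {"p1": "proc", "p2": "proc", "f1": "io", "n1": "io"}
level2 = {"proc": "host", "io": "host"}

graph_vector = hierarchical_pool(embeddings, [level1, level2])
score = attack_score(graph_vector, weights=[0.8, -0.3], bias=0.1)
print(graph_vector, score > 0.5)
```

Pooling in stages preserves some of the graph's grouping structure on the way to a single vector, whereas averaging all nodes at once would discard it.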
The advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network reduces the heavy dependence of the APT detection process on expert domain knowledge and is easy to extend to other areas of network attack detection. Because host activity is modeled with a cross-operating-system tracing graph structure, the method can be applied in complex enterprise environments, and the work of designing a separate tracing graph for each operating system is reduced. In addition, the hierarchical pooling model improves detection accuracy and avoids the loss of classification accuracy caused by flattening graph data.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (3)

1. An advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network, characterized by comprising the following steps:
defining a cross-operating-system tracing graph framework, taking the basic definition of a system-level tracing graph as the guiding principle;
under the tracing graph framework, converting logs generated by a host system into a tracing graph capable of modeling the running state of the system;
learning a representation for the tracing graph by using a heterogeneous graph representation learning technique to obtain a new heterogeneous graph representation;
performing hierarchical pooling on the vectors of the new heterogeneous graph, aggregating the original representation information of the new heterogeneous graph, and using the aggregated information to judge whether the tracing graph contains attack behavior;
and checking the classification result obtained from the heterogeneous graph representation against the true label of the tracing graph.
2. The advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network according to claim 1, wherein
the tracing graph is a directed heterogeneous attribute graph, and the attributes it provides are symbolized and vectorized.
3. The advanced persistent threat detection method based on a tracing graph and a heterogeneous graph neural network according to claim 2, wherein the step of gradually aggregating the original representation information of the new heterogeneous graph through hierarchical pooling of the vectors of the new heterogeneous graph comprises:
mapping different types of nodes in the tracing graph to respective specific vector spaces;
calculating the multi-head attention between nodes, and the importance of all neighborhood nodes of a node to that node, by using the edge vectors between the nodes;
calculating the importance degree of each node to the target node;
combining all message heads to obtain a message vector;
aggregating the messages to the target node, applying a linear mapping to the target node, and then mapping it back to the specific space of the target node through a nonlinear activation function and a residual connection;
because the above steps need to be repeated, mapping all the nodes back to the specific space of their node type;
obtaining the information of the nodes in the heterogeneous graph by repeating the above steps;
compressing the information in the heterogeneous graph through hierarchical pooling to finally obtain a vector representing the information of the heterogeneous graph.
CN202210546970.6A 2022-05-18 2022-05-18 Advanced continuous threat detection method based on traceability graph and heterogeneous graph neural network Active CN114900364B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210546970.6A CN114900364B (en) 2022-05-18 2022-05-18 Advanced continuous threat detection method based on traceability graph and heterogeneous graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210546970.6A CN114900364B (en) 2022-05-18 2022-05-18 Advanced continuous threat detection method based on traceability graph and heterogeneous graph neural network

Publications (2)

Publication Number Publication Date
CN114900364A true CN114900364A (en) 2022-08-12
CN114900364B CN114900364B (en) 2024-03-08

Family

ID=82723117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210546970.6A Active CN114900364B (en) 2022-05-18 2022-05-18 Advanced continuous threat detection method based on traceability graph and heterogeneous graph neural network

Country Status (1)

Country Link
CN (1) CN114900364B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116074092A (en) * 2023-02-07 2023-05-05 电子科技大学 Attack scene reconstruction system based on heterogram attention network

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110958220A (en) * 2019-10-24 2020-04-03 中国科学院信息工程研究所 Network space security threat detection method and system based on heterogeneous graph embedding
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
CN111737551A (en) * 2020-05-26 2020-10-02 国家计算机网络与信息安全管理中心 Dark network cable detection method based on special-pattern attention neural network
CN112465066A (en) * 2020-12-14 2021-03-09 西安交通大学 Graph classification method based on clique matching and hierarchical pooling
CN112528275A (en) * 2020-11-23 2021-03-19 浙江工业大学 APT network attack detection method based on meta-path learning and sub-graph sampling
WO2021179838A1 (en) * 2020-03-10 2021-09-16 支付宝(杭州)信息技术有限公司 Prediction method and system based on heterogeneous graph neural network model
CN113676484A (en) * 2021-08-27 2021-11-19 绿盟科技集团股份有限公司 Attack tracing method and device and electronic equipment
CN113935028A (en) * 2021-11-12 2022-01-14 绿盟科技集团股份有限公司 Method and device for identifying attack behaviors
US20220150268A1 (en) * 2019-03-27 2022-05-12 British Telecommunications Public Limited Company Pre-emptive computer security

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
US20220150268A1 (en) * 2019-03-27 2022-05-12 British Telecommunications Public Limited Company Pre-emptive computer security
CN110958220A (en) * 2019-10-24 2020-04-03 中国科学院信息工程研究所 Network space security threat detection method and system based on heterogeneous graph embedding
WO2021179838A1 (en) * 2020-03-10 2021-09-16 支付宝(杭州)信息技术有限公司 Prediction method and system based on heterogeneous graph neural network model
CN111737551A (en) * 2020-05-26 2020-10-02 国家计算机网络与信息安全管理中心 Dark network cable detection method based on special-pattern attention neural network
CN112528275A (en) * 2020-11-23 2021-03-19 浙江工业大学 APT network attack detection method based on meta-path learning and sub-graph sampling
CN112465066A (en) * 2020-12-14 2021-03-09 西安交通大学 Graph classification method based on clique matching and hierarchical pooling
CN113676484A (en) * 2021-08-27 2021-11-19 绿盟科技集团股份有限公司 Attack tracing method and device and electronic equipment
CN113935028A (en) * 2021-11-12 2022-01-14 绿盟科技集团股份有限公司 Method and device for identifying attack behaviors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MARK CHEUNG等: "《Pooling in Graph Convolutional Neural Networks》", 2019 53RD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, pages 462 - 466 *
黄易: "《基于图神经网络的APT攻击报告分析方法研究》", 硕士电子期刊, vol. 2022, no. 5, pages 35 - 49 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116074092A (en) * 2023-02-07 2023-05-05 电子科技大学 Attack scene reconstruction system based on heterogram attention network
CN116074092B (en) * 2023-02-07 2024-02-20 电子科技大学 Attack scene reconstruction system based on heterogram attention network

Also Published As

Publication number Publication date
CN114900364B (en) 2024-03-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant