CN113094707B - Lateral movement attack detection method and system based on heterogeneous graph network - Google Patents

Info

Publication number
CN113094707B
Authority
CN
China
Prior art keywords
user
login
host
path
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110347685.7A
Other languages
Chinese (zh)
Other versions
CN113094707A (en)
Inventor
卢志刚
王天
姜波
刘俊荣
刘松
董璞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN202110347685.7A
Publication of CN113094707A
Application granted
Publication of CN113094707B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/44 Program or device authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Virology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention relates to a lateral movement attack detection method and system based on a heterogeneous graph network. Starting from the authentication logs of an intranet, the method structures the login behavior between users and hosts as graphs, constructing a user login graph and a source host path graph, and then performs two-stage anomaly detection on these graphs. The first stage operates on the user login graph: a graph neural network algorithm that maximizes mutual information learns the behavior patterns of hosts, and a local outlier factor algorithm then yields a small number of anomalous samples. The second stage operates on the source host path graph and the labeled samples obtained in the first stage: a heterogeneous graph attention network algorithm performs semi-supervised learning to detect lateral movement attack behavior. The method detects lateral movement attack behavior simply and effectively without sample labels, outperforms most supervised learning methods, and achieves a high recall rate and a low false alarm rate.

Description

Lateral movement attack detection method and system based on heterogeneous graph network
Technical Field
The invention relates to the field of computer network security and is used to defend against the lateral movement attacks carried out in advanced persistent threats; in particular, it relates to a lateral movement attack detection method and system based on a heterogeneous graph network.
Background
In recent years, with the rapid development of the internet, the network environment has become increasingly complex and network attacks have become more frequent. Among these, advanced persistent threats (Advanced Persistent Threat, APT) have benefited from advances in attack methodology and better-organized attackers, and are occurring more and more often. Compared with other attacks, APT attacks have a longer latency period and greater destructive power. Their methods are comprehensive, and attackers can develop customized attack tools through long-term observation of the target, which makes the threat severe. Detection of and protection against APT attacks has therefore become a major issue in network security.
Lateral movement is a critical link in an APT attack and is the main process by which an attacker carries out the attack after entering the intranet. According to the ATT&CK framework, lateral movement consists of the techniques an attacker uses to access and control remote systems on a network. After successfully intruding into the network and establishing a foothold, the attacker usually moves laterally through the network to prepare the next attack and to collect information about the target network, eventually gaining control of the entire network and achieving goals such as destroying the target network or infrastructure or stealing confidential data or core intellectual property. The resulting harm is severe.
At present, lateral movement attack detection is still at a relatively early stage. Existing research mainly recasts the problem as detecting anomalous users or hosts in the intranet: the behavior of users or hosts is modeled, and behavior that exceeds a threshold is flagged as anomalous. According to the detection target, the methods fall into two types, moving-target and moving-path. Moving-target methods detect the users or hosts that the attacker has compromised during the lateral movement attack; moving-path methods take the movement path produced during the attack as the detection target. Most existing work focuses on moving-target detection, and moving-path detection has received less attention.
In summary, lateral movement attacks usually steal user credentials and masquerade as normal users, so they are highly covert and hard to detect. Existing detection methods generally recast the problem as detecting anomalous users or hosts in the intranet, but they still have the following shortcomings: first, massive multi-source security logs lead to high false alarm rates; second, in real network environments either no anomalous users or hosts can be observed or only a few can, and the few that are observed are not fully exploited; third, an intranet is essentially a relationship graph of users and hosts, yet detecting lateral movement attacks on such a graph remains to be studied.
Disclosure of Invention
To solve the above problems, the invention proposes HGLM (Lateral Movement detection using Heterogeneous Graph), a two-stage lateral movement attack detection method based on a heterogeneous graph network.
The principle of the invention is as follows. Starting from the authentication logs of the intranet, the login behavior between users and hosts is structured as graphs: a user login graph and a source host path graph are constructed, and two-stage anomaly detection is then performed on them. The first stage operates on the user login graph: a graph neural network algorithm that maximizes mutual information learns the behavior patterns of hosts, and a local outlier factor algorithm then yields a small number of anomalous samples. The second stage operates on the source host path graph and the labeled samples obtained in the first stage: a heterogeneous graph attention network algorithm performs semi-supervised learning to detect lateral movement attack behavior.
To achieve this purpose, the invention adopts the following technical scheme:
A lateral movement attack detection method based on a heterogeneous graph network, comprising the following steps:
1) Data set extraction. Because a lateral movement attack involves login authentication behavior between users and hosts, data set extraction means collecting the authentication logs generated by intranet devices and constructing a data set from them.
2) Security log graph structuring. A user login graph and a source host path graph are constructed from the extracted data set.
3) Abnormal login behavior detection based on unsupervised learning. Unsupervised abnormal login behavior detection is performed on the user login graph. This is the first stage of the HGLM two-stage anomaly detection.
4) Lateral movement attack detection based on semi-supervised learning. Semi-supervised lateral movement attack detection is performed on the source host path graph using the small number of labeled samples from the first stage. This is the second stage of the HGLM two-stage anomaly detection.
Further, security log graph structuring comprises three parts: data preprocessing, user login graph construction, and source host path graph construction.
A) The first step of log graph structuring is to preprocess the authentication log of the intranet. An authentication log typically contains attributes such as authentication time, source user, target user, source host, target host, and authentication status. The raw log information is redundant and heterogeneous, so it must be processed into a format that fits the lateral movement attack scenario. First, since an attacker typically uses a compromised user to move laterally from one host to another, only authentication events whose source user and target user are the same need to be considered. Second, a lateral movement attack involves at least two hosts, so authentication events whose source host and target host are the same are filtered out. In summary, the preprocessing flow is as follows: traverse each authentication event in the authentication log data set D and keep the events whose source user and target user are the same and whose source host and target host are different, obtaining the processed data set D_1.
B) The user login graph (User Authentication Graph, UAG) is an undirected homogeneous graph that represents the login behavior pattern of a user between hosts within a certain period of time. Define the graph G_u = (V, E, F), where the nodes V represent hosts and the edges E represent the user's login connections between hosts. The number of logins of the user on each host within each sliding window is assigned to the nodes as the feature F; the edges carry no features and only represent connectivity. This yields a user login graph with features. Specifically, given the data set D_1, a user u, and a sliding window length L, the authentication events belonging to user u are first screened out of D_1 to obtain the data set D_u. Next, the data are divided into multiple time windows according to the sliding window length L, and the login count feature F of the user on each host is computed for each window. Finally, each authentication event in D_u is traversed: the source host and the target host are added to the node set V, the edge connecting the source host and the target host is added to E (nodes and edges that already exist are ignored), and the login counts of the source host and the target host in F are each incremented by one for the corresponding window. When the traversal ends, the user login graph G_u = (V, E, F) with features for user u is obtained.
C) The source host path graph (Host Path Graph, HPG) is a directed heterogeneous graph that represents the association between a user's login path to a target host and the source host. Define the graph G_p = (V, E, F). The graph has two types of nodes: one type represents a source host, V_src, and the other represents a login path from a user to a target host, V_path. It also has two types of edges. One is the send edge E_send, which points from a source host node to a login path node and indicates that the user logs in to the target host from that source host. The other is the depend-on edge E_on, which points from a login path node to a source host node and indicates that the user's login path to the target host occurs on that source host. The two types of edges are symmetric. The number of occurrences of the login path on the source host within each sliding window and the statistical features F_statistic are assigned to the nodes, and the edges only represent connectivity; this yields the source host path graph. Specifically, given the data set D_1, the sliding window length L, and the statistical features F_statistic, each event in D_1 is traversed: the source host is added to V_src; the user and the target host are concatenated into a login path, which is added to V_path as a node; the edge pointing from the source host to the login path is added to E_send, and the edge pointing from the login path to the source host is symmetrically added to E_on; and the sliding-window count feature is computed in the same way as for the user login graph. Finally, the login path nodes V_path in the graph are traversed and the statistical features F_statistic are added to them, while the source host nodes V_src are given one-hot encoded features, obtaining the source host path graph G_p = (V, E, F) with features.
The statistical features used are: the numbers of successful and failed authentications of the user to the target host; the ratio of the user's authentications to the target host to the user's total number of authentications; and the minimum, maximum, and mean time interval between successive authentication events from the user to the target host.
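The statistical features listed above can be computed per login path roughly as follows. This is a minimal sketch, not the patented implementation: the tuple layout `(time, user, dst_host, success)`, the function name `path_statistics`, and the choice of 0.0 for interval features when fewer than two events exist are all assumptions made for illustration.

```python
from statistics import mean

def path_statistics(events, user, dst_host):
    """Statistical features of the login path (user -> dst_host).

    events: iterable of (time, user, dst_host, success) tuples (assumed layout).
    Returns [n_success, n_fail, ratio, gap_min, gap_max, gap_mean].
    """
    user_events = [e for e in events if e[1] == user]
    path_events = sorted((e for e in user_events if e[2] == dst_host),
                         key=lambda e: e[0])
    n_success = sum(1 for e in path_events if e[3])
    n_fail = len(path_events) - n_success
    # Share of this user's authentications that target dst_host.
    ratio = len(path_events) / len(user_events) if user_events else 0.0
    # Intervals between successive events on this login path.
    times = [e[0] for e in path_events]
    gaps = [b - a for a, b in zip(times, times[1:])]
    if gaps:
        gap_min, gap_max, gap_mean = min(gaps), max(gaps), mean(gaps)
    else:
        gap_min = gap_max = gap_mean = 0.0  # assumption: no interval available
    return [n_success, n_fail, ratio, gap_min, gap_max, gap_mean]

# Hypothetical toy log.
events = [
    (0, "U1", "C2", True),
    (10, "U1", "C2", False),
    (40, "U1", "C2", True),
    (5, "U1", "C3", True),
]
stats = path_statistics(events, "U1", "C2")
```

In this toy log, three of the four events of `U1` target `C2`, so the ratio is 0.75 and the intervals between successive events are 10 and 30 time units.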
Further, abnormal login behavior detection based on unsupervised learning proceeds as follows. Based on the user login graph, a graph neural network algorithm that maximizes mutual information (Deep Graph Infomax, DGI) is first used to learn the behavior patterns of hosts; that is, the hidden-layer feature representation of the samples is obtained by training to maximize the mutual information between the local features h and the global feature s. Specifically, the feature vector of each node in the graph is passed through a graph convolutional encoder to learn that node's local feature h, and the global feature s is obtained through an average readout function. Negative samples are then obtained by perturbing the nodes with random shuffling, a discriminator scores the sample pairs formed from h and s, and the hidden-layer representation of each node is finally obtained. Based on the sample representations learned by DGI, detection is performed with the local outlier factor algorithm (Local Outlier Factor, LOF); a small number of labeled host samples are obtained by setting a threshold and are combined with the corresponding users to form user-to-target-host login paths, which are used for semi-supervised learning in the second stage.
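The patent names DGI but does not write out its training objective. As a reading aid, the block below restates the objective in the form commonly given for Deep Graph Infomax, using the symbols of Fig. 1 (X, X', H, H', R, S, D). The encoder symbol \mathcal{E}, the adjacency matrix A, the bilinear weight W, and the sigmoid \sigma inside the readout are assumptions taken from the usual DGI formulation and do not appear in the patent text.

```latex
\begin{aligned}
% Encoder: local features of the real and the shuffled (negative) inputs
H &= \mathcal{E}(X, A), \qquad H' = \mathcal{E}(X', A) \\
% Average readout: global summary of the real graph
s &= R(H) = \sigma\!\left(\frac{1}{N}\sum_{i=1}^{N} h_i\right) \\
% Discriminator score of a (local, global) pair
D(h_i, s) &= \sigma\!\left(h_i^{\top} W s\right) \\
% Objective maximised in training, with N positive and N negative pairs
\mathcal{L} &= \frac{1}{2N}\sum_{i=1}^{N}
  \Big[\log D(h_i, s) + \log\big(1 - D(h'_i, s)\big)\Big]
\end{aligned}
```

Maximising \mathcal{L} pushes the discriminator to score real pairs (h_i, s) high and shuffled pairs (h'_i, s) low, which is how the mutual information between local and global features is increased.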
Further, lateral movement attack detection based on semi-supervised learning proceeds as follows. Based on the source host path graph and the small number of labeled samples from the first stage, semi-supervised learning on the graph is performed with the heterogeneous graph attention network algorithm (Heterogeneous graph Attention Network, HAN), and more lateral movement attack behavior is detected by learning the associations between login path nodes. HAN introduces attention mechanisms into heterogeneous graphs, namely node-level attention and semantic-level attention. With meta-paths defined on the graph, node-level attention learns the weights of a node's neighbors along each meta-path, while semantic-level attention learns the weights of the different meta-paths. The final node representation is then obtained through the corresponding aggregation operations. Specifically, two meta-paths are defined in the graph: meta-path p_1 (v_path, e_on, v_src), from a path node to a source host node, and meta-path p_2 (v_path, e_on, v_src, e_send, v_path), from a path node to a source host node to a path node. Based on these two meta-paths, node-level and semantic-level attention features are computed, and semi-supervised learning is performed using the labeled samples from the first stage with the cross-entropy loss as the objective, thereby detecting lateral movement attack behavior.
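The patent refers to node-level attention, semantic-level attention, and a cross-entropy objective without writing them out. The block below sketches them in the form usually given for HAN, for a meta-path \Phi \in \{p_1, p_2\}. The projected features h'_i, the attention vectors a_\Phi and q, the parameters W, b, C, and the labeled node set \mathcal{Y}_L are assumptions drawn from the common HAN formulation, not from the patent text.

```latex
\begin{aligned}
% Node-level attention over the meta-path neighbours N_i^{\Phi} of node i
\alpha_{ij}^{\Phi} &= \frac{\exp\!\big(\sigma(a_{\Phi}^{\top}[h'_i \,\|\, h'_j])\big)}
  {\sum_{k \in N_i^{\Phi}} \exp\!\big(\sigma(a_{\Phi}^{\top}[h'_i \,\|\, h'_k])\big)},
\qquad
z_i^{\Phi} = \sigma\!\Big(\sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi}\, h'_j\Big) \\
% Semantic-level attention over meta-paths
w_{\Phi} &= \frac{1}{|V|}\sum_{i \in V} q^{\top}\tanh\!\big(W z_i^{\Phi} + b\big),
\qquad
\beta_{\Phi} = \frac{\exp(w_{\Phi})}{\sum_{\Phi'} \exp(w_{\Phi'})} \\
% Final representation and semi-supervised cross-entropy loss
Z &= \sum_{\Phi} \beta_{\Phi}\, Z_{\Phi},
\qquad
\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} Y^{l} \ln\!\big(C \cdot Z^{l}\big)
\end{aligned}
```

Here Z_{\Phi} stacks the z_i^{\Phi} of all nodes, which corresponds to Z_1 to Z_p in Fig. 1, and only the nodes labeled in the first stage contribute to the loss.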
Based on the same inventive concept, the invention also provides a lateral movement attack detection system based on a heterogeneous graph network, comprising:
a data acquisition module, used to collect the authentication logs generated by intranet devices and construct a data set;
a security log graph structuring module, used to construct a user login graph and a source host path graph from the data set;
an unsupervised abnormal login behavior detection module, used to perform abnormal login behavior detection based on unsupervised learning on the user login graph;
and a lateral movement attack detection module, used to perform lateral movement attack detection based on semi-supervised learning on the source host path graph, using the labeled samples obtained by the unsupervised abnormal login behavior detection.
Compared with the prior art, the invention has the following beneficial effects:
The method detects lateral movement attack behavior simply and effectively without sample labels. On the public data set CMCS EVENTS the AUC exceeds 95%, and for some users the TPR reaches 100% with an FPR of 0. The method outperforms most supervised learning methods, with a high recall rate and a low false alarm rate.
Drawings
Fig. 1 is the overall flow chart of lateral movement attack detection based on a heterogeneous graph network according to the invention. X denotes the initial features of the positive samples, X' the initial features of the perturbed negative samples, H the hidden-layer features of the positive samples after graph convolution, H' the hidden-layer features of the negative samples after graph convolution, D the discriminator, R the average readout function, S the global feature computed by the average readout function, Z_1 to Z_p the hidden-layer features obtained by node-level attention, and Z the hidden-layer features obtained by semantic-level attention.
Fig. 2 is a flowchart of the construction of the user login graph in the present invention.
FIG. 3 is a flow chart of the construction of a source host path graph in the present invention.
FIG. 4 is a flow chart of abnormal login behavior detection based on unsupervised learning in the present invention.
Fig. 5 is a flow chart of lateral movement attack detection based on semi-supervised learning in the present invention.
Fig. 6 shows the detection performance of the HGLM method of the present invention for different users on the public data set CMCS EVENTS.
Detailed Description
To make the technical solution in the embodiments of the invention easier to understand and its objects, features, and advantages clearer, the technical core of the invention is described in further detail below with reference to the accompanying drawings and examples.
The invention discloses a lateral movement attack detection method based on a heterogeneous graph network. As shown in Fig. 1, the method comprises four parts: data acquisition, security log graph structuring, abnormal login behavior detection based on unsupervised learning, and lateral movement attack detection based on semi-supervised learning. The main steps are as follows:
Step 100 is data set extraction, that is, collecting authentication logs generated by intranet devices for a period of time, to form a data set.
Step 200 is security log graph structuring, and mainly comprises three parts of data preprocessing, user login graph construction and source host path graph construction.
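The data preprocessing part of step 200 keeps only events whose source and target users match and whose source and target hosts differ. A minimal sketch of that filter follows; the `AuthEvent` record, its field names, and the function name `preprocess` are hypothetical, chosen to mirror the log attributes the description lists.

```python
from typing import Iterable, List, NamedTuple

class AuthEvent(NamedTuple):
    """One authentication log record (field names are assumptions)."""
    time: int
    src_user: str
    dst_user: str
    src_host: str
    dst_host: str
    success: bool

def preprocess(events: Iterable[AuthEvent]) -> List[AuthEvent]:
    """Build D_1 from D: same source/target user, different source/target host."""
    return [
        e for e in events
        if e.src_user == e.dst_user and e.src_host != e.dst_host
    ]

# Hypothetical raw data set D.
D = [
    AuthEvent(1, "U1", "U1", "C1", "C2", True),  # kept
    AuthEvent(2, "U1", "U2", "C1", "C2", True),  # dropped: users differ
    AuthEvent(3, "U1", "U1", "C1", "C1", True),  # dropped: same host
]
D1 = preprocess(D)
```

Only the first toy event survives, since it is the only one that could represent one user moving between two distinct hosts.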
The construction of the user login graph is shown in Fig. 2.
Step 210: given the data set D_1, first define a user u and a sliding window length L, which are used to screen the authentication events belonging to user u in D_1 and to compute the user's login count feature F on each host for the different windows.
Step 220: traverse the data set D_1.
Step 230: screen out the authentication events in D_1 that belong to user u.
Step 240: add the source host and the target host of the authentication event to the node set V of the graph, and add the edge connecting the source host and the target host to E (nodes and edges that already exist are ignored).
Step 250: compute the user's login count feature F on each host for the different windows, incrementing the login counts of the source host and the target host by one for the corresponding window.
When the traversal ends, the user login graph G_u = (V, E, F) with features for user u is obtained.
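Steps 210 to 250 can be sketched as the loop below. This is an illustration under stated assumptions rather than the patented implementation: events are `(time, user, src_host, dst_host)` tuples, windows are taken as consecutive non-overlapping intervals of length L, and the names `build_user_login_graph`, `window_len`, and `num_windows` are hypothetical.

```python
from collections import defaultdict

def build_user_login_graph(events, user, window_len, num_windows):
    """Build G_u = (V, E, F) for one user from the preprocessed data set D_1."""
    nodes = set()                                     # V: hosts
    edges = set()                                     # E: undirected host pairs
    feats = defaultdict(lambda: [0] * num_windows)    # F: per-window login counts
    for t, u, src, dst in events:
        if u != user:                                 # step 230: keep only user u
            continue
        w = min(t // window_len, num_windows - 1)     # window index (clipped)
        nodes.update((src, dst))                      # step 240: add nodes
        edges.add(frozenset((src, dst)))              # set semantics ignore repeats
        feats[src][w] += 1                            # step 250: count logins
        feats[dst][w] += 1
    return nodes, edges, dict(feats)

# Hypothetical toy data set D_1.
events = [
    (0, "U1", "C1", "C2"),
    (5, "U1", "C1", "C3"),
    (12, "U1", "C2", "C3"),
    (3, "U2", "C1", "C4"),
]
V, E, F = build_user_login_graph(events, "U1", window_len=10, num_windows=2)
```

Storing each edge as a `frozenset` makes the graph undirected, so a login from C1 to C2 and one from C2 to C1 map to the same edge.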
The construction of the source host path graph is shown in Fig. 3.
Step 260: given the data set D_1, first define the sliding window length L and the statistical features F_statistic to be extracted.
Step 270: traverse each event in D_1; add the source host to V_src; concatenate the user and the target host into a login path and add it to V_path as a node; add the edge pointing from the source host to the login path to E_send, and symmetrically add the edge pointing from the login path to the source host to E_on.
Step 280: compute the occurrence count feature F of the login path for the different windows, incrementing the occurrence count of the login path by one for the corresponding window.
Step 290: add the corresponding statistical features F_statistic to the login path node.
When the traversal ends, the source host nodes V_src in the graph are given one-hot encoded features, and the source host path graph G_p = (V, E, F) with features is obtained.
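Steps 260 to 290 can be sketched in the same style. The sketch assumes `(time, user, src_host, dst_host)` tuples and non-overlapping windows, counts login path occurrences per window without splitting them by source host, and leaves out the statistical features F_statistic of step 290; the function name and dictionary keys are hypothetical.

```python
from collections import defaultdict

def build_host_path_graph(events, window_len, num_windows):
    """Build the node sets, edge sets, and count features of G_p from D_1."""
    v_src, v_path = set(), set()
    e_send, e_on = set(), set()
    counts = defaultdict(lambda: [0] * num_windows)
    for t, user, src, dst in events:
        path = (user, dst)                    # login path: user -> target host
        v_src.add(src)
        v_path.add(path)
        e_send.add((src, path))               # send edge: source host -> path
        e_on.add((path, src))                 # depend-on edge: path -> source host
        w = min(t // window_len, num_windows - 1)
        counts[path][w] += 1                  # step 280: per-window occurrences
    hosts = sorted(v_src)                     # fixed order for one-hot encoding
    one_hot = {h: [int(i == j) for j in range(len(hosts))]
               for i, h in enumerate(hosts)}
    return {"V_src": v_src, "V_path": v_path, "E_send": e_send,
            "E_on": e_on, "F_count": dict(counts), "one_hot": one_hot}

# Hypothetical toy data set D_1.
events = [
    (0, "U1", "C1", "C2"),
    (4, "U1", "C1", "C2"),
    (15, "U1", "C3", "C2"),
    (7, "U2", "C1", "C4"),
]
g = build_host_path_graph(events, window_len=10, num_windows=2)
```

Because the login path `("U1", "C2")` is reached from both C1 and C3, it ends up with two incoming send edges, which is the association the second stage later exploits.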
Step 300 is two-stage anomaly detection: the first stage is abnormal login behavior detection based on unsupervised learning, and the second stage is lateral movement attack detection based on semi-supervised learning.
Abnormal login behavior detection based on unsupervised learning is shown in Fig. 4.
Step 310: based on the user login graph, first use DGI to learn the behavior patterns of hosts, perturbing the nodes by random shuffling to obtain negative samples.
Step 320: the feature vector of each node in the graph is passed through a graph convolutional encoder to learn that node's local feature h, and the global feature s is obtained through an average readout function. With maximization of the mutual information between local and global features as the objective, a discriminator scores the positive and negative sample pairs formed from h and s, yielding the hidden-layer representation of each node.
Step 330: based on the sample representations obtained by DGI, perform detection with the local outlier factor algorithm, and obtain a small number of labeled host samples by setting a threshold.
Finally, the labeled host samples are combined with the corresponding users to form user-to-target-host login path training samples for semi-supervised learning in the second stage.
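Step 330 can be illustrated with a small self-contained local outlier factor computation. This is a plain-Python sketch of the standard LOF definition, not the patent's code: the 2-D points stand in for DGI embeddings, `k` and the threshold value 2.0 are hypothetical, each point uses exactly its k nearest neighbours (distance ties are not expanded), and the data are assumed to contain no duplicate points.

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor of each point, using its k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point (excluding itself).
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    k_dist = [dist[i][knn[i][-1]] for i in range(n)]   # distance to k-th neighbour

    def local_reach_density(i):
        # reach-dist(i, o) = max(k-distance(o), d(i, o))
        reach = [max(k_dist[o], dist[i][o]) for o in knn[i]]
        return len(reach) / sum(reach)

    lrd = [local_reach_density(i) for i in range(n)]
    # LOF(i) = mean density of the neighbours divided by the density of i.
    return [sum(lrd[o] for o in knn[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical embeddings: four clustered hosts and one isolated host.
embeddings = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(embeddings, k=2)
threshold = 2.0                                        # assumed cut-off
anomalous = [i for i, s in enumerate(scores) if s > threshold]
```

Scores close to 1 mean a host is about as densely surrounded as its neighbours; the isolated point receives a much larger score and is the only one labeled anomalous here.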
Lateral movement attack detection based on semi-supervised learning is shown in Fig. 5.
Step 340: based on the source host path graph and the small number of labeled samples from the first stage, perform semi-supervised learning on the graph with HAN. First, two meta-paths are defined: meta-path p_1 (v_path, e_on, v_src), from a path node to a source host node, and meta-path p_2 (v_path, e_on, v_src, e_send, v_path), from a path node to a source host node to a path node.
Step 350: based on the two meta-paths, compute the node-level and semantic-level attention features, perform semi-supervised learning using the labeled samples from the first stage with the cross-entropy loss as the objective, and detect lateral movement attack behavior.
Finally, the anomalous samples detected in the first stage and the second stage are combined to give the lateral movement attack behavior detected by the HGLM model.
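The meta-path neighbourhoods used in steps 340 and 350 can be illustrated as follows. The sketch derives, for each login path node, its p_1 neighbours (source hosts) and its p_2 neighbours (login paths that share a source host) from the depend-on edges alone, relying on E_send being the mirror of E_on. The function name and the inclusion of a node in its own p_2 neighbourhood are assumptions.

```python
from collections import defaultdict

def metapath_neighbors(e_on):
    """Return (p1, p2) neighbour maps for the login path nodes.

    e_on: iterable of (path_node, src_host) depend-on edges.
    """
    hosts_of = defaultdict(set)    # p_1: path -> source hosts
    paths_on = defaultdict(set)    # reverse direction, i.e. the send edges
    for path, host in e_on:
        hosts_of[path].add(host)
        paths_on[host].add(path)
    # p_2: path -> source host -> path (includes the start node itself).
    p2 = {path: set().union(*(paths_on[h] for h in hosts))
          for path, hosts in hosts_of.items()}
    return dict(hosts_of), p2

# Hypothetical depend-on edges of a small source host path graph.
e_on = [
    (("U1", "C2"), "C1"),
    (("U1", "C3"), "C1"),
    (("U2", "C3"), "C4"),
]
p1, p2 = metapath_neighbors(e_on)
```

Two login paths launched from the same source host become p_2 neighbours, so a label assigned to one of them in the first stage can influence the other during semi-supervised training.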
Experiments on the public data set CMCS EVENTS show that the AUC of the HGLM method exceeds 95%, and that for some users the TPR reaches 100% with an FPR of 0; the method outperforms most supervised learning methods, with a high recall rate and a low false alarm rate. The experimental results are shown in Table 1. Compared with existing methods, HGLM is simple and effective, requires no sample labels, and can outperform most supervised detection methods. The detection performance of the model for different users is shown in Fig. 6: for most users the recall rate exceeds 95% and the false alarm rate is below 5%.
Table 1. Performance comparison of lateral movement attack detection models
Based on the same inventive concept, another embodiment of the present invention provides a lateral movement attack detection system based on a heterogeneous graph network, comprising:
a data acquisition module, used to collect the authentication logs generated by intranet devices and construct a data set;
a security log graph structuring module, used to construct a user login graph and a source host path graph from the data set;
an unsupervised abnormal login behavior detection module, used to perform abnormal login behavior detection based on unsupervised learning on the user login graph;
and a lateral movement attack detection module, used to perform lateral movement attack detection based on semi-supervised learning on the source host path graph, using the labeled samples obtained by the unsupervised abnormal login behavior detection.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smart phone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the inventive method.
Based on the same inventive concept, another embodiment of the present invention provides a computer readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program which, when executed by a computer, implements the steps of the inventive method.
Parts of the invention not described in detail, such as the local outlier factor algorithm, are within the knowledge of those skilled in the art.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail by using examples, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention, and all such modifications and equivalents are intended to be encompassed in the scope of the claims of the present invention.

Claims (9)

1. A lateral movement attack detection method based on a heterogeneous graph network, characterized by comprising the following steps:
collecting authentication logs generated by intranet devices and constructing a data set;
constructing a user login graph and a source host path graph from the data set;
performing unsupervised-learning-based abnormal login behavior detection on the user login graph;
performing semi-supervised-learning-based lateral movement attack detection using the source host path graph and the labeled samples obtained by the unsupervised-learning-based abnormal login behavior detection;
wherein the source host path graph is a directed heterogeneous graph representing the association between source hosts and login paths from users to target hosts; two types of nodes are defined in the source host path graph: one type represents a source host, V_src, and the other represents a login path from a user to a target host, V_path; there are likewise two types of edges: a sending edge, E_send, which points from a source host node to a login path node and indicates that the user logs in to the target host from that source host, and a depending edge, E_on, which points from a login path node to a source host node and indicates that the login path from the user to the target host occurs on that source host, the two types of edges being symmetric; the number of occurrences of the login path on the source host within each sliding window and the statistical features F_statistic are assigned to the nodes, while the edges represent only connectivity, thereby yielding the source host path graph network.
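The graph structure recited in claim 1 can be illustrated with a minimal Python sketch. The event layout `(time, user, src_host, dst_host)`, the function name, and the use of plain sets and dictionaries are assumptions made for illustration only; the patent does not prescribe a data structure, and the statistical features F_statistic are omitted here.

```python
from collections import defaultdict

def build_source_host_path_graph(events, window_len):
    """Build a toy source host path graph.

    events: iterable of (time, user, src_host, dst_host) tuples
            (hypothetical layout, not taken from the patent).
    window_len: sliding window length in the same unit as `time`.
    """
    v_src, v_path = set(), set()   # the two node types
    e_send, e_on = set(), set()    # the two (symmetric) edge types
    counts = defaultdict(lambda: defaultdict(int))
    for t, user, src, dst in events:
        path = (user, dst)              # login path node: user -> target host
        v_src.add(src)
        v_path.add(path)
        e_send.add((src, path))         # source host -> login path
        e_on.add((path, src))           # login path -> source host
        # occurrences of this login path on this source host, per window
        counts[(src, path)][t // window_len] += 1
    return v_src, v_path, e_send, e_on, {k: dict(v) for k, v in counts.items()}
```

Keeping E_send and E_on as separate sets mirrors the claim's statement that the two edge types are symmetric: every `(src, path)` pair in one set has its reverse in the other.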
2. The method of claim 1, wherein data preprocessing is performed before the user login graph and the source host path graph are constructed; the data preprocessing comprises: traversing each authentication event in the authentication log data set D and selecting the events in which the source user and the target user are the same and the source host and the target host are different, to obtain a processed data set D_1.
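As a rough illustration of this filtering step, the sketch below keeps only events whose source and target users match and whose source and target hosts differ. The `AuthEvent` record and its field names are hypothetical; real authentication logs will use their own schema.

```python
from typing import List, NamedTuple

class AuthEvent(NamedTuple):
    # Hypothetical record layout for one authentication event.
    time: int
    src_user: str
    dst_user: str
    src_host: str
    dst_host: str
    success: bool

def preprocess(events: List[AuthEvent]) -> List[AuthEvent]:
    """Return D_1: same source/target user, different source/target host."""
    return [
        e for e in events
        if e.src_user == e.dst_user and e.src_host != e.dst_host
    ]
```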
3. The method of claim 2, wherein the user login graph is an undirected homogeneous graph representing a user's login behavior pattern between hosts over a period of time; the construction of the user login graph comprises: given the data set D_1, a user u in D_1, and a sliding window length L, selecting the authentication events belonging to user u from D_1 to obtain a data set D_u; dividing the data into a plurality of time windows according to the sliding window length L, and computing the login count feature F of the user on each host in each window; traversing each authentication event in D_u, adding its source host and target host to the node set V of the graph, adding an edge connecting the source host and the target host to the edge set E, and incrementing by one the login counts of the source host and the target host in F for the corresponding window; after the traversal, the user login graph G_u = (V, E, F) with features for user u is obtained.
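The construction procedure in claim 3 can be sketched as follows. The tuple layout `(time, user, src_host, dst_host)` and the choice of integer division to assign an event to a window are assumptions for illustration; the claim states only that the data is divided into windows of length L.

```python
from collections import defaultdict

def build_user_login_graph(d1, user, window_len):
    """Build G_u = (V, E, F) for one user.

    d1: iterable of (time, user, src_host, dst_host) tuples (hypothetical layout).
    Returns the node set V, the undirected edge set E (as frozensets),
    and F as {host: {window_index: login_count}}.
    """
    d_u = [e for e in d1 if e[1] == user]     # events belonging to user u
    nodes, edges = set(), set()
    feats = defaultdict(lambda: defaultdict(int))
    for t, _, src, dst in d_u:
        nodes.update((src, dst))
        edges.add(frozenset((src, dst)))      # undirected edge
        window = t // window_len              # index of the time window
        feats[src][window] += 1
        feats[dst][window] += 1
    return nodes, edges, {h: dict(w) for h, w in feats.items()}
```

Using `frozenset` for edges makes a login from A to B and one from B to A collapse into the same undirected edge, which matches the homogeneous, undirected graph described in the claim.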
4. The method of claim 1, wherein the statistical features F_statistic comprise: the numbers of successful and failed authentications of the user to the target host, the ratio of the number of authentications of the user to the target host to the total number of authentications of the user, and the minimum, maximum, and mean time interval between authentication events of the user to the target host.
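A minimal sketch of how these statistics might be computed for one login path is shown below. The event layout `(time, user, dst_host, success)`, the dictionary keys, and the convention of returning 0 when fewer than two events exist are all assumptions, not details from the patent.

```python
from statistics import mean

def path_statistics(events, user, dst_host):
    """Compute toy F_statistic values for the login path (user -> dst_host).

    events: list of (time, user, dst_host, success) tuples (hypothetical layout).
    """
    user_events = [e for e in events if e[1] == user]
    path_events = sorted(e for e in user_events if e[2] == dst_host)
    times = [e[0] for e in path_events]
    gaps = [b - a for a, b in zip(times, times[1:])]  # intervals between events
    n_success = sum(1 for e in path_events if e[3])
    return {
        "success": n_success,
        "failure": len(path_events) - n_success,
        "ratio": len(path_events) / len(user_events) if user_events else 0.0,
        # Assumption: report 0 when there are fewer than two events.
        "gap_min": min(gaps) if gaps else 0,
        "gap_max": max(gaps) if gaps else 0,
        "gap_mean": mean(gaps) if gaps else 0,
    }
```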
5. The method of claim 1, wherein the unsupervised-learning-based abnormal login behavior detection comprises:
based on the user login graph, learning the behavior patterns of hosts with a graph neural network algorithm that maximizes mutual information, that is, obtaining the hidden-layer feature representation of the samples by training to maximize the mutual information between the local features h and the global feature s of the samples;
obtaining negative samples by applying a random shuffling perturbation to the nodes, and scoring the sample pairs formed by h and s with a discriminator to obtain the hidden-layer representation of the nodes;
based on the sample feature representation learned by the graph neural network algorithm, performing detection with the local outlier factor algorithm, obtaining a small number of labeled host samples by setting a threshold, and combining the labeled host samples with the corresponding users into login paths from users to target hosts for use in the second-stage semi-supervised learning.
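The final thresholding step can be illustrated with a small pure-Python local outlier factor sketch. It is a simplified variant that uses exactly k nearest neighbours and ignores distance ties; the function names, the value of k, and the threshold are assumptions, and the input points stand in for the learned host embeddings.

```python
import math

def lof_scores(points, k):
    """Simplified local outlier factor: exactly k neighbours, ties ignored.

    Assumes no k+1 points coincide, otherwise a density would be infinite.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # indices of the k nearest neighbours of each point, excluding itself
    knn = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    k_dist = [dist[i][knn[i][-1]] for i in range(n)]  # distance to k-th neighbour

    def lrd(i):
        # local reachability density: inverse of the mean reachability distance
        reach = [max(k_dist[j], dist[i][j]) for j in knn[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in knn[i]) / (len(knn[i]) * lrds[i]) for i in range(n)]

def label_anomalies(points, k, threshold):
    """Mark a sample as anomalous when its LOF score exceeds the threshold."""
    return [score > threshold for score in lof_scores(points, k)]
```

Scores near 1 indicate a point whose local density matches that of its neighbours; scores well above 1 indicate an isolated point. Only the samples above the chosen threshold would be carried forward as labeled samples for the second stage.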
6. The method of claim 1, wherein the semi-supervised-learning-based lateral movement attack detection comprises:
defining two meta-paths: a meta-path from the path node to the source host node, and a meta-path from the path node to the source host node to the path node;
based on the two meta-paths, computing node-level attention and semantic-level attention features, performing semi-supervised learning with the cross-entropy loss function as the objective using the labeled samples obtained by the unsupervised-learning-based abnormal login behavior detection, and detecting lateral movement attack behavior.
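To make the two attention levels concrete, the following sketch follows the general pattern of heterogeneous graph attention networks: node-level attention weights the meta-path neighbours of each node, and semantic-level attention weights the meta-paths themselves. This is an assumption about the form of the attention, since the claim does not give formulas; the learned linear projections, training loop, and cross-entropy loss are omitted, and the vectors `a` and `q` are hypothetical stand-ins for learned parameters.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def node_level_attention(h, neighbors, a):
    """Aggregate meta-path neighbours of each node with attention weights.

    h: {node: feature list}; neighbors: {node: non-empty list of neighbours};
    a: attention vector of length 2 * feature_dim (hypothetical parameter).
    """
    z = {}
    for i, nbrs in neighbors.items():
        # unnormalised score for each pair (i, j) from the concatenated features
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, h[i] + h[j]))) for j in nbrs
        ]
        alpha = softmax(scores)
        z[i] = [
            sum(al * h[j][t] for al, j in zip(alpha, nbrs))
            for t in range(len(h[i]))
        ]
    return z

def semantic_level_attention(z_per_metapath, q):
    """Fuse per-meta-path embeddings; q is a hypothetical semantic vector."""
    # importance of a meta-path: mean over nodes of q . tanh(z)
    w = [
        sum(sum(qk * math.tanh(zk) for qk, zk in zip(q, z)) for z in zs.values())
        / len(zs)
        for zs in z_per_metapath
    ]
    beta = softmax(w)
    fused = {
        i: [
            sum(b * zs[i][t] for b, zs in zip(beta, z_per_metapath))
            for t in range(len(q))
        ]
        for i in z_per_metapath[0]
    }
    return fused, beta
```

In a full model the fused embeddings would feed a classifier trained with cross-entropy on the small set of labeled login paths, as the claim describes.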
7. A lateral movement attack detection system based on a heterogeneous graph network, employing the method of any of claims 1-6, comprising:
a data acquisition module for collecting authentication logs generated by intranet devices and constructing a data set;
a security log graph construction module for constructing a user login graph and a source host path graph from the data set;
an unsupervised-learning-based abnormal login behavior detection module for performing unsupervised-learning-based abnormal login behavior detection on the user login graph;
and a lateral movement attack detection module for performing semi-supervised-learning-based lateral movement attack detection using the source host path graph and the labeled samples obtained by the unsupervised-learning-based abnormal login behavior detection.
8. An electronic device comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1-6.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a computer, implements the method of any of claims 1-6.
CN202110347685.7A 2021-03-31 2021-03-31 Lateral movement attack detection method and system based on heterogeneous graph network Active CN113094707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110347685.7A CN113094707B (en) 2021-03-31 2021-03-31 Lateral movement attack detection method and system based on heterogeneous graph network

Publications (2)

Publication Number Publication Date
CN113094707A CN113094707A (en) 2021-07-09
CN113094707B true CN113094707B (en) 2024-05-14

Family

ID=76671616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110347685.7A Active CN113094707B (en) 2021-03-31 2021-03-31 Lateral movement attack detection method and system based on heterogeneous graph network

Country Status (1)

Country Link
CN (1) CN113094707B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230088676A1 (en) * 2021-09-20 2023-03-23 International Business Machines Corporation Graph neural network (gnn) training using meta-path neighbor sampling and contrastive learning
CN114020593B (en) * 2021-11-08 2024-05-14 山东理工大学 Heterogeneous process log sampling method and system based on track clustering
CN114741688A (en) * 2022-03-14 2022-07-12 北京邮电大学 Unsupervised host intrusion detection method and system
CN115913616A (en) * 2022-09-23 2023-04-04 清华大学 Method and device for detecting transverse mobile attack based on heterogeneous graph abnormal link discovery
CN115604032B (en) * 2022-12-01 2023-04-28 南京南瑞信息通信科技有限公司 Method and system for detecting complex multi-step attack of power system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110519276A (en) * 2019-08-29 2019-11-29 中国科学院信息工程研究所 A method of detection Intranet transverse shifting attack
CN111967271A (en) * 2020-08-19 2020-11-20 北京大学 Analysis result generation method, device, equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Tian (王天). Lateral movement attack detection method based on heterogeneous graph network (基于异质图网络的横向移动攻击检测方法). jcs.iie.ac.cn/xxaqxb/ch/reader/view_abstract.aspx?flag=2&file_no=202102010000001&journal_id=xxaqxb. 2021, p. 1. *

Also Published As

Publication number Publication date
CN113094707A (en) 2021-07-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant