CN115913616A - Method and device for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery - Google Patents

Info

Publication number
CN115913616A
Authority
CN
China
Prior art keywords
link
decoder
meta
path
neighbor node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211163410.9A
Other languages
Chinese (zh)
Inventor
杨家海
孙晓晴
李城龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202211163410.9A
Publication of CN115913616A
Legal status: Pending

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method and a device for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery, relating to the technical field of network security. The method comprises the following steps: acquiring log information, determining network entities according to the log information, and constructing a heterogeneous user authentication graph, wherein the heterogeneous user authentication graph comprises the network entities and the relationships between them; processing the heterogeneous user authentication graph according to a meta-path-based random walk neighbor node sampling strategy to determine a neighbor node set; performing feature aggregation on the neighbor node set according to a meta-path attention mechanism to obtain a characterization vector of a login link; and calculating the relative reconstruction error of the characterization vector, and identifying the login link according to the relative reconstruction error. The method captures the associations between nodes through the meta-path-based random walk neighbor node sampling strategy and the attention mechanism, and automatically completes lateral movement identification according to the relative reconstruction error. It does not require a manually set anomaly detection threshold, is easy to deploy in real network scenarios, and improves efficiency.

Description

Method and device for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery
Technical Field
The present application relates to the field of network security technologies, and in particular to a method and an apparatus for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery.
Background
Advanced Persistent Threat (APT) attacks are characterized by complex processes, long duration, high concealment, and strong destructiveness, and they seriously threaten the interests of organizations and the privacy of individual users. The APT lifecycle comprises six stages: intelligence reconnaissance and attack tool construction, attack tool delivery and initial intrusion, command and control (C&C) communication, lateral movement, network asset and data discovery, and achievement of the final attack objective. Lateral movement is the key step by which an attacker penetrates deeper into the network, expands the scope of the threat, and achieves the final attack objective. Accurately detecting and blocking lateral movement can therefore effectively defend against APT attacks and prevent major security incidents and economic losses.
Existing lateral movement detection methods based on user authentication graph models have problems in both detection accuracy and practical feasibility. First, in terms of detection effect, current methods are limited by the expressive power of homogeneous graph or bipartite graph models: they omit the rich multi-source heterogeneous information among the various network entities and do not fully mine the internal network scenario. Second, in terms of feasibility, current methods place idealized requirements on training data set construction and model deployment that are difficult to meet in practice. In particular, supervised learning methods require large-scale labeled data sets for model training, but label information is difficult to acquire in real scenarios and the classes are often imbalanced. Unsupervised learning methods rely on purely benign data to model normal behavior, but in practice, because of noise and attack samples, it is difficult to guarantee that all links in the current graph data are normal. In addition, these methods require a manually set threshold for anomaly identification, and the threshold strongly influences the detection effect. Setting the detection threshold manually based on expert experience is difficult in practical applications.
Disclosure of Invention
In view of the above problems, a method and a device for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery are provided.
A first aspect of the application provides a method for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery, comprising the following steps:
acquiring log information, determining a network entity according to the log information, and constructing a heterogeneous user authentication graph, wherein the heterogeneous user authentication graph comprises the network entity and a relationship between the network entities;
processing the heterogeneous user authentication graph according to a random walk neighbor node sampling strategy based on a meta-path, and determining a neighbor node set;
performing feature aggregation on the neighbor node set according to a meta-path attention mechanism to obtain a characterization vector of a login link;
and calculating the relative reconstruction error of the characterization vector, and identifying the login link according to the relative reconstruction error.
Optionally, the log information includes one or more of a user authentication event log, a file access log, a process log, and a network flow log.
Optionally, processing the heterogeneous user authentication graph according to the meta-path-based random walk neighbor node sampling strategy to determine a neighbor node set comprises:
for a given heterogeneous user authentication graph G = <V, E, X, T_V, T_E> and a meta-path
Figure BDA0003861121540000028
the transition probability of the i-th step of the random walk under the meta-path node-type constraint is:
Figure BDA0003861121540000021
where (Figure BDA0003861121540000022) denotes the set of nodes adjacent to node v whose type is (Figure BDA0003861121540000023);
and selecting the nodes whose visit counts rank within a preset range to form the neighbor node set.
Optionally, before performing feature aggregation on the neighbor node set according to the meta-path attention mechanism, the method comprises:
obtaining the feature aggregation expression of the meta-path attention mechanism, formulated as:
Figure BDA0003861121540000024
Figure BDA0003861121540000025
wherein V_A is the set of nodes of type A, P_A is the set of symmetric meta-paths whose start and end nodes are of type A, W_A, b_A and α_A are respectively a weight matrix, a bias vector and an attention coefficient, and (Figure BDA0003861121540000026) is the characterization vector of node v obtained via meta-path p_j.
Optionally, performing feature aggregation on the neighbor node set according to the meta-path attention mechanism to obtain a characterization vector of the login link comprises:
encoding the attribute information X_e of the login link in the neighbor node set into a vector h_e through a fully connected network;
processing the neighbor node set with the graph neural network to obtain a characterization vector h_u of the user node and a characterization vector h_d of the device node;
determining the characterization vector h_A of the login link according to h_e, h_u and h_d, wherein h_A is given by:
Figure BDA0003861121540000027
Optionally, calculating the relative reconstruction error of the characterization vector and identifying the login link according to the relative reconstruction error comprises:
inputting the characterization vector into a white decoder and a gray decoder, and acquiring the reconstruction errors of the characterization vector on the white decoder and the gray decoder;
if the reconstruction error obtained at the white decoder is greater than the reconstruction error obtained at the gray decoder, the login link is considered an abnormal login;
the login link is considered as a normal login if the reconstruction error obtained at the white decoder is not greater than the reconstruction error obtained at the gray decoder.
Optionally, the white decoder and the gray decoder are trained as follows:
the white decoder is trained on normal login link samples with the loss function:
Figure BDA0003861121540000031
wherein (Figure BDA0003861121540000032) represents the normal login link samples, D_white represents the white decoder, and (Figure BDA0003861121540000033) represents the characterization vectors of the normal login link samples;
the white decoder and the gray decoder are trained on unlabeled login link samples with the loss function:
Figure BDA0003861121540000034
wherein (Figure BDA0003861121540000035) represents the unlabeled login link samples, D_gray represents the gray decoder, and (Figure BDA0003861121540000036) represents the characterization vectors of the unlabeled login link samples.
A second aspect of the present application provides a lateral movement attack detection apparatus based on heterogeneous graph abnormal link discovery, including:
the construction module is used for obtaining log information, determining network entities according to the log information and constructing a heterogeneous user authentication graph;
the sampling module is used for processing the heterogeneous user authentication graph according to a meta-path-based random walk neighbor node sampling strategy and determining a neighbor node set;
the processing module is used for performing feature aggregation on the nodes in the neighbor node set according to the meta-path attention mechanism to obtain a characterization vector of the login link;
and the identification module is used for calculating the relative reconstruction error of the characterization vector and identifying the login link according to the relative reconstruction error.
In a third aspect of the present application, a computer device is proposed, which comprises a memory, a processor and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the method according to any of the first aspect is implemented.
In a fourth aspect of the present application, a non-transitory computer-readable storage medium is presented, on which a computer program is stored, which computer program, when executed by a processor, performs the method according to any of the first aspect described above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
the associations between nodes are captured through the meta-path-based random walk neighbor node sampling strategy and the attention mechanism, and lateral movement identification is completed automatically according to the relative reconstruction error; no anomaly detection threshold needs to be set manually, the method is easy to deploy in real network scenarios, and efficiency is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart illustrating a method for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery according to an exemplary embodiment of the present application;
fig. 2 is a block diagram illustrating a lateral movement attack detection apparatus based on heterogeneous graph abnormal link discovery according to an exemplary embodiment of the present application;
fig. 3 is a block diagram of an electronic device.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative and intended to explain the present application and should not be construed as limiting the present application.
Fig. 1 is a flowchart illustrating a method for detecting lateral movement attacks based on heterogeneous graph abnormal link discovery according to an exemplary embodiment of the present application. As shown in fig. 1, the method includes:
step 101, obtaining log information, determining a network entity according to the log information, and constructing a heterogeneous user authentication graph, wherein the heterogeneous user authentication graph comprises the network entity and a relationship between the network entities.
In the embodiment of the application, various types of log information, such as user authentication event logs, file access logs, process logs and network flow logs, are analyzed, network entities are extracted, and a heterogeneous user authentication graph containing the relationships among users, devices, files and processes is constructed. The relationships among the entities can be read from the heterogeneous user authentication graph. The relationships among the types of network entities are as follows:
Login R_L: the adjacency matrix is M_L; if user i attempts to authenticate to device j, M_L(i,j) = 1, otherwise M_L(i,j) = 0;
Operate R_O: the adjacency matrix is M_O; if user i operates using device j, M_O(i,j) = 1, otherwise M_O(i,j) = 0;
Run R_R: the adjacency matrix is M_R; if process i runs on device j, M_R(i,j) = 1, otherwise M_R(i,j) = 0;
Communicate R_Cn: the adjacency matrix is M_Cn; if there is a network traffic flow between devices i and j, M_Cn(i,j) = 1, otherwise M_Cn(i,j) = 0;
Control R_Co: the adjacency matrix is M_Co; if user i manipulates process j, M_Co(i,j) = 1, otherwise M_Co(i,j) = 0;
Access R_A: the adjacency matrix is M_A; if process i accesses file j while running, M_A(i,j) = 1, otherwise M_A(i,j) = 0.
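The relation definitions above can be illustrated with a minimal sketch that turns parsed log events into one binary adjacency matrix per relation. The event tuple format, the relation codes and the function name `build_adjacency` are assumptions made for illustration; they are not specified in the application, and the sketch treats every relation as directed from the first entity to the second.

```python
from collections import defaultdict

# Hypothetical relation codes mirroring the six relations listed above.
RELATIONS = ("L", "O", "R", "Cn", "Co", "A")


def build_adjacency(events):
    """Build one binary adjacency matrix per relation type.

    `events` is an iterable of (relation, source, target) tuples, e.g.
    ("L", "alice", "host1") for a login attempt by alice on host1.
    Returns {relation: (row_ids, col_ids, matrix)} where matrix[i][j] == 1
    if at least one event links row entity i to column entity j.
    """
    pairs = defaultdict(set)
    for relation, src, dst in events:
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        pairs[relation].add((src, dst))

    result = {}
    for relation, edges in pairs.items():
        rows = sorted({s for s, _ in edges})
        cols = sorted({d for _, d in edges})
        row_idx = {name: i for i, name in enumerate(rows)}
        col_idx = {name: j for j, name in enumerate(cols)}
        matrix = [[0] * len(cols) for _ in rows]
        for s, d in edges:
            matrix[row_idx[s]][col_idx[d]] = 1
        result[relation] = (rows, cols, matrix)
    return result
```

Repeated events for the same pair collapse to a single 1, which matches the binary definition of the matrices.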
In order to express the complex associations among the types of network entities more clearly, several symmetric meta-paths whose start and end nodes are users or devices are designed according to the heterogeneous user authentication graph, as shown in the table:
Figure BDA0003861121540000051
The above table lists the details of the meta-paths derived from the heterogeneous user authentication graph.
Step 102, processing the heterogeneous user authentication graph according to the meta-path-based random walk neighbor node sampling strategy, and determining a neighbor node set.
In order to reduce the dependence on label information, the encoding stage of abnormal login link detection first processes the heterogeneous user authentication graph with a meta-path-based random walk neighbor node sampling strategy to determine a neighbor node set. Specifically, for a given heterogeneous user authentication graph G = <V, E, X, T_V, T_E> and a meta-path
Figure BDA0003861121540000052
Figure BDA0003861121540000053
the transition probability of the i-th step of the random walk under the meta-path node-type constraint is:
Figure BDA0003861121540000054
where (Figure BDA0003861121540000061) denotes the set of nodes adjacent to node v whose type is (Figure BDA0003861121540000062).
After the random walks are completed, the nodes whose visit counts rank within a preset range are selected to form the neighbor node set.
In one possible embodiment, the five most frequently visited nodes are selected to form the neighbor node set.
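The sampling procedure described above can be sketched as follows. Because the transition probability formula is only available as an image in this text, the sketch assumes the common formulation in which the walker moves uniformly at random among adjacent nodes of the type required by the next position in the meta-path. The function name, the dictionary-based graph representation and the parameters `num_walks`, `top_k` and `seed` are illustrative assumptions.

```python
import random
from collections import Counter


def metapath_neighbors(graph, node_type, start, metapath,
                       num_walks=100, top_k=5, seed=0):
    """Sample neighbors of `start` with meta-path-constrained random walks.

    graph: dict mapping node -> list of adjacent nodes.
    node_type: dict mapping node -> type label, e.g. "U" (user), "D" (device).
    metapath: sequence of type labels, e.g. ("U", "D", "U"); metapath[0]
        must equal the type of `start`.
    At each step the walker moves uniformly at random (an assumption) to an
    adjacent node whose type matches the next label in the meta-path; the
    walk stops early if no such node exists. The top_k most visited nodes
    other than `start` are returned.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node type does not match the meta-path")
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        current = start
        for wanted in metapath[1:]:
            candidates = [n for n in graph.get(current, [])
                          if node_type[n] == wanted]
            if not candidates:
                break
            current = rng.choice(candidates)
            if current != start:
                visits[current] += 1
    return [node for node, _ in visits.most_common(top_k)]
```

With `top_k=5` this corresponds to the embodiment that keeps the five most frequently visited nodes; nodes whose type never appears in the meta-path are never sampled.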
Step 103, performing feature aggregation on the neighbor node set according to the meta-path attention mechanism, and obtaining a characterization vector of the login link.
In the embodiment of the application, after the neighbor node set is determined, and considering that different meta-paths play different roles in lateral movement detection, the invention adopts a meta-path attention mechanism to complete node feature aggregation. The feature aggregation operation is defined as follows:
Figure BDA0003861121540000063
Figure BDA0003861121540000064
wherein V_A is the set of nodes of type A, P_A is the set of symmetric meta-paths whose start and end nodes are of type A, W_A, b_A and α_A are respectively a weight matrix, a bias vector and an attention coefficient, and (Figure BDA0003861121540000065) is the characterization vector of node v obtained via meta-path p_j.
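The aggregation formulas themselves are only available as images in this text, so the following is a generic sketch of attention over meta-paths rather than the application's exact expression. It assumes that each meta-path already has an unnormalized score (in the application these would presumably be derived from W_A and b_A), that the scores are softmax-normalized into attention coefficients, and that the node vector is the coefficient-weighted sum of the per-meta-path vectors. The function name and argument layout are hypothetical.

```python
import math


def aggregate_metapath_embeddings(embeddings, scores):
    """Combine per-meta-path vectors for one node into a single vector.

    embeddings: list of equal-length vectors, one per meta-path p_j.
    scores: list of unnormalized attention scores, one per meta-path.
    The scores are softmax-normalized into coefficients that sum to 1,
    and the output is the coefficient-weighted sum of the vectors.
    Returns (fused_vector, coefficients).
    """
    if len(embeddings) != len(scores) or not embeddings:
        raise ValueError("need one score per embedding, and at least one")
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(embeddings[0])
    fused = [sum(a * vec[k] for a, vec in zip(alphas, embeddings))
             for k in range(dim)]
    return fused, alphas
```

Equal scores weight every meta-path equally; a higher score shifts the fused vector toward that meta-path's vector.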
The invention uses a fully connected network and a graph neural network to aggregate the information in the neighbor node set, as follows:
the attribute information X_e of the login link in the neighbor node set is encoded into a vector h_e through a fully connected network;
the neighbor node set is processed with the graph neural network to obtain a characterization vector h_u of the user node and a characterization vector h_d of the device node;
finally, the attributes of the login link are combined with the information of the user node and the device node to obtain the characterization vector h_A of the login link, where
Figure BDA0003861121540000066
Step 104, calculating a relative reconstruction error of the characterization vector, and identifying the login link according to the relative reconstruction error.
In this embodiment, after the characterization vector h_A of the login link is input into the decoder part, the relative reconstruction error of the login link is calculated to identify abnormal login links.
The invention adopts a dual-decoder structure comprising a white decoder and a gray decoder, which are trained as follows:
first, the white decoder is trained using some normal login link samples as weak supervision information, with the loss function:
Figure BDA0003861121540000067
wherein (Figure BDA0003861121540000068) represents a normal login link sample and D_white represents the white decoder.
Then, for an unlabeled login link, if the reconstruction error computed for the link by the white decoder is greater than that computed by the gray decoder, the link is considered an abnormal login; if the reconstruction error obtained at the white decoder is not greater than that obtained at the gray decoder, the login link is considered a normal login.
Accordingly, the loss function for training the white decoder and the gray decoder on unlabeled login link data is:
Figure BDA0003861121540000071
wherein (Figure BDA0003861121540000072) represents the unlabeled login link samples, D_gray represents the gray decoder, and (Figure BDA0003861121540000073) represents the characterization vectors of the unlabeled login link samples.
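The threshold-free decision rule described above, which compares the two reconstruction errors, can be sketched as follows. The decoders are passed in as plain callables standing in for the trained white and gray decoders, and the squared-error metric is an assumption; the application's text does not state which error measure is used. The function names are hypothetical.

```python
def squared_error(original, reconstructed):
    """Sum of squared differences between two equal-length vectors."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed))


def classify_login_link(h, white_decoder, gray_decoder):
    """Label a login link by comparing its two reconstruction errors.

    h: characterization vector of the login link.
    white_decoder, gray_decoder: callables mapping a vector to its
        reconstruction (stand-ins for the trained decoders).
    Returns "abnormal" if the white decoder reconstructs h worse than the
    gray decoder does, otherwise "normal". No fixed threshold is used.
    """
    err_white = squared_error(h, white_decoder(h))
    err_gray = squared_error(h, gray_decoder(h))
    return "abnormal" if err_white > err_gray else "normal"
```

Because the label depends only on which decoder reconstructs the vector better, the rule needs no manually tuned cutoff; ties are treated as normal, matching the "not greater than" condition.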
In addition, after the login links are identified, normal and abnormal login links can be further separated by maximizing the mutual information between an abnormal link and its characterization. The method takes the abnormal login links detected by the dual decoder as positive samples and the pre-labeled and detected normal login links as negative samples, uses a bilinear binary classifier as the discriminator, and maximizes the mutual information between the abnormal links and their characterizations based on the Jensen-Shannon divergence:
Figure BDA0003861121540000074
In summary, the training loss function of the detector is: Loss = Loss_normal + Loss_unlabel - λ·Loss_reg, where λ is a hyperparameter that adjusts the importance of the mutual information regularization term.
The method and the device use a neighbor node sampling strategy based on meta-path random walks over the heterogeneous graph, together with an attention mechanism, to fully mine the internal network scenario. Through the dual-decoder structure and the mutual information regularization operation, lateral movement identification is completed automatically according to the relative reconstruction error. This lowers the requirement for label information in the training data set, removes the dependence on a manually set anomaly detection threshold, makes deployment in real network scenarios easy, and improves efficiency.
Fig. 2 is a block diagram of a lateral movement attack detection apparatus 200 based on heterogeneous graph abnormal link discovery according to an exemplary embodiment of the present application. As shown in fig. 2, the apparatus includes: a construction module 210, a sampling module 220, a processing module 230, and an identification module 240.
The construction module 210 is configured to obtain log information, determine network entities according to the log information, and construct a heterogeneous user authentication graph;
the sampling module 220 is configured to process the heterogeneous user authentication graph according to the meta-path-based random walk neighbor node sampling strategy and determine a neighbor node set;
the processing module 230 is configured to perform feature aggregation on the nodes in the neighbor node set according to the meta-path attention mechanism to obtain a characterization vector of the login link;
and the identification module 240 is configured to calculate a relative reconstruction error of the characterization vector and identify the login link according to the relative reconstruction error.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 3 illustrates a schematic block diagram of an example electronic device 300 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 3, the apparatus 300 includes a computing unit 301 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 302 or a computer program loaded from a storage unit 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the device 300 can also be stored. The computing unit 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Various components in device 300 are connected to I/O interface 305, including: an input unit 306 such as a keyboard, a mouse, or the like; an output unit 307 such as various types of displays, speakers, and the like; a storage unit 308 such as a magnetic disk, optical disk, or the like; and a communication unit 309 such as a network card, modem, wireless communication transceiver, etc. The communication unit 309 allows the device 300 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 301 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 301 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 301 executes the methods and processes described above, such as the lateral movement attack detection method. For example, in some embodiments, the lateral movement attack detection method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 308. In some embodiments, part or all of the computer program may be loaded onto and/or installed onto the device 300 via the ROM 302 and/or the communication unit 309. When the computer program is loaded into the RAM 303 and executed by the computing unit 301, one or more steps of the lateral movement attack detection method described above may be performed. Alternatively, in other embodiments, the computing unit 301 may be configured to perform the lateral movement attack detection method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system that addresses the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (10)

1. A lateral movement attack detection method based on heterogeneous graph abnormal link discovery, characterized by comprising the following steps:
acquiring log information, determining a network entity according to the log information, and constructing a heterogeneous user authentication graph, wherein the heterogeneous user authentication graph comprises the network entity and a relationship between the network entities;
processing the heterogeneous user authentication graph according to a random walk neighbor node sampling strategy based on a meta-path, and determining a neighbor node set;
performing feature aggregation on the neighbor node set according to a meta-path attention mechanism to obtain a characterization vector of a login link;
and calculating the relative reconstruction error of the characterization vector, and identifying the login link according to the relative reconstruction error.
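The graph construction step of claim 1 can be illustrated with a minimal sketch. The record layout `(user, source device, destination device)`, the function name `build_auth_graph`, and the choice to link a user to both devices are assumptions made for illustration; the claim itself only requires that network entities and their relationships be derived from the log information.

```python
from collections import defaultdict

def build_auth_graph(events):
    """Build a small heterogeneous graph from (user, src_device, dst_device) records.

    Returns (node_types, adjacency, login_links). The record layout and the
    node and edge naming are illustrative assumptions, not taken from the source.
    """
    node_types = {}                  # node id -> "user" or "device"
    adjacency = defaultdict(set)     # undirected adjacency between entities
    login_links = []                 # one (user, dst_device) link per event

    for user, src, dst in events:
        node_types[user] = "user"
        node_types[src] = "device"
        node_types[dst] = "device"
        # Relate the user to the device it logs in from and the device it logs in to.
        for device in (src, dst):
            adjacency[user].add(device)
            adjacency[device].add(user)
        login_links.append((user, dst))
    return node_types, dict(adjacency), login_links


# Hypothetical authentication events.
events = [
    ("alice", "pc1", "srv1"),
    ("bob", "pc2", "srv1"),
    ("alice", "pc1", "srv2"),
]
node_types, adjacency, login_links = build_auth_graph(events)
```

Each entry of `login_links` is one login link to be characterized and scored in the later steps.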
2. The method of claim 1, wherein the log information comprises one or more of a user authentication event log, a file access log, a process log, and a network flow log.
3. The method of claim 1, wherein processing the heterogeneous user authentication graph according to a meta-path based random walk neighbor node sampling strategy to determine a neighbor node set comprises:
for a given heterogeneous user authentication graph $G = \langle V, E, X, T_V, T_E \rangle$ and meta-path
(formula image FDA0003861121530000011, not reproduced)
the transition probability of the i-th step of the random walk under the meta-path node-type constraint is:
(formula image FDA0003861121530000012, not reproduced)
wherein (formula image FDA0003861121530000013, not reproduced) denotes the set of nodes adjacent to node v whose type is (formula image FDA0003861121530000014, not reproduced);
and selecting the nodes whose visit counts rank within a preset range to form the neighbor node set.
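The sampling step of claim 3 can be sketched as follows. The transition formula is not reproduced in this text, so the sketch assumes the simplest type-constrained rule: each step moves uniformly at random to an adjacent node of the type the meta-path requires next. The function name `metapath_neighbors`, its parameters, and the toy graph are hypothetical.

```python
import random
from collections import Counter

def metapath_neighbors(adjacency, node_types, start, metapath, num_walks, top_k, seed=0):
    """Sample neighbors of `start` by random walks constrained by `metapath`.

    `metapath` is a list of node types beginning with the type of `start`,
    e.g. ["user", "device", "user"]. Assumption: each step picks uniformly
    among adjacent nodes of the required type; a walk stops early if none exists.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        current = start
        for next_type in metapath[1:]:
            candidates = sorted(
                n for n in adjacency.get(current, ()) if node_types[n] == next_type
            )
            if not candidates:
                break
            current = rng.choice(candidates)
            if current != start:
                visits[current] += 1
    # Keep the nodes whose visit counts rank in the top `top_k`.
    return [node for node, _ in visits.most_common(top_k)]


# Hypothetical toy graph: two users and two devices.
node_types = {"alice": "user", "bob": "user", "pc1": "device", "srv1": "device"}
adjacency = {
    "alice": {"pc1", "srv1"},
    "bob": {"srv1"},
    "pc1": {"alice"},
    "srv1": {"alice", "bob"},
}
neighbors = metapath_neighbors(
    adjacency, node_types, "alice", ["user", "device", "user"], num_walks=200, top_k=2
)
```

Here `top_k` plays the role of the preset ranking range; a real system would tune it together with the number and length of walks.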
4. The method of claim 1, further comprising, before performing feature aggregation on the neighbor node set according to the meta-path attention mechanism:
obtaining a feature aggregation expression of the meta-path attention mechanism, formulated as:
(formula image FDA0003861121530000015, not reproduced)
(formula image FDA0003861121530000016, not reproduced)
wherein $V_A$ is the set of nodes of type A, $P_A$ is the set of symmetric meta-paths that start and end at nodes of type A, $W_A$, $b_A$ and $\alpha_A$ are a weight matrix, a bias vector and an attention coefficient, respectively, and
(formula image FDA0003861121530000017, not reproduced)
is the characterization vector of node v obtained via meta-path $p_j$.
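Because the two aggregation formulas are not reproduced in this text, the following is only a sketch of one commonly used form of meta-path (semantic-level) attention that is consistent with the symbols named in claim 4. It assumes that each meta-path receives a score averaged over all nodes of the type, that the scores are normalized with a softmax, and that the per-path vectors are summed with those weights. The function name and all numeric values are hypothetical.

```python
import math

def metapath_attention(per_path_vectors, W, b, alpha):
    """Fuse per-meta-path node vectors with one attention weight per meta-path.

    per_path_vectors[j][v] is the vector of node v obtained via meta-path p_j.
    Assumed form: score_j = mean over v of alpha . tanh(W h + b),
    weights = softmax(scores), output_v = sum_j weights_j * h_v^{p_j}.
    """
    def score(h):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
                  for row, bias in zip(W, b)]
        return sum(a * z for a, z in zip(alpha, hidden))

    scores = [sum(score(h) for h in vectors) / len(vectors)
              for vectors in per_path_vectors]
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    num_nodes = len(per_path_vectors[0])
    dim = len(per_path_vectors[0][0])
    fused = [[sum(weights[j] * per_path_vectors[j][v][k]
                  for j in range(len(weights)))
              for k in range(dim)]
             for v in range(num_nodes)]
    return weights, fused


# Two meta-paths, two nodes, 2-dimensional vectors (all values are made up).
per_path_vectors = [
    [[1.0, 0.0], [0.5, 0.5]],   # vectors obtained via meta-path p_1
    [[0.0, 1.0], [0.5, 0.5]],   # vectors obtained via meta-path p_2
]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
alpha = [1.0, 1.0]
weights, fused = metapath_attention(per_path_vectors, W, b, alpha)
```

In this symmetric toy input both meta-paths score the same, so they receive equal weight and each fused vector is the plain average of the two per-path vectors.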
5. The method according to claim 1, wherein performing feature aggregation on the neighbor node set according to the meta-path attention mechanism to obtain the characterization vector of the login link comprises:
encoding, through a fully connected network, the attribute information $X_e$ of the login link in the neighbor node set into a vector $h_e$;
processing the neighbor node set with a graph neural network to obtain the characterization vector $h_u$ of the user node and the characterization vector $h_d$ of the device node;
determining the characterization vector $h_A$ of the login link according to $h_e$, $h_u$ and $h_d$, wherein
(formula image FDA0003861121530000021, not reproduced)
6. The method of claim 1, wherein calculating the relative reconstruction error of the characterization vector and identifying the login link according to the relative reconstruction error comprises:
inputting the characterization vector into a white decoder and a gray decoder, and obtaining the reconstruction errors of the characterization vector on the white decoder and on the gray decoder;
if the reconstruction error obtained on the white decoder is greater than the reconstruction error obtained on the gray decoder, the login link is considered an abnormal login;
if the reconstruction error obtained on the white decoder is not greater than the reconstruction error obtained on the gray decoder, the login link is considered a normal login.
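The decision rule of claim 6 can be sketched as a direct comparison of two reconstruction errors. The sketch assumes that each decoder maps a characterization vector to a reconstruction of that same vector and that the error is a squared distance; neither detail is stated in the claim. The two toy decoders are hypothetical stand-ins for trained models.

```python
def reconstruction_error(decoder, vector):
    """Squared error between a vector and its reconstruction (assumed error measure)."""
    reconstructed = decoder(vector)
    return sum((x - y) ** 2 for x, y in zip(vector, reconstructed))

def classify_login(vector, white_decoder, gray_decoder):
    """Label a login link by comparing its errors on the two decoders."""
    white_error = reconstruction_error(white_decoder, vector)
    gray_error = reconstruction_error(gray_decoder, vector)
    # Abnormal only when the white decoder reconstructs worse than the gray one.
    return "abnormal" if white_error > gray_error else "normal"


# Toy stand-ins for trained decoders (hypothetical): the white decoder
# reproduces only the first component, the gray decoder halves every component.
def white_decoder(h):
    return [h[0], 0.0]

def gray_decoder(h):
    return [0.5 * x for x in h]

typical = classify_login([1.0, 0.0], white_decoder, gray_decoder)
unusual = classify_login([0.0, 1.0], white_decoder, gray_decoder)
```

The comparison is relative: no absolute error threshold is needed, because the gray decoder's error serves as the reference for each individual login link.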
7. The method of claim 6, wherein:
the white decoder is trained on normal login link samples with the loss function:
(formula image FDA0003861121530000022, not reproduced)
wherein (formula image FDA0003861121530000023, not reproduced) represents the normal login link samples, $D_{white}$ represents the white decoder, and (formula image FDA0003861121530000024, not reproduced) represents the characterization vector of a normal login link sample;
the white decoder and the gray decoder are trained on unlabeled login link samples with the loss function:
(formula image FDA0003861121530000025, not reproduced)
wherein (formula image FDA0003861121530000026, not reproduced) represents the unlabeled login link samples, $D_{gray}$ represents the gray decoder, and (formula image FDA0003861121530000027, not reproduced) represents the characterization vector of an unlabeled login link sample.
8. A lateral movement attack detection device based on heterogeneous graph abnormal link discovery is characterized by comprising:
the building module is used for obtaining log information, determining a network entity according to the log information and building a heterogeneous user authentication graph;
the sampling module is used for processing the heterogeneous user authentication graph according to a random walk neighbor node sampling strategy based on a meta-path and determining a neighbor node set;
the processing module is used for carrying out feature aggregation on the nodes in the neighbor node set according to the meta-path attention mechanism to obtain a characterization vector of the login link;
and the identification module is used for calculating the relative reconstruction error of the characterization vector and identifying the login link according to the relative reconstruction error.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1-7 when executing the computer program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-7.
CN202211163410.9A 2022-09-23 2022-09-23 Method and device for detecting transverse mobile attack based on heterogeneous graph abnormal link discovery Pending CN115913616A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211163410.9A CN115913616A (en) 2022-09-23 2022-09-23 Method and device for detecting transverse mobile attack based on heterogeneous graph abnormal link discovery

Publications (1)

Publication Number Publication Date
CN115913616A true CN115913616A (en) 2023-04-04

Family

ID=86484804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211163410.9A Pending CN115913616A (en) 2022-09-23 2022-09-23 Method and device for detecting transverse mobile attack based on heterogeneous graph abnormal link discovery

Country Status (1)

Country Link
CN (1) CN115913616A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112291272A (en) * 2020-12-24 2021-01-29 鹏城实验室 Network threat detection method, device, equipment and computer readable storage medium
CN113094707A (en) * 2021-03-31 2021-07-09 中国科学院信息工程研究所 Transverse mobile attack detection method and system based on heterogeneous graph network
US20210243212A1 (en) * 2020-02-04 2021-08-05 The George Washington University Method and system for detecting lateral movement in enterprise computer networks
CN114861863A (en) * 2021-12-11 2022-08-05 西北工业大学 Heterogeneous graph representation learning method based on meta-path multi-level graph attention network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination