WO2020042024A1 - Node anomaly detection method, device, and storage device based on a graph algorithm - Google Patents

Node anomaly detection method, device, and storage device based on a graph algorithm

Info

Publication number
WO2020042024A1
WO2020042024A1 PCT/CN2018/103052 CN2018103052W WO2020042024A1 WO 2020042024 A1 WO2020042024 A1 WO 2020042024A1 CN 2018103052 W CN2018103052 W CN 2018103052W WO 2020042024 A1 WO2020042024 A1 WO 2020042024A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
feature
attribute
graph
nodes
Prior art date
Application number
PCT/CN2018/103052
Other languages
English (en)
French (fr)
Inventor
袁振南
朱鹏新
Original Assignee
区链通网络有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 区链通网络有限公司
Priority to CN201880002427.1A (patent CN109844749B, zh)
Priority to PCT/CN2018/103052 (patent WO2020042024A1, zh)
Publication of WO2020042024A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present application relates to the field of network communication technologies, and in particular to a method, a device, and a storage device for detecting node anomalies based on a graph algorithm.
  • In an open network cluster, some malicious nodes may perform port scanning and sniffing, attacks, illegal requests, or masquerading requests against other nodes in the cluster, creating a risk of overall cluster performance degradation, large-scale data leakage, and large-scale failures that make the system unavailable.
  • In long-term research, the inventors of the present application found that, in an open cluster, the access environment of the nodes is complex and the behavior of the nodes is dynamic and uncontrollable, so detection techniques based on rule matching and supervised learning have difficulty detecting unknown abnormal behavior patterns in a timely and effective manner.
  • The technical problem mainly solved by this application is to provide a node anomaly detection method, device, and storage device based on a graph algorithm that can quickly and efficiently detect nodes with abnormal behavior.
  • To solve the above technical problem, one technical solution adopted in the present application is to provide a node anomaly detection method based on a graph algorithm, wherein the method includes: acquiring the attribute features of each node in a network cluster within a predetermined time period, establishing edge connections using similarity measures of the attribute features, and connecting the nodes to form an undirected graph.
  • A feature relationship operator is applied to the attribute features to obtain the feature vectors of the attribute edges.
  • Different metrics of each node are calculated to obtain a set of feature vectors for each node. A predetermined training algorithm is used to train the feature vectors of each node to obtain a set of feature representations for each node, and a predetermined autoencoder model is used to calculate a reconstruction error, giving an abnormal offset value for the set of feature vectors of each node; whether a node is abnormal is determined from the abnormal offset value.
  • To solve the above technical problem, one technical solution adopted in the present application is to provide a node anomaly detection device based on a graph algorithm, wherein the device includes a processor, and the processor is configured to acquire the attribute features of each node in a network cluster within a predetermined time period, establish edge connections using similarity measures of the attribute features, and connect the nodes to form an undirected graph.
  • The processor is further configured to apply a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges; to calculate different metrics of each node to obtain a set of feature vectors for each node; to use the feature vectors of each node as different feature channels and train them with a predetermined training algorithm to obtain a set of feature representations for each node; and to calculate a reconstruction error using a predetermined autoencoder model, obtain the abnormal offset value of the set of feature vectors of each node, and determine from the abnormal offset value whether the node is abnormal.
  • To solve the above technical problem, another technical solution adopted in the present application is to provide a node anomaly detection device based on a graph algorithm, wherein the device includes: an acquisition module for acquiring the attribute features of each node of a network cluster within a predetermined time period, establishing edge connections using similarity measures of the attribute features, and connecting the nodes to form an undirected graph;
  • a first calculation module for applying a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges; a second calculation module for calculating different metrics of each node to obtain a set of feature vectors for each node; a training module for using the feature vectors of each node as different feature channels and training them with a predetermined training algorithm to obtain a set of feature representations for each node; and an offset calculation module for calculating a reconstruction error using a predetermined autoencoder model, obtaining the abnormal offset value of the set of feature vectors of each node, and determining from the abnormal offset value whether the node is abnormal.
  • To solve the above technical problem, another technical solution adopted in the present application is to provide a device having a storage function, wherein the device stores a program, and when the program is executed, the above node anomaly detection method based on a graph algorithm is implemented.
  • The beneficial effects of this application are as follows. Unlike the prior art, this application provides a node anomaly detection method, device, and storage device based on a graph algorithm. Based on the graph algorithm, this application calculates various statistical metrics of each node from its feature attributes, compares the metrics of the node with the metrics of other nodes, calculates the offset to obtain an outlier measure, and thereby detects whether abnormal nodes exist.
  • FIG. 1 is a schematic flowchart of a first embodiment of a node anomaly detection method based on a graph algorithm of the present application
  • FIG. 2 is a schematic flowchart of a second embodiment of a node anomaly detection method based on a graph algorithm of the present application
  • FIG. 3 is a schematic structural diagram of a first embodiment of a node anomaly detection device based on a graph algorithm of the present application
  • FIG. 4 is a schematic structural diagram of a second embodiment of a node anomaly detection device based on a graph algorithm of the present application
  • FIG. 5 is a schematic structural diagram of a first embodiment of a device with a storage function according to the present application.
  • The present application provides a node anomaly detection method, device, and storage device based on a graph algorithm.
  • Graph structures at different levels, that is, a multi-level graph structure, are formed by partitioning on different attributes and on features of different granularity.
  • Feature representations and outlier values are extracted at each level.
  • At the same time, the feature representations of all levels are connected to train an overall feature representation and outlier value, which makes it possible to quickly and efficiently detect nodes with abnormal behavior in each feature dimension.
  • FIG. 1 is a schematic flowchart of a first embodiment of a node anomaly detection method based on a graph algorithm of the present application.
  • In this embodiment, the method includes the following steps:
  • S101: Acquire the attribute features of each node in the network cluster within a predetermined time period, establish edge connections using similarity measures of the attribute features, and connect the nodes to form an undirected graph.
  • This application performs node anomaly detection based on a graph algorithm.
  • A graph, as a data structure, is a generalization of a tree.
  • A tree is a top-down data structure.
  • Every node except the root has a parent node, and the nodes are arranged from top to bottom.
  • A graph has no concept of parent and child nodes.
  • The nodes in a graph are all peers.
  • Graphs can be divided into undirected graphs (plain connections), directed graphs (connections with directions), weighted graphs (connections with weights), weighted directed graphs (connections with both directions and weights), and so on.
  • This application uses undirected graphs for the related calculations.
  • The attribute features of each node are acquired, and the acquired feature data is composed into a graph structure according to the relevant request dependencies or connection properties.
  • Specifically, edge connections are established using some similarity measure of the attribute features, forming attribute edges.
  • For example, the similarity measure may be that the attribute features of the nodes are equal or that the distributions of their attribute features are similar; for instance, the IP attributes of the nodes fall in the same IP segment, or there is a network connection or an action connection between the nodes (when there is an action connection between two nodes, the same event is generated on both nodes, and this event can be assigned the same value, that is, the two attribute features are equal).
  • The attribute features of each node may differ and change over time, so the resulting graph structure is dynamic.
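Step S101 can be illustrated with a minimal sketch. It assumes two similarity measures taken from the examples above (same IP segment, equal attribute value); the node names, the attribute keys `ip` and `chip`, the /24 segment size, and the helper functions are hypothetical and are not taken from the application.

```python
from itertools import combinations
from ipaddress import ip_address, ip_network

# Hypothetical attribute features for three servers (illustrative values only).
nodes = {
    "a": {"ip": "10.0.1.5", "chip": "x86-v3"},
    "b": {"ip": "10.0.1.9", "chip": "arm-v8"},
    "c": {"ip": "10.0.2.7", "chip": "x86-v3"},
}

def same_segment(ip1, ip2, prefix=24):
    """True if both addresses fall in the same /prefix IP segment (assumed size)."""
    net = ip_network(f"{ip1}/{prefix}", strict=False)
    return ip_address(ip2) in net

def build_graph(nodes):
    """Return {frozenset({u, v}): set of attribute names on which u and v match}."""
    edges = {}
    for u, v in combinations(sorted(nodes), 2):
        shared = set()
        if same_segment(nodes[u]["ip"], nodes[v]["ip"]):
            shared.add("ip_segment")
        if nodes[u]["chip"] == nodes[v]["chip"]:
            shared.add("chip")
        if shared:  # an attribute edge exists only if some measure matches
            edges[frozenset((u, v))] = shared
    return edges

graph = build_graph(nodes)
```

Keying each edge by a `frozenset` keeps the graph undirected, and storing a set of matching attributes per edge allows one edge to carry several attributes. Rebuilding the dictionary for each time window would reflect the dynamic nature of the graph.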
  • S102: Apply a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges.
  • Different nodes are connected by attribute edges.
  • If two nodes interact with each other, the interaction can be used as the similarity measure of the attribute features to establish an edge connection; if two nodes have identical or similar features, those features can likewise be used as similarity measures to establish edge connections. That is, the attribute edge connecting two nodes can be multi-attribute.
  • The feature relationship operator is applied to the attribute features of the edges to obtain the feature vectors of the attribute edges.
  • An operator is a mapping from one function space to another. In a broad sense, any operation applied to a function can be regarded as an operator; exponentiation, taking a square root, and taking a logarithm are all operators.
  • S103: Calculate different metrics of each node to obtain a set of feature vectors for each node.
  • A node may be connected to multiple attribute edges. Based on the feature vectors of the relevant attribute edges, different metrics of each node are calculated and used as the basic representation vector of that node; that is, the attributes of the different nodes are converted into numerical representations.
  • S104: Use a predetermined training algorithm to train the feature vectors of each node to obtain a set of feature representations for each node.
  • Deep learning algorithms are used for training.
  • One of the simplest deep learning approaches exploits the properties of artificial neural networks.
  • An artificial neural network is a system with a hierarchical structure. Given a neural network, suppose its output is required to equal its input; training then adjusts its parameters to obtain the weights of each layer. This naturally yields several different representations of the input I (each layer gives one representation), and these representations are the features. Deep learning achieves highly accurate recognition through deep networks.
  • S105: Calculate a reconstruction error using a predetermined autoencoder model to obtain the abnormal offset value of the set of feature vectors of each node, and determine from the abnormal offset value whether the node is abnormal.
  • An autoencoder is a neural network that reproduces its input signal as closely as possible; it can also be understood as a system that tries to restore its original input.
  • The basic principle of its training is to minimize the reconstruction error (defined as the mean squared error between the model output and the original input), so that a deep learning network can be trained without supervision (in effect, the input data itself serves as the supervision signal).
  • Reconstruction refers to recovering the original data from the transformed data. Specifically, the input data is multiplied by a weight matrix to obtain a dimensionality-reduced result, and that result is then multiplied by the transpose of the same weight matrix to restore an approximation of the original data. The closer the output layer is to the input layer, the better. If the two are not sufficiently similar, an offset arises, which gives the abnormal offset value, and whether the node is abnormal is determined from that value.
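The reconstruction described above (multiply by a weight matrix, then by its transpose, and measure the mean squared error) can be sketched as follows. This is a simplified linear illustration with no activation function or bias; the 1 x 2 matrix `W` is a hypothetical stand-in for trained weights, and the function names are assumptions rather than anything defined in the application.

```python
import math

def encode(W, x):
    """Dimensionality reduction: code = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def decode(W, code):
    """Reconstruction with the transposed weights: x_hat = W^T code."""
    return [sum(W[k][j] * code[k] for k in range(len(W))) for j in range(len(W[0]))]

def reconstruction_error(W, x):
    """Mean squared error between the input and its reconstruction."""
    x_hat = decode(W, encode(W, x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical 1 x 2 weight matrix standing in for trained weights.
s = 1 / math.sqrt(2)
W = [[s, s]]

typical = reconstruction_error(W, [1.0, 1.0])    # lies along the encoded direction
unusual = reconstruction_error(W, [1.0, -1.0])   # orthogonal to the encoded direction
```

In this toy setting `typical` is essentially zero while `unusual` equals 1.0, which is the kind of gap the abnormal offset value is meant to expose. A trained deep autoencoder would replace the fixed matrix with learned, usually non-linear, layers.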
  • The undirected graph is a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities.
  • The method further includes: connecting and training the encodings of the levels to obtain an overall encoding model, and calculating the reconstruction error using the overall encoding model to obtain the overall offset of each node.
  • An undirected graph includes a node set, an edge set, subgraph structures, and an overall graph structure. The edge set, the subgraph structures, and the overall graph structure belong to different levels.
  • The overall graph structure is at a higher level than the subgraph structure, and the subgraph structure is at a higher level than the edge set; that is, the graph structure is multi-level.
  • The feature representations of different granularities are connected and trained to obtain the overall encoding; the connection here may be similar to the residual connection in a deep residual network.
  • This achieves the purpose of quickly and efficiently detecting nodes with abnormal behavior in each feature dimension.
  • The overall offset is compared with a preset threshold. If the overall offset is greater than the preset threshold, the node is determined to be abnormal.
  • The preset threshold may be any value from 0.1 to 1.0 and is set according to the tolerance for node anomalies.
  • The network cluster includes a plurality of servers, each serving as a node. Acquiring the attribute features of each node of the network cluster within a predetermined time period includes acquiring, for each server, physical hardware fingerprint data, network environment data, node log running-status data, or inter-node interaction data.
  • The physical hardware fingerprint data is, for example, whether the servers have the same server version or chip model.
  • The network environment data is, for example, the IP segment in which the server is located.
  • The node log running-status data is, for example, the running status of the node.
  • The inter-node interaction data is, for example, network requests between nodes and task assignments between nodes. A multi-attribute dynamic undirected graph is then formed from these attribute features.
  • Feature relationship operators are used in the undirected graph at each level to convert the different attribute features of the edges into numerical representations.
  • The feature relationship operator is: summing the attribute features by time period, testing the attribute features for equality, or taking the logarithm of the attribute features.
  • The attribute edge is a multi-attribute edge, and applying a feature relationship operator to the attribute features to obtain the feature vector of the attribute edge includes: calculating the different attribute features of the attribute edge under their respective feature relationship operators, and combining the calculation results with the attribute features to form the feature vector of the attribute edge.
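As an illustration of the three operators named above (summation by time period, equality, logarithm), the sketch below turns the attributes of one multi-attribute edge into a numeric feature vector. The attribute names, the sample values, the pairing of operators with attributes, and the use of log(1 + total) are all assumptions made for the example. The sketch collects only the operator results and leaves out the step of combining them with the raw attribute features.

```python
import math

# Hypothetical raw attributes observed on the edge between nodes a and b.
edge_attributes = {
    "network_request": [3, 0, 5],          # requests per time period
    "task_allocation": [1, 1, 0],          # task assignments per time period
    "chip_model": ("x86-v3", "x86-v3"),    # value on node a, value on node b
}

def sum_by_period(values):
    """Summation operator: total over all time periods."""
    return float(sum(values))

def log_of_total(values):
    """Logarithm operator, using log(1 + total) so that a zero total is defined."""
    return math.log1p(sum(values))

def equal(pair):
    """Equality operator: 1.0 if both nodes hold the same value, else 0.0."""
    return 1.0 if pair[0] == pair[1] else 0.0

# Illustrative pairing of each attribute with one operator.
operators = {
    "network_request": sum_by_period,
    "task_allocation": log_of_total,
    "chip_model": equal,
}

def edge_feature_vector(attrs, ops):
    """Apply each attribute's operator and collect the results in a fixed order."""
    return [ops[name](attrs[name]) for name in sorted(attrs)]

vector = edge_feature_vector(edge_attributes, operators)
```

Sorting the attribute names fixes the component order, so vectors computed for different edges remain comparable position by position.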
  • A graph-related metric algorithm is used to calculate the different metrics of each node.
  • Various graph-related node metrics can be used, such as weighted edge metrics, subgraph structure metrics such as the egonet, and overall graph structure metrics such as community membership, to form the basic representation vector of each node.
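A minimal sketch of how such metrics might be assembled into a basic representation vector per node is shown below. It computes three simple quantities: degree, weighted degree, and the number of edges inside the node's egonet. The graph, the weights, and the choice of these three metrics are assumptions for illustration; a community-membership metric is omitted because it would require a community detection algorithm that the text does not specify.

```python
# Hypothetical weighted undirected graph: {(u, v): weight}, each edge listed once.
weights = {("a", "b"): 2.0, ("a", "c"): 1.0, ("b", "c"): 0.5, ("c", "d"): 3.0}

def adjacency(weights):
    """Build a symmetric adjacency map {node: {neighbour: weight}}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def node_metrics(adj, node):
    """Return [degree, weighted degree, number of edges inside the egonet]."""
    neighbours = set(adj[node])
    ego = neighbours | {node}
    # Count each undirected edge inside the egonet exactly once (u < v).
    ego_edges = sum(1 for u in ego for v in adj[u] if v in ego and u < v)
    return [float(len(neighbours)), sum(adj[node].values()), float(ego_edges)]

adj = adjacency(weights)
metrics = {n: node_metrics(adj, n) for n in sorted(adj)}
```

Each entry of `metrics` plays the role of one basic representation vector; further metrics can be appended as extra components without changing the structure.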
  • Using a predetermined training algorithm to train the feature vectors of each node includes: using a deep graph node embedding (Deep Graph Embedding) training algorithm to train the feature vectors of each node to obtain a set of feature representations of each node.
  • The reconstruction-based models most commonly used in deep learning are the autoencoder and the restricted Boltzmann machine (RBM). Both are trained by minimizing a reconstruction error.
  • The former is trained by value-based reconstruction error minimization; the latter by distribution-based reconstruction error minimization.
  • A reconstruction error is calculated using a deep autoencoder model to obtain the abnormal offset value of the set of feature vectors of each node.
  • FIG. 2 is a schematic flowchart of a second embodiment of a node anomaly detection method based on a graph algorithm in this application.
  • The method uses a multi-attribute, multi-level dynamic graph algorithm to perform node anomaly detection. First, the attribute features are acquired; next, the graph data is composed according to the relevant request dependencies or connection properties; then the resulting graph is partitioned into corresponding subgraph structures according to the node attributes or similar connection properties (for example, using a matrix decomposition algorithm); finally, various statistical metrics of each node (such as its k-core number) are calculated from the node's feature attributes, the subgraph structure to which the node belongs, and the original overall graph structure. The metrics of a node are compared with the metrics of the other nodes in the overall graph structure, and the offset is calculated to obtain an outlier measure.
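The k-core number mentioned above as an example of a statistical node metric can be computed with the usual peeling procedure: repeatedly remove the node of smallest remaining degree. The sketch below is a simple quadratic-time version written for clarity; the example graph and the function name are hypothetical.

```python
def core_numbers(adj):
    """Return {node: k-core number} for an undirected graph {node: set(neighbours)}."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off the node with the smallest remaining degree.
        n = min(remaining, key=lambda m: degree[m])
        k = max(k, degree[n])  # core numbers never decrease along the peeling order
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core

# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cores = core_numbers(adj)
```

Here the triangle nodes end up with core number 2 and the pendant node with 1, so a node whose core number differs sharply from its neighbours' would stand out when the metrics are compared.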
  • Nodes a and b and the attribute edge connecting a and b are used as an example in the description below.
  • The attribute features of each node at each level are acquired. For example, if node a initiates a network request to node b, a and b can be taken as nodes and the network request action as an attribute, creating nodes a and b and the attribute edge e_ab in the attribute graph. Attribute edges can be multi-attribute; for example, there may be several attribute features between nodes a and b, such as a task allocation action in addition to the network request. When there are more nodes and more attribute edges, they are likewise composed into a graph structure according to the relevant connection properties. FIG. 2 shows the flow for two levels (level 1 is S201-S204 and level 2 is S201'-S204'). Other embodiments are not limited to two levels and may have any number of levels.
  • Feature relationship operators are used in the undirected graph at each level to convert the different attribute features of the edges into numerical representations.
  • Feature relationship operators can be summation by time period, equality, logarithm, and so on. Taking the action attribute edge for requests from server node a to node b as an example, the network request action and the task allocation action between nodes a and b, together with their respective results under the feature relationship operators, constitute the feature vector representation of the attribute edge ( ⁇ 1 , ⁇ 2 , ..., ⁇ n ).
  • Various statistical metrics of each node are calculated from the node's feature attributes, the subgraph structure to which the node belongs, and the original overall graph structure.
  • Various graph-related node metrics are used for the nodes at each level, such as weighted edge metrics, subgraph structure metrics such as the egonet, and overall graph structure metrics such as community membership, to form the basic representation vector of each node.
  • Taking the attribute edge e_ab as an example, the different metrics of the nodes are calculated from the feature vector representation ( ⁇ 1 , ⁇ 2 , ..., ⁇ n ) of e_ab, giving a set of feature vectors for node a (or node b). That is, each node corresponds to a set of multiple feature vectors.
  • S205: Perform joint training on the encodings of all levels to obtain a comprehensive feature representation and offset value.
  • The feature vector representations of the levels are treated as different feature granularities, and the encodings of the levels are connected to train the overall encoding model; for example, the offset of the first level is connected with the offset of the second level, and the reconstruction error from the overall training is taken as the overall offset.
  • The calculated overall offset is compared with a preset threshold. If the overall offset is greater than the preset threshold, the node is determined to be abnormal.
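The final decision step can be sketched as follows: the per-level representations of a node are concatenated, passed through an overall model, and the resulting reconstruction error is compared with a preset threshold. The function `toy_reconstruct` is a hypothetical placeholder for a trained overall encoding model, and the threshold of 0.5 is simply one value within the 0.1 to 1.0 range quoted earlier; neither comes from the application.

```python
def overall_offset(level_vectors, reconstruct):
    """Concatenate per-level representations and return the reconstruction MSE."""
    joined = [value for vector in level_vectors for value in vector]
    restored = reconstruct(joined)
    return sum((a - b) ** 2 for a, b in zip(joined, restored)) / len(joined)

def is_abnormal(offset, threshold=0.5):
    """A node is flagged when its overall offset exceeds the preset threshold."""
    return offset > threshold

# Hypothetical stand-in for a trained overall encoding model: it restores every
# component to 1.0, as if all typical nodes had representations near 1.0.
def toy_reconstruct(vector):
    return [1.0] * len(vector)

node_levels = {
    "a": [[1.0, 1.0], [1.0, 1.0]],   # level 1 and level 2 representations
    "b": [[1.0, 3.0], [1.0, 1.0]],
}
offsets = {n: overall_offset(v, toy_reconstruct) for n, v in node_levels.items()}
flagged = [n for n, off in sorted(offsets.items()) if is_abnormal(off)]
```

Passing the model in as a callable keeps the thresholding logic independent of how the overall encoding model is actually trained.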
  • The present application also provides a node anomaly detection device based on a graph algorithm.
  • FIG. 3 is a schematic structural diagram of a first embodiment of a node anomaly detection device based on a graph algorithm according to the present application.
  • The node anomaly detection device 30 includes a processor 301.
  • The processor 301 is configured to acquire the attribute features of each node in a network cluster within a predetermined time period, establish edge connections using similarity measures of the attribute features, and connect the nodes to form an undirected graph. The processor 301 is further configured to apply a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges; to calculate different metrics of each node to obtain a set of feature vectors for each node; to use the feature vectors of each node as different feature channels and train them with a predetermined training algorithm to obtain a set of feature representations of each node; and to calculate a reconstruction error using a predetermined autoencoder model, obtain the abnormal offset value of the set of feature vectors of each node, and determine from the abnormal offset value whether the node is abnormal.
  • The undirected graph is a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities.
  • The processor 301 is further configured to connect and train the encodings of the levels to obtain an overall encoding model, and to calculate the reconstruction error using the overall encoding model to obtain the overall offset of each node.
  • The processor 301 is further configured to compare the overall offset with a preset threshold and, if the overall offset is greater than the preset threshold, determine that the node is abnormal.
  • The node anomaly detection device 30 can be used to execute the above node anomaly detection method based on a graph algorithm and has the corresponding beneficial effects.
  • The device may be a standalone device separate from the server, or may be a module or processing unit in the server.
  • FIG. 4 is a schematic structural diagram of a second embodiment of a node anomaly detection device based on a graph algorithm of the present application.
  • The node anomaly detection device 40 is a module in the server, and specifically includes an acquisition module 401, a first calculation module 402, a second calculation module 403, a training module 404, and an offset calculation module 405.
  • The acquisition module 401 is configured to acquire the attribute features of each node of the network cluster within a predetermined time period, establish edge connections using similarity measures of the attribute features, and connect the nodes to form an undirected graph.
  • The first calculation module 402 is configured to apply a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges.
  • The second calculation module 403 is configured to calculate different metrics of each node to obtain a set of feature vectors of each node.
  • The training module 404 is configured to use the feature vectors of each node as different feature channels, and use a predetermined training algorithm to train the feature vectors of each node to obtain a set of feature representations of each node.
  • The offset calculation module 405 is configured to calculate a reconstruction error using a predetermined autoencoder model, obtain the abnormal offset value of the set of feature vectors of each node, and determine from the abnormal offset value whether the node is abnormal.
  • The undirected graph has a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities.
  • The node anomaly detection device further includes an overall offset calculation module for connecting and training the encodings of the levels to obtain an overall encoding model, and for calculating the reconstruction error using the overall encoding model to obtain the overall offset of each node.
  • The node anomaly detection device further includes a comparison module configured to compare the overall offset with a preset threshold and, if the overall offset is greater than the preset threshold, determine that the node is abnormal.
  • The node anomaly detection device 40 can be used to execute the above node anomaly detection method based on a graph algorithm and has the corresponding beneficial effects. For the specific processes, refer to the description of the foregoing embodiments; details are not repeated here.
  • FIG. 5 is a schematic structural diagram of a first embodiment of a device with a storage function according to the present application.
  • The storage device 50 stores a program 501, and when the program 501 is executed, the above node anomaly detection method based on a graph algorithm is implemented.
  • The specific working process is the same as in the above method embodiments and is not repeated here.
  • The device having a storage function may be a medium capable of storing program code, such as a USB flash drive, an optical disc, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), or a magnetic disk; it may also be a terminal, a server, or the like.
  • The present application provides a node anomaly detection method, device, and storage device based on a graph algorithm.
  • By connecting the feature representations of the levels to train an overall feature representation and outlier value, it can quickly and efficiently detect nodes with abnormal behavior in each feature dimension, ensuring the performance and security of the cluster.
  • The disclosed systems, devices, and methods may be implemented in other ways.
  • The device implementations described above are only illustrative.
  • The division into modules or units is only a division by logical function.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
  • The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • The functional units in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • The part of the technical solution of the present application that in essence contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to perform all or part of the steps of the method described in each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Virology (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

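A node anomaly detection method, device, and storage device based on a graph algorithm. The method includes: acquiring the attribute features of each node of a network cluster within a predetermined time period, establishing edge connections using similarity measures of the attribute features, and connecting the nodes to form an undirected graph (S101); applying a feature relationship operator to the attribute features to obtain the feature vectors of the attribute edges (S102); calculating different metrics of each node to obtain a set of feature vectors for each node (S103); training the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node (S104); and calculating a reconstruction error using a predetermined autoencoder model to obtain the abnormal offset value of the set of feature vectors of each node, and determining from the abnormal offset value whether the node is abnormal (S105). In this way, the method can quickly and efficiently detect nodes with abnormal behavior.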

Description

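Node anomaly detection method, device, and storage device based on a graph algorithm
[Technical Field]
The present application relates to the field of network communication technology, and in particular to a node anomaly detection method, device, and storage device based on a graph algorithm.
[Background]
In an open network cluster, some malicious nodes may perform port scanning and sniffing, attacks, illegitimate requests, or disguised requests against other nodes in the cluster, creating risks of degraded overall cluster performance, large-scale data leakage, and large-scale failures that render the system unavailable. In long-term research, the inventors of the present application found that, because nodes in an open cluster join from complex environments and their behavior is dynamic, variable, and uncontrollable, detection techniques based on rule matching and supervised learning have difficulty detecting unknown anomalous behavior patterns effectively and in time.
[Summary of the Invention]
The main technical problem solved by the present application is to provide a node anomaly detection method, device, and storage device based on a graph algorithm that can detect nodes with anomalous behavior quickly and efficiently.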
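To solve the above technical problem, one technical solution adopted by the present application is to provide a node anomaly detection method based on a graph algorithm, wherein the method includes: acquiring the attribute features of each node of a network cluster within a predetermined time period, establishing edges according to a similarity measure of the attribute features, and connecting the nodes to form an undirected graph; computing on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges; computing different metrics for each node to obtain a set of feature vectors for each node; training on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node; and computing the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and determining from the anomaly offset value whether the node is anomalous.
To solve the above technical problem, another technical solution adopted by the present application is to provide a node anomaly detection device based on a graph algorithm, wherein the device includes a processor. The processor is configured to acquire the attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph; the processor is further configured to compute on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges; the processor is further configured to compute different metrics for each node to obtain a set of feature vectors for each node; the processor is further configured to use the feature vectors of each node as separate feature channels and train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and the processor is further configured to compute the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and to determine from the anomaly offset value whether the node is anomalous.
To solve the above technical problem, another technical solution adopted by the present application is to provide a node anomaly detection device based on a graph algorithm, wherein the device includes: an acquisition module, configured to acquire the attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph; a first computing module, configured to compute on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges; a second computing module, configured to compute different metrics for each node to obtain a set of feature vectors for each node; a training module, configured to use the feature vectors of each node as separate feature channels and train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and an offset computing module, configured to compute the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and to determine from the anomaly offset value whether the node is anomalous.
To solve the above technical problem, another technical solution adopted by the present application is to provide a device having a storage function, wherein the device stores a program that, when executed, implements the above node anomaly detection method based on a graph algorithm.
The beneficial effects of the present application are as follows. Unlike the prior art, the present application provides a node anomaly detection method, device, and storage device based on a graph algorithm. Based on a graph algorithm and the feature attributes of the nodes, the application computes various statistical metrics for each node, compares each node's metrics with those of other nodes, computes the offset to obtain a measure of the outlier value, and thereby detects whether anomalous nodes exist.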
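[Brief Description of the Drawings]
FIG. 1 is a schematic flowchart of a first embodiment of the graph-algorithm-based node anomaly detection method of the present application;
FIG. 2 is a schematic flowchart of a second embodiment of the graph-algorithm-based node anomaly detection method of the present application;
FIG. 3 is a schematic structural diagram of a first embodiment of the graph-algorithm-based node anomaly detection device of the present application;
FIG. 4 is a schematic structural diagram of a second embodiment of the graph-algorithm-based node anomaly detection device of the present application;
FIG. 5 is a schematic structural diagram of a first embodiment of the device having a storage function of the present application.
[Detailed Description]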
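To make the purpose, technical solutions, and effects of the present application clearer and more definite, the present application is described in further detail below with reference to the accompanying drawings and embodiments.
The present application provides a node anomaly detection method, device, and storage device based on a graph algorithm. By dividing features by attribute and by granularity, graph structures of different levels, that is, a multi-level graph structure, are formed; feature representations and anomaly values are extracted at each level; and the feature representations of the levels are joined to train an overall feature representation and anomaly value. This achieves the goal of quickly and efficiently detecting anomalously behaving nodes across every feature dimension.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the graph-algorithm-based node anomaly detection method of the present application. In this embodiment, the method includes the following steps:
S101: Acquire the attribute features of each node of the network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph.
The present application performs node anomaly detection based on a graph algorithm. In algorithms, a graph is a generalization of a tree. A tree is a top-down data structure in which every node except the root has a parent node, arranged from top to bottom. A graph has no concept of parent and child nodes; all nodes in a graph are peers. Graphs can be classified as undirected graphs (simple connections), directed graphs (connections have direction), weighted graphs (connections carry weights), weighted directed graphs (connections have both direction and weights), and so on. The present application uses an undirected graph for the relevant computations. The attribute features of each node are acquired, and the acquired feature data are organized into a graph structure according to request dependencies or connection properties. Specifically, edges are established according to some similarity measure of the attribute features, forming attribute edges. Examples include equal node attribute features or similarly distributed node attribute features: for instance, the nodes' IP attributes lie in the same IP segment, or there is a network connection or action connection between the nodes (when there is an action connection between two nodes, the same event is produced on both nodes, and this event can be assigned the same value, so that the two attribute features are equal). Because each node's attribute features may differ and change over time, the resulting graph structure is dynamic.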
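As a minimal sketch of step S101, the following Python fragment builds an undirected attribute graph from two of the similarity measures named above: a shared IP segment and an observed interaction. The node records, the /24 prefix length, the attribute labels, and the function names are all assumptions made for illustration; the application does not prescribe them.

```python
import ipaddress
from itertools import combinations

# Hypothetical node records; the ids and fields are illustrative only.
nodes = {
    "a": {"ip": "10.0.1.5"},
    "b": {"ip": "10.0.1.9"},
    "c": {"ip": "10.0.2.7"},
}
# Hypothetical interactions (source, target) observed in the time window.
interactions = [("a", "c")]


def same_segment(ip1, ip2, prefix=24):
    """Return True if both addresses fall in the same /prefix network."""
    net = ipaddress.ip_network(f"{ip1}/{prefix}", strict=False)
    return ipaddress.ip_address(ip2) in net


def build_undirected_graph(nodes, interactions):
    """Map each undirected edge (a frozenset of two node ids) to its attributes."""
    edges = {}
    for u, v in combinations(sorted(nodes), 2):
        if same_segment(nodes[u]["ip"], nodes[v]["ip"]):
            edges.setdefault(frozenset((u, v)), set()).add("same_ip_segment")
    for u, v in interactions:
        if u != v:  # ignore self-interactions
            edges.setdefault(frozenset((u, v)), set()).add("interaction")
    return edges


graph = build_undirected_graph(nodes, interactions)
```

Keying edges by a `frozenset` makes the pair unordered, which matches the undirected graph, and storing a set of labels per edge lets one edge carry several attributes. Rebuilding the graph for each time window would reflect its dynamic nature.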
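S102: Compute on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges.
Different nodes are connected by attribute edges. Specifically, if two nodes interact, the interaction can serve as the similarity measure of the attribute features for establishing an edge; likewise, if two nodes share the same or similar features, these can also serve as the similarity measure for establishing an edge. In other words, the attribute edge connecting two nodes can carry multiple attributes.
In this method, the different attribute features of an edge need to be converted into a numerical representation (for example, a feature vector). Specifically, feature relation operators can be applied to the edge's attribute features to obtain the feature vector of the attribute edge. An operator is a mapping from one function space to another; broadly speaking, any operation applied to a function can be regarded as an operator, for example raising to a power, taking a root, or taking a logarithm.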
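The conversion of edge attributes into a feature vector can be sketched as follows. The attribute names, the sample values, and the use of log(1 + x) rather than a plain logarithm (to avoid log(0)) are assumptions for illustration; only the three operator kinds (summing over time intervals, equality, logarithm) come from the description.

```python
import math

# Hypothetical raw attributes of one attribute edge between nodes a and b:
# per-interval counts of network requests and task assignments, plus one
# event id recorded on each endpoint.
edge_attrs = {
    "requests_per_interval": [3, 0, 5],
    "tasks_per_interval": [1, 1, 0],
    "event_on_a": 42,
    "event_on_b": 42,
}


def sum_over_time(values):
    """Operator: sum an attribute over the time intervals."""
    return float(sum(values))


def equal(x, y):
    """Operator: 1.0 if two attribute values are equal, else 0.0."""
    return 1.0 if x == y else 0.0


def log1p_total(values):
    """Operator: logarithm of the total, computed as log(1 + x)."""
    return math.log1p(sum(values))


def edge_feature_vector(attrs):
    """Apply each operator to its attribute and collect the results."""
    return [
        sum_over_time(attrs["requests_per_interval"]),
        sum_over_time(attrs["tasks_per_interval"]),
        equal(attrs["event_on_a"], attrs["event_on_b"]),
        log1p_total(attrs["requests_per_interval"]),
    ]


vector = edge_feature_vector(edge_attrs)
```

Here `vector` has one component per operator application, so a multi-attribute edge simply yields a longer vector.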
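S103: Compute different metrics for each node to obtain a set of feature vectors for each node.
A node may be connected to multiple attribute edges. Based on the feature vectors of the relevant attribute edges, different metrics are computed for each node and used as the node's base representation vectors. That is, the attributes of the different nodes need to be converted into numerical representations.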
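S104: Train on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node.
Training uses a deep learning algorithm. One of the simplest deep learning approaches exploits the characteristics of artificial neural networks. An artificial neural network (ANN) is itself a system with a hierarchical structure. Given a neural network, if we assume that its output equals its input and then train and adjust its parameters to obtain the weights in each layer, we naturally obtain several different representations of the input I (each layer representing one representation); these representations are the features. Deep learning achieves highly accurate recognition through very deep networks.
S105: Compute the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and determine from the anomaly offset value whether the node is anomalous.
An autoencoder is a neural network that reproduces its input signal as closely as possible; it can also be understood as a system that attempts to restore its original input. The basic principle of its training is to minimize the reconstruction error (defined as the mean squared error between the model output and the original input), so that a deep learning network can be trained without supervision (in effect, the input data serve as the supervision signal).
Reconstruction means recovering the original data from transformed data. Specifically, the input data are multiplied by a matrix to obtain a reduced-dimension result, and the reduced-dimension data are then multiplied by the transpose of that weight matrix to recover an approximation of the original image. In this process, the more similar the images at the input layer and output layer, the better. If the similarity is poor, an offset arises; this yields the anomaly offset value, from which it is determined whether the node is anomalous.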
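The reconstruction step described above (multiply by a weight matrix, then by its transpose, and measure the mean squared error) can be illustrated with a single linear layer with tied weights. This is a sketch under stated assumptions, not the deep autoencoder of the application: the weight matrix `W` is fixed by hand rather than learned, and the sample vectors are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def reconstruction_error(weights, x):
    """Encode x with W, decode with W^T, and return the mean squared error."""
    code = matvec(weights, x)                 # reduce dimension: z = W x
    x_hat = matvec(transpose(weights), code)  # reconstruct: x_hat = W^T z
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Hypothetical 1x2 weight matrix whose single row is the unit vector along (1, 1).
s = 2 ** -0.5
W = [[s, s]]

typical = [1.0, 1.0]   # lies along the direction the model captures
unusual = [1.0, -1.0]  # orthogonal to that direction

err_typical = reconstruction_error(W, typical)
err_unusual = reconstruction_error(W, unusual)
```

The vector that matches the captured direction is reconstructed almost exactly, while the orthogonal one is lost entirely in the code and therefore has a large error, which is the behavior the anomaly offset value relies on.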
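In one embodiment, the undirected graph is a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities. After the anomaly offset value of each node's set of feature vectors is obtained, the method further includes: joining the encodings of the levels for training to obtain an overall encoding model, and computing the reconstruction error with the overall encoding model to obtain the overall offset of each node. Specifically, the undirected graph includes a node set, an edge set, subgraph structures, an overall graph structure, and so on. The edge set, subgraph structures, and overall graph structure belong to different levels: the overall graph structure is at a higher level than the subgraph structures, and the subgraph structures are at a higher level than the edge set. That is, the graph structure is multi-level.
Specifically, the feature representations of different granularities are joined for training to obtain an overall encoding; the joining here can be similar to the residual connections in a deep residual network. In this embodiment, graph structures of different levels are formed from features of different granularities; feature representations and anomaly values are extracted at each level; and the feature representations of the levels are joined to train an overall feature representation and anomaly value. This achieves the goal of quickly and efficiently detecting anomalously behaving nodes across every feature dimension.
In one embodiment, the overall offset is compared with a preset threshold; if the overall offset is greater than the preset threshold, the node is determined to be anomalous. The preset threshold can be any value from 0.1 to 1.0, set according to the tolerance for node anomalies.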
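The threshold comparison can be sketched in a few lines. The offsets below are hypothetical values, and the default threshold of 0.5 is simply one choice within the 0.1 to 1.0 range mentioned above.

```python
def flag_anomalies(offsets, threshold=0.5):
    """Return the ids of nodes whose overall offset exceeds the threshold."""
    return sorted(node for node, offset in offsets.items() if offset > threshold)


# Hypothetical overall offsets per node.
overall_offsets = {"a": 0.05, "b": 0.72, "c": 0.31}
flagged = flag_anomalies(overall_offsets, threshold=0.5)
```

Lowering the threshold flags more nodes, which corresponds to a lower tolerance for anomalies; the comparison is strict, so an offset exactly equal to the threshold is not flagged.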
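In one embodiment, the network cluster includes multiple servers, each server serving as a node. Acquiring the attribute features of each node of the network cluster within the predetermined time period includes: acquiring each server's physical hardware fingerprint data, network environment data, node log running-status data, or inter-node interaction data. The physical hardware fingerprint data indicate, for example, that servers share the same server version or chip model; the network environment data are, for example, the server's IP segment; the node log running-status data are, for example, the node's operating state; and the inter-node interaction data are, for example, network requests between nodes and task assignments between nodes. A multi-attribute, dynamic undirected graph is then formed from these attribute features.
In one embodiment, feature relation operators are applied in the undirected graph of each level to convert the different attribute features of the edges into numerical representations. The feature relation operators are: summing the attribute feature over time intervals, testing attribute features for equality, taking the logarithm of the attribute feature, and so on. The attribute edges are multi-attribute edges, and computing on the attribute features with the feature relation operators to obtain the feature vectors of the attribute edges includes: computing each attribute feature of the attribute edge under its own feature relation operator, and combining the computed results with the attribute features to form the feature vector of the attribute edge.
In one embodiment, graph-related metric algorithms are used to compute the different metrics of each node. For example, various graph-related node metrics can be used, such as weighted edge metrics, subgraph structure metrics such as the egonet, and overall graph structure metrics such as community membership, to form each node's base representation vectors.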
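To make the node metrics concrete, the sketch below computes three of them for a small weighted undirected graph: the degree, the weighted degree (a weighted edge metric), and the number of edges in the node's egonet (a subgraph structure metric). The graph, the weights, and the choice of these three metrics are assumptions for illustration; community membership is omitted because it requires a community detection algorithm that the description does not specify.

```python
from itertools import combinations

# Hypothetical weighted undirected graph: edge (u, v) -> weight.
weighted_edges = {
    ("a", "b"): 2.0,
    ("a", "c"): 1.0,
    ("b", "c"): 4.0,
    ("c", "d"): 0.5,
}


def adjacency(edges):
    """Build a symmetric adjacency map: node -> {neighbor: weight}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def node_metrics(adj, node):
    """Return [degree, weighted degree, egonet edge count] for one node."""
    neighbors = adj[node]
    degree = len(neighbors)
    weighted_degree = sum(neighbors.values())
    # Egonet: the node, its neighbors, and every edge among them.
    among_neighbors = sum(
        1 for u, v in combinations(sorted(neighbors), 2) if v in adj[u]
    )
    return [float(degree), weighted_degree, float(degree + among_neighbors)]


adj = adjacency(weighted_edges)
base_vectors = {n: node_metrics(adj, n) for n in sorted(adj)}
```

Each entry of `base_vectors` is one base representation vector of the kind described above; further metrics would extend the list.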
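In one embodiment, training on the feature vectors of each node with the predetermined training algorithm includes: training on the feature vectors of each node with a Deep Graph Embedding training algorithm to obtain a set of feature representations for each node.
In one embodiment, the deep learning models that most commonly use reconstruction are the autoencoder and the restricted Boltzmann machine (RBM). Training of both models is based on minimizing the reconstruction error. The former is trained by minimizing a value-based reconstruction error, whereas the latter is trained by minimizing a distribution-based reconstruction error. In this embodiment, a deep autoencoder model is used to compute the reconstruction error, yielding the anomaly offset value of each node's set of feature vectors.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a second embodiment of the graph-algorithm-based node anomaly detection method of the present application. In this embodiment, the method uses a multi-attribute, multi-level dynamic graph algorithm for node anomaly detection. First, the attribute features are acquired; next, the feature data are organized into a graph structure according to request dependencies or connection properties; then the graph is partitioned into corresponding subgraph structures according to properties such as node attributes or similar connections (for example, using a matrix factorization algorithm); finally, various statistical metrics of each node (such as the k-core number) are computed from the node's feature attributes, the subgraph structure to which the node belongs, and the original overall graph structure, and the node's metrics are compared with those of its neighbor nodes, of the other nodes in its subgraph structure, and of the other nodes in the overall graph structure to compute the offset and obtain a measure of the outlier value.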
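The last stage of this flow, computing a statistical metric such as the k-core number and comparing each node with its neighbors, can be sketched as follows. The graph is hypothetical, the k-core computation uses a simple peeling procedure, and the absolute gap to the neighbor mean is only one assumed way to express the offset; the application does not fix a formula.

```python
def core_numbers(adj):
    """k-core number of every node, by repeatedly peeling the minimum-degree node."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])  # core numbers never decrease while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def neighbor_deviation(adj, metric):
    """Absolute gap between each node's metric and the mean over its neighbors."""
    return {
        n: abs(metric[n] - sum(metric[m] for m in nbrs) / len(nbrs))
        for n, nbrs in adj.items()
        if nbrs
    }


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
core = core_numbers(adj)
deviation = neighbor_deviation(adj, core)
```

In this example the pendant node `d` has the largest deviation because its core number differs most from that of its only neighbor. The same comparison could be repeated against the other nodes of the subgraph and of the whole graph.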
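In one application scenario, nodes a and b and the attribute edge connecting a and b are taken as an example.
S201: Acquire the attribute features of each node and form the graph structure according to connection properties.
The attribute features of each node at each level are acquired separately. For example, if node a issues a network request to node b, then a and b can be taken as nodes and the network request action as an attribute, establishing nodes a and b and the attribute edge e_ab in the attribute graph. An attribute edge can carry multiple attributes; for example, nodes a and b may also share further attribute features such as a task assignment action. When there are more nodes and more attribute edges, they are likewise connected according to the relevant connection properties to form the graph structure. FIG. 2 shows the flow for two levels (S201-S204 for level 1 and S201'-S204' for level 2); other embodiments are not limited to two levels and may have any number of levels.
S202: Extract the edge-attribute graph features of the nodes.
Feature relation operators are applied in the undirected graph of each level to convert the different attribute features of the edges into numerical representations. The feature relation operators can be summing over time intervals, equality, taking the logarithm, and so on. Taking the action attribute edge for requests from server node a to b as an example, the network request action and task assignment action between nodes a and b, together with their respective results under the feature relation operators, form the feature vector representation (υ₁, υ₂, …, υₙ) of this attribute edge.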
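S203: Compute the nodes' feature attributes and related statistical metrics.
At each level, various statistical metrics of each node are computed from the node's feature attributes, the subgraph structure to which the node belongs, and the original overall graph structure. Specifically, for the nodes of each level, various graph-related node metrics are used, such as weighted edge metrics, subgraph structure metrics such as the egonet, and overall graph structure metrics such as community membership, to form each node's base representation vectors. Taking attribute edge e_ab as an example, different node metrics are computed from the feature vector representation (υ₁, υ₂, …, υₙ) of attribute edge e_ab, yielding a set of feature vectors for node a (or node b)
Figure PCTCN2018103052-appb-000001
That is, one node corresponds to a set of multiple feature vectors.
S204: Train on the node representations to obtain the anomaly offset values of the node feature vectors.
The different feature vector representations of the graph nodes at each level are used as separate feature channels for training with the Deep Graph Embedding training algorithm. For example, using the feature vector
Figure PCTCN2018103052-appb-000002
as a feature channel for training yields the node's feature representation
Figure PCTCN2018103052-appb-000003
The other feature vectors are trained in the same way, yielding a set of feature representations for the node
Figure PCTCN2018103052-appb-000004
A deep autoencoder model (Deep AutoEncoder) is then used to compute the reconstruction error as the offset of the feature representations
Figure PCTCN2018103052-appb-000005
which is the anomaly offset value of this set of feature vectors.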
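S205: Join the encodings of the levels for training to obtain a combined feature representation and offset value.
The feature vector representations of the levels are treated as different feature granularities, and the encodings of the levels are joined to train an overall encoding model. For example, the offset of the first level
Figure PCTCN2018103052-appb-000006
and the offset of the second level
Figure PCTCN2018103052-appb-000007
are joined, and the reconstruction error from the overall training is taken as the overall offset.
The computed offset is compared with a preset threshold; if the overall offset is greater than the preset threshold, the node is determined to be anomalous.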
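Step S205 can be sketched by concatenating the per-level vectors of each node and measuring how well an overall model reconstructs the joined vector. In this sketch a rank-1 linear model fitted by power iteration stands in for the trained overall encoding model; that substitution, the per-level encodings, and all function names are assumptions for illustration and are not taken from the application.

```python
def concat_levels(level1, level2):
    """Join the per-level encodings of each node into one vector."""
    return {n: level1[n] + level2[n] for n in level1}


def principal_direction(rows, iterations=100):
    """Leading direction of the (uncentered) data, found by power iteration."""
    dim = len(rows[0])
    v = [1.0] * dim
    for _ in range(iterations):
        # One power-iteration step: w = X^T (X v), then normalize.
        proj = [sum(r[i] * v[i] for i in range(dim)) for r in rows]
        w = [sum(p * r[i] for p, r in zip(proj, rows)) for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def overall_offset(x, direction):
    """Mean squared error after projecting x onto the direction and back."""
    z = sum(a * b for a, b in zip(x, direction))
    return sum((a - z * d) ** 2 for a, d in zip(x, direction)) / len(x)


# Hypothetical encodings: level 1 is two-dimensional, level 2 one-dimensional.
level1 = {"a": [1.0, 1.0], "b": [2.0, 2.0], "c": [3.0, 3.0], "d": [1.0, -1.0]}
level2 = {"a": [1.0], "b": [2.0], "c": [3.0], "d": [0.0]}

joined = concat_levels(level1, level2)
direction = principal_direction([joined[n] for n in sorted(joined)])
offsets = {n: overall_offset(x, direction) for n, x in joined.items()}
```

Nodes a, b, and c follow the dominant pattern of the joined vectors and have offsets near zero, whereas node d does not and has a clearly larger offset, which is then the value compared with the preset threshold.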
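In the above scheme, graph structures of different levels are formed from features of different granularities; feature representations and anomaly values are extracted at each level; and the feature representations of the levels are joined to train an overall feature representation and anomaly value. This achieves the goal of quickly and efficiently detecting anomalously behaving nodes across every feature dimension, safeguarding the performance and security of the cluster.
Based on the above method, the present application also provides a node anomaly detection device based on a graph algorithm. Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a first embodiment of the graph-algorithm-based node anomaly detection device of the present application. In this embodiment, the node anomaly detection device 30 includes a processor 301. The processor 301 is configured to acquire the attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph; the processor 301 is further configured to compute on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges; the processor 301 is further configured to compute different metrics for each node to obtain a set of feature vectors for each node; the processor 301 is further configured to use the feature vectors of each node as separate feature channels and train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and the processor 301 is further configured to compute the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and to determine from the anomaly offset value whether the node is anomalous.
In one embodiment, the undirected graph is a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities. The processor 301 is further configured to join the encodings of the levels for training to obtain an overall encoding model, and to compute the reconstruction error with the overall encoding model to obtain the overall offset of each node.
In one embodiment, the processor is further configured to compare the overall offset with a preset threshold; if the overall offset is greater than the preset threshold, the node is determined to be anomalous.
The node anomaly detection device 30 can be used to carry out the above graph-algorithm-based node anomaly detection method to detect nodes, with the corresponding beneficial effects; for the specific process, see the description of the embodiments above, which is not repeated here. The device may be a standalone device independent of the servers, or a module or processing unit within a server.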
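Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a second embodiment of the graph-algorithm-based node anomaly detection device of the present application. In this embodiment, the node anomaly detection device 40 is a module within a server and specifically includes an acquisition module 401, a first computing module 402, a second computing module 403, a training module 404, and an offset computing module 405.
The acquisition module 401 is configured to acquire the attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph.
The first computing module 402 is configured to compute on the attribute features with feature relation operators to obtain the feature vectors of the attribute edges.
The second computing module 403 is configured to compute different metrics for each node to obtain a set of feature vectors for each node.
The training module 404 is configured to use the feature vectors of each node as separate feature channels and train on them with a predetermined training algorithm to obtain a set of feature representations for each node.
The offset computing module 405 is configured to compute the reconstruction error with a predetermined autoencoder model to obtain the anomaly offset value of each node's set of feature vectors, and to determine from the anomaly offset value whether the node is anomalous.
In one embodiment, the undirected graph is a multi-level graph structure, and the feature vectors of different levels are treated as different feature granularities. The node anomaly detection device further includes an overall offset computing module, configured to join the encodings of the levels for training to obtain an overall encoding model, and to compute the reconstruction error with the overall encoding model to obtain the overall offset of each node.
In one embodiment, the node anomaly detection device further includes a comparison module, configured to compare the overall offset with a preset threshold; if the overall offset is greater than the preset threshold, the node is determined to be anomalous. The node anomaly detection device 40 can be used to carry out the above graph-algorithm-based node anomaly detection method to detect nodes, with the corresponding beneficial effects; for the specific process, see the description of the embodiments above, which is not repeated here.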
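The present application also provides a device having a storage function. Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a first embodiment of the device having a storage function of the present application. In this embodiment, the storage device 50 stores a program 501 that, when executed, implements the above graph-algorithm-based node anomaly detection method. The specific working process is the same as in the method embodiments above and is not repeated here; for details, see the description of the corresponding method steps above. The device having a storage function may be a portable storage medium such as a USB flash drive, an optical disc, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or any other medium that can store program code; it may also be a terminal, a server, or the like.
In the above scheme, the present application provides a node anomaly detection method, device, and storage device based on a graph algorithm. Graph structures of different levels are formed from features of different granularities; feature representations and anomaly values are extracted at each level; and the feature representations of the levels are joined to train an overall feature representation and anomaly value. This achieves the goal of quickly and efficiently detecting anomalously behaving nodes across every feature dimension, safeguarding the performance and security of the cluster.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application.
The above are only embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.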

Claims (15)

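  1. A node anomaly detection method based on a graph algorithm, wherein the method comprises:
    acquiring attribute features of each node of a network cluster within a predetermined time period, establishing edges according to a similarity measure of the attribute features, and connecting the nodes to form an undirected graph;
    computing on the attribute features with feature relation operators to obtain feature vectors of attribute edges;
    computing different metrics for each node to obtain a set of feature vectors for each node;
    training on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node;
    computing a reconstruction error with a predetermined autoencoder model to obtain an anomaly offset value of each node's set of feature vectors, and determining from the anomaly offset value whether the node is anomalous.
  2. The node anomaly detection method based on a graph algorithm according to claim 1, wherein the undirected graph is a multi-level graph structure, the feature vectors of different levels are treated as different feature granularities, and after the anomaly offset value of each node's set of feature vectors is obtained, the method further comprises:
    joining the encodings of the levels for training to obtain an overall encoding model, and computing a reconstruction error with the overall encoding model to obtain an overall offset of each node.
  3. The node anomaly detection method based on a graph algorithm according to claim 2, wherein after the overall offset of each node is obtained, the method further comprises:
    comparing the overall offset with a preset threshold, and if the overall offset is greater than the preset threshold, determining that the node is anomalous.
  4. The node anomaly detection method based on a graph algorithm according to claim 1, wherein the attribute edges are multi-attribute edges, and computing on the attribute features with the feature relation operators to obtain the feature vectors of the attribute edges comprises:
    computing each attribute feature of the attribute edge under its own feature relation operator, and combining the computed results with the attribute features to form the feature vector of the attribute edge.
  5. The node anomaly detection method based on a graph algorithm according to claim 1, wherein the feature relation operators comprise: summing the attribute feature over time intervals, testing the attribute features for equality, or taking the logarithm of the attribute feature.
  6. The node anomaly detection method based on a graph algorithm according to claim 1, wherein the network cluster comprises multiple servers, each server serving as a node, and acquiring the attribute features of each node of the network cluster within the predetermined time period comprises:
    acquiring each server's physical hardware fingerprint data, network environment data, node log running-status data, or inter-node interaction data.
  7. The node anomaly detection method based on a graph algorithm according to claim 1, wherein computing different metrics for each node to obtain a set of feature vectors for each node comprises:
    computing the different metrics of each node with graph-related metric algorithms to obtain a set of feature vectors for each node.
  8. The node anomaly detection method based on a graph algorithm according to claim 7, wherein the graph-related metric algorithms comprise: weighted metrics of the attribute edges, subgraph structure metrics, or overall graph structure metrics.
  9. The node anomaly detection method based on a graph algorithm according to claim 1, wherein training on the feature vectors of each node with the predetermined training algorithm comprises:
    training on the feature vectors of each node with a deep graph node embedding training algorithm to obtain a set of feature representations for each node.
  10. The node anomaly detection method based on a graph algorithm according to claim 1, wherein computing the reconstruction error with the predetermined autoencoder model comprises:
    computing the reconstruction error with a deep autoencoder model to obtain the anomaly offset value of each node's set of feature vectors.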
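  11. A node anomaly detection device based on a graph algorithm, wherein the device comprises a processor, the processor being configured to acquire attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph;
    the processor is further configured to compute on the attribute features with feature relation operators to obtain feature vectors of attribute edges;
    the processor is further configured to compute different metrics for each node to obtain a set of feature vectors for each node;
    the processor is further configured to use the feature vectors of each node as separate feature channels and train on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node;
    the processor is further configured to compute a reconstruction error with a predetermined autoencoder model to obtain an anomaly offset value of each node's set of feature vectors, and to determine from the anomaly offset value whether the node is anomalous.
  12. The node anomaly detection device based on a graph algorithm according to claim 11, wherein the undirected graph is a multi-level graph structure, the feature vectors of different levels are treated as different feature granularities, and the processor is further configured to join the encodings of the levels for training to obtain an overall encoding model, and to compute a reconstruction error with the overall encoding model to obtain an overall offset of each node.
  13. The node anomaly detection device based on a graph algorithm according to claim 12, wherein the processor is further configured to compare the overall offset with a preset threshold, and if the overall offset is greater than the preset threshold, to determine that the node is anomalous.
  14. A node anomaly detection device based on a graph algorithm, wherein the device comprises:
    an acquisition module, configured to acquire attribute features of each node of a network cluster within a predetermined time period, establish edges according to a similarity measure of the attribute features, and connect the nodes to form an undirected graph;
    a first computing module, configured to compute on the attribute features with feature relation operators to obtain feature vectors of attribute edges;
    a second computing module, configured to compute different metrics for each node to obtain a set of feature vectors for each node;
    a training module, configured to use the feature vectors of each node as separate feature channels and train on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node;
    an offset computing module, configured to compute a reconstruction error with a predetermined autoencoder model to obtain an anomaly offset value of each node's set of feature vectors.
  15. A device having a storage function, wherein the device stores a program that, when executed, implements the node anomaly detection method based on a graph algorithm according to any one of claims 1 to 10.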
PCT/CN2018/103052 2018-08-29 2018-08-29 Node anomaly detection method, device, and storage device based on a graph algorithm WO2020042024A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
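CN201880002427.1A CN109844749B (zh) 2018-08-29 2018-08-29 Node anomaly detection method, device, and storage device based on a graph algorithm
PCT/CN2018/103052 WO2020042024A1 (zh) 2018-08-29 2018-08-29 Node anomaly detection method, device, and storage device based on a graph algorithm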

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/103052 WO2020042024A1 (zh) 2018-08-29 2018-08-29 Node anomaly detection method, device, and storage device based on a graph algorithm

Publications (1)

Publication Number Publication Date
WO2020042024A1 (zh) 2020-03-05

Family

ID=66883766

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103052 WO2020042024A1 (zh) 2018-08-29 2018-08-29 Node anomaly detection method, device, and storage device based on a graph algorithm

Country Status (2)

Country Link
CN (1) CN109844749B (zh)
WO (1) WO2020042024A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
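CN111612300A (zh) * 2020-04-16 2020-09-01 国网甘肃省电力公司信息通信公司 Method and system for computing scene anomaly perception indicators based on a deep hybrid cloud model
CN112837078A (zh) * 2021-03-03 2021-05-25 万商云集(成都)科技股份有限公司 Cluster-based method for detecting anomalous user behavior
US20220116782A1 (en) * 2020-10-08 2022-04-14 Qatar Foundation For Education, Science And Community Development Compromised mobile device detection system and method
CN114401136A (zh) * 2022-01-14 2022-04-26 天津大学 Fast anomaly detection method for multiple attributed networks
CN115278687A (zh) * 2022-07-27 2022-11-01 联通(山东)产业互联网有限公司 Method for detecting phone number fraud based on a spatio-temporal network and graph algorithms
CN115908574A (zh) * 2023-02-28 2023-04-04 深圳联和智慧科技有限公司 Method and system for locating and pushing river embankment encroachment based on UAV monitoring
US11640388B2 (en) 2021-04-30 2023-05-02 International Business Machines Corporation Cognitive data outlier pre-check based on data lineage
CN116760583A (zh) * 2023-06-02 2023-09-15 四川大学 Enhanced graph node behavior representation and anomalous graph node detection method
CN117851959A (zh) * 2024-03-07 2024-04-09 中国人民解放军国防科技大学 FHGS-based method, apparatus, and device for dynamic network subgraph anomaly detection
CN117851959B (zh) * 2024-03-07 2024-05-28 中国人民解放军国防科技大学 FHGS-based method, apparatus, and device for dynamic network subgraph anomaly detection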

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
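CN110473083B (zh) * 2019-07-08 2023-07-14 创新先进技术有限公司 Tree-structured risk account identification method, apparatus, server, and storage medium
CN110826914A (zh) * 2019-11-07 2020-02-21 陕西师范大学 Difference-based study group grouping method
CN110933105B (zh) * 2019-12-13 2021-10-22 中国电子科技网络信息安全有限公司 Web attack detection method, system, medium, and device
CN111107107B (zh) * 2019-12-31 2022-03-29 奇安信科技集团股份有限公司 Network behavior detection method, apparatus, computer device, and storage medium
CN111770047B (zh) * 2020-05-07 2022-09-23 拉扎斯网络科技(上海)有限公司 Anomalous group detection method, apparatus, and device
CN111885000B (zh) * 2020-06-22 2022-06-21 网宿科技股份有限公司 Network attack detection method, system, and apparatus based on a graph neural network
CN111953535B (zh) * 2020-07-31 2023-06-09 鹏城实验室 Network fault localization method, terminal, and storage medium
CN112202630A (zh) * 2020-09-16 2021-01-08 中盈优创资讯科技有限公司 Method and apparatus for network quality anomaly detection based on an unsupervised model
CN113190790B (zh) * 2021-03-30 2023-05-30 桂林电子科技大学 Time-varying graph signal reconstruction method based on multiple shift operators
CN114445639A (zh) * 2022-01-06 2022-05-06 深圳市检验检疫科学研究院 Dynamic graph anomaly detection method based on dual self-attention
WO2023178467A1 (en) * 2022-03-21 2023-09-28 Qualcomm Incorporated Energy-efficient anomaly detection and inference on embedded systems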

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
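CN102158372A (zh) * 2011-04-14 2011-08-17 哈尔滨工程大学 Anomaly detection method for distributed systems
CN103888304A (zh) * 2012-12-19 2014-06-25 华为技术有限公司 Anomaly detection method for multi-node applications and related apparatus
CN106254175A (zh) * 2016-07-26 2016-12-21 北京蓝海讯通科技股份有限公司 Cluster anomalous node detection method, application, and computing device
CN107786388A (zh) * 2017-09-26 2018-03-09 西安交通大学 Anomaly detection system based on large-scale network flow data
WO2018131219A1 (ja) * 2017-01-11 2018-07-19 株式会社東芝 Anomaly detection device, anomaly detection method, and storage medium
CN108345901A (zh) * 2018-01-17 2018-07-31 同济大学 Signed graph node classification method based on an autoencoder neural network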

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
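CN103713628B (zh) * 2013-12-31 2017-01-18 上海交通大学 Fault diagnosis method based on signed directed graphs and data reconstruction
CN107340456B (zh) * 2017-05-25 2019-12-03 国家电网有限公司 Intelligent identification method for distribution network operating conditions based on multi-feature analysis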

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
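CN111612300B (zh) * 2020-04-16 2023-10-27 国网甘肃省电力公司信息通信公司 Method and system for computing scene anomaly perception indicators based on a deep hybrid cloud model
CN111612300A (zh) * 2020-04-16 2020-09-01 国网甘肃省电力公司信息通信公司 Method and system for computing scene anomaly perception indicators based on a deep hybrid cloud model
US20220116782A1 (en) * 2020-10-08 2022-04-14 Qatar Foundation For Education, Science And Community Development Compromised mobile device detection system and method
CN112837078A (zh) * 2021-03-03 2021-05-25 万商云集(成都)科技股份有限公司 Cluster-based method for detecting anomalous user behavior
CN112837078B (zh) * 2021-03-03 2023-11-03 万商云集(成都)科技股份有限公司 Cluster-based method for detecting anomalous user behavior
US11640388B2 (en) 2021-04-30 2023-05-02 International Business Machines Corporation Cognitive data outlier pre-check based on data lineage
CN114401136A (zh) * 2022-01-14 2022-04-26 天津大学 Fast anomaly detection method for multiple attributed networks
CN114401136B (zh) * 2022-01-14 2023-05-05 天津大学 Fast anomaly detection method for multiple attributed networks
CN115278687A (zh) * 2022-07-27 2022-11-01 联通(山东)产业互联网有限公司 Method for detecting phone number fraud based on a spatio-temporal network and graph algorithms
CN115278687B (zh) * 2022-07-27 2023-08-15 联通(山东)产业互联网有限公司 Method for detecting phone number fraud based on a spatio-temporal network and graph algorithms
CN115908574B (zh) * 2023-02-28 2023-05-09 深圳联和智慧科技有限公司 Method and system for locating and pushing river embankment encroachment based on UAV monitoring
CN115908574A (zh) * 2023-02-28 2023-04-04 深圳联和智慧科技有限公司 Method and system for locating and pushing river embankment encroachment based on UAV monitoring
CN116760583A (zh) * 2023-06-02 2023-09-15 四川大学 Enhanced graph node behavior representation and anomalous graph node detection method
CN116760583B (zh) * 2023-06-02 2024-02-13 四川大学 Enhanced graph node behavior representation and anomalous graph node detection method
CN117851959A (zh) * 2024-03-07 2024-04-09 中国人民解放军国防科技大学 FHGS-based method, apparatus, and device for dynamic network subgraph anomaly detection
CN117851959B (zh) * 2024-03-07 2024-05-28 中国人民解放军国防科技大学 FHGS-based method, apparatus, and device for dynamic network subgraph anomaly detection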

Also Published As

Publication number Publication date
CN109844749B (zh) 2023-06-20
CN109844749A (zh) 2019-06-04

Similar Documents

Publication Publication Date Title
WO2020042024A1 (zh) Node anomaly detection method, device, and storage device based on a graph algorithm
Zhu et al. Network latency estimation for personal devices: A matrix completion approach
CN108205570B (zh) Data detection method and apparatus
WO2011140293A2 (en) System and method for determining application dependency paths in a data center
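CN103838803A (zh) Social network community discovery method based on node Jaccard similarity
CN104360924A (zh) Method for classifying virtual machine monitoring levels in a cloud data center environment
CN104391879B (zh) Hierarchical clustering method and apparatus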
Mall et al. Representative subsets for big data learning using k-NN graphs
US20190250950A1 (en) Dynamically configurable operation information collection
KR20220143766A (ko) Dynamic discovery and correction of data quality problems
CN104835174B (zh) Robust model fitting method based on hypergraph mode search
Ren et al. Integrated defense for resilient graph matching
CN110309154B (zh) Graph-based entity feature selection method, apparatus, device, and storage medium
Zhang DBSCAN Clustering Algorithm Based on Big Data Is Applied in Network Information Security Detection
Hadi et al. Dynamic Evolving Cauchy Possibilistic Clustering Based on the Self-Similarity Principle (DECS) for Enhancing Intrusion Detection System.
CN105228185A (zh) Method for identifying the identity of fuzzy redundant nodes in a communication network
Wang et al. Incremental causal graph learning for online root cause analysis
Diao et al. Clustering by detecting density peaks and assigning points by similarity-first search based on weighted K-nearest neighbors graph
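CN111401412B (zh) Distributed soft clustering method in an IoT environment based on an average consensus algorithm
CN113515519A (zh) Training method, apparatus, device, and storage medium for a graph structure estimation model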
Ye et al. GCplace: geo-cloud based correlation aware data replica placement
Yuan et al. Research on the fusion method of spatial data and multimedia information of multimedia sensor networks in cloud computing environment
Rafailidis et al. Network completion via joint node clustering and similarity learning
CN115118525A (zh) IoT security protection system and protection method thereof
Zhao et al. Parallel algorithms for anomalous subgraph detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18932298

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18932298

Country of ref document: EP

Kind code of ref document: A1