CN109844749B - Node abnormality detection method and device based on graph algorithm and storage device - Google Patents


Info

Publication number
CN109844749B
Authority
CN
China
Prior art keywords
node
attribute
graph
nodes
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880002427.1A
Other languages
Chinese (zh)
Other versions
CN109844749A (en
Inventor
袁振南
朱鹏新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quliantong Network Co ltd
Original Assignee
Quliantong Network Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quliantong Network Co ltd filed Critical Quliantong Network Co ltd
Publication of CN109844749A publication Critical patent/CN109844749A/en
Application granted granted Critical
Publication of CN109844749B publication Critical patent/CN109844749B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The application discloses a graph-algorithm-based node anomaly detection method, device, and storage device. The method comprises: acquiring attribute features of each node in a network cluster within a predetermined time period, and establishing edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph; computing the attribute features with feature relation operators to obtain a feature vector for each attribute edge; calculating different metrics of each node to obtain a set of feature vectors for each node; training on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node; and calculating a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and determining from the abnormal offset value whether the node is abnormal. In this way, nodes exhibiting abnormal behavior can be detected quickly and efficiently.

Description

Node abnormality detection method and device based on graph algorithm and storage device
Technical Field
The present invention relates to the field of network communications technologies, and in particular, to a method and an apparatus for detecting node anomalies based on a graph algorithm, and a storage device.
Background
In an open network cluster, some malicious nodes may perform port scanning and sniffing, launch attacks, or send illegal or disguised requests to other nodes in the cluster. Such behavior degrades the overall performance of the cluster and can lead to large-scale data leakage, large-scale failures, and the risk of the system becoming unavailable. In the course of long-term research, the inventors of this application found that, because the access environment of nodes in an open cluster is complex and node behavior is dynamic and uncontrollable, detection techniques based on rule matching and supervised learning struggle to detect unknown abnormal behavior patterns effectively and in time.
Disclosure of Invention
The main technical problem addressed by this application is to provide a graph-algorithm-based node anomaly detection method, device, and storage device that can detect nodes with abnormal behavior quickly and efficiently.
To solve the above technical problem, one technical solution adopted by this application is to provide a graph-algorithm-based node anomaly detection method, comprising: acquiring attribute features of each node in a network cluster within a predetermined time period, and establishing edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph; computing the attribute features with feature relation operators to obtain a feature vector for each attribute edge; calculating different metrics of each node to obtain a set of feature vectors for each node; training on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node; and calculating a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and determining from the abnormal offset value whether the node is abnormal.
To solve the above technical problem, another technical solution adopted by this application is to provide a graph-algorithm-based node anomaly detection device comprising a processor. The processor is configured to acquire attribute features of each node in a network cluster within a predetermined time period, and to establish edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph; the processor is further configured to compute the attribute features with feature relation operators to obtain a feature vector for each attribute edge; the processor is further configured to calculate different metrics of each node to obtain a set of feature vectors for each node; the processor is further configured to treat the feature vectors of each node as different feature channels and to train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and the processor is further configured to calculate a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and to determine from the abnormal offset value whether the node is abnormal.
To solve the above technical problem, another technical solution adopted by this application is to provide a graph-algorithm-based node anomaly detection device, comprising: an acquisition module, configured to acquire attribute features of each node in a network cluster within a predetermined time period, and to establish edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph; a first calculation module, configured to compute the attribute features with feature relation operators to obtain a feature vector for each attribute edge; a second calculation module, configured to calculate different metrics of each node to obtain a set of feature vectors for each node; a training module, configured to treat the feature vectors of each node as different feature channels and to train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and an offset calculation module, configured to calculate a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and to determine from the abnormal offset value whether the node is abnormal.
To solve the above technical problem, a further technical solution adopted by this application is to provide an apparatus having a storage function, wherein the apparatus stores a program that, when executed, implements the above graph-algorithm-based node anomaly detection method.
The beneficial effects of this application are as follows: in contrast to the prior art, this application provides a graph-algorithm-based node anomaly detection method, device, and storage device.
Drawings
FIG. 1 is a schematic flow chart of a first embodiment of the graph-algorithm-based node anomaly detection method of this application;
FIG. 2 is a schematic flow chart of a second embodiment of the graph-algorithm-based node anomaly detection method of this application;
FIG. 3 is a schematic structural diagram of a first embodiment of the graph-algorithm-based node anomaly detection device of this application;
FIG. 4 is a schematic structural diagram of a second embodiment of the graph-algorithm-based node anomaly detection device of this application;
FIG. 5 is a schematic structural diagram of a first embodiment of the device with a storage function of this application.
Detailed Description
In order to make the objects, technical solutions and effects of the present application clearer and more specific, the present application will be further described in detail below with reference to the accompanying drawings and examples.
This application provides a graph-algorithm-based node anomaly detection method, device, and storage device. Features of different attributes and different granularities are organized into graph structures at different levels, that is, a multi-level graph structure; feature representations and abnormal values are extracted at each level separately; and the feature representations of the levels are then connected and trained jointly to obtain an overall feature representation and an overall abnormal value, so that nodes with abnormal behavior can be detected quickly and efficiently in every feature dimension.
Referring to FIG. 1, FIG. 1 is a schematic flow chart of a first embodiment of the graph-algorithm-based node anomaly detection method of this application. In this embodiment, the method comprises the following steps:
S101: Acquire attribute features of each node in the network cluster within a predetermined time period, and establish edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph.
This application detects node anomalies based on a graph algorithm. In algorithmic terms, a graph is a generalization of a tree. A tree is a top-down data structure in which every node except the root has a parent node, and the nodes are arranged from top to bottom. A graph has no concept of parent and child nodes; all nodes in a graph are equal. Graphs can be classified into undirected graphs (plain connections), directed graphs (connections with direction), weighted graphs (connections with weights), weighted directed graphs (connections with both direction and weight), and so on. This application uses an undirected graph for the related calculations. The attribute features of each node are acquired, and the collected feature data are organized into a graph structure according to the dependency or connection relationships implied by the related requests. Specifically, an edge is established wherever some similarity measure of the attribute features holds, and such an edge is called an attribute edge. Examples of similarity measures include: the attribute features of two nodes are equal; the distributions of their attribute features are similar; the IP addresses of the nodes are in the same IP segment; or there is a network connection or an action connection between the nodes (when there is an action connection, the same event occurs on both nodes and can be assigned the same value, so the two attribute features are equal). Because the attribute features of each node may differ and change over time, the resulting graph structure is dynamic.
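As a minimal sketch of how attribute edges might be built from similarity measures, the following Python example connects two nodes whenever they share an IP segment or a chip model. The node records, the field names `ip` and `chip`, the helper names, and the choice of the first three octets as the "IP segment" are all illustrative assumptions, not details taken from this application.

```python
from itertools import combinations

# Hypothetical node records; the field names are illustrative assumptions.
nodes = {
    "a": {"ip": "10.0.1.5", "chip": "x86-A"},
    "b": {"ip": "10.0.1.9", "chip": "x86-B"},
    "c": {"ip": "10.0.2.7", "chip": "x86-A"},
}


def ip_segment(ip):
    """Return the first three octets, used here as the assumed 'IP segment'."""
    return ip.rsplit(".", 1)[0]


def build_attribute_graph(nodes):
    """Map each undirected node pair to the list of similarity measures that hold."""
    edges = {}
    for u, v in combinations(sorted(nodes), 2):
        shared = []
        if ip_segment(nodes[u]["ip"]) == ip_segment(nodes[v]["ip"]):
            shared.append("same_ip_segment")
        if nodes[u]["chip"] == nodes[v]["chip"]:
            shared.append("same_chip")
        if shared:
            # frozenset keys make the edge undirected: {a, b} == {b, a}
            edges[frozenset((u, v))] = shared
    return edges


graph = build_attribute_graph(nodes)
```

Because each edge stores a list of matching measures, one edge can carry several attributes at once; rebuilding the dictionary for each time window would give the dynamic graph described above.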
S102: Compute the attribute features with feature relation operators to obtain the feature vector of each attribute edge.
The nodes are connected by attribute edges. Specifically, if an interaction exists between two nodes, the interaction can serve as a similarity measure of the attribute features for establishing an edge; likewise, if two nodes have the same or similar features, that similarity can also serve as the similarity measure for establishing an edge. In other words, the attribute edge connecting two nodes may carry multiple attributes.
In this method, the different attribute features of an edge need to be converted into a numerical representation (for example, a feature vector). Specifically, the attribute features of the edge can be computed with feature relation operators to obtain the feature vector of the attribute edge. An operator is a mapping from one function space to another; in a broad sense, any operation applied to a function can be regarded as an operator, for example exponentiation, root extraction, or taking a logarithm.
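The conversion of edge attributes into numbers can be illustrated with a short Python sketch. It applies three operators of the kind named elsewhere in this description (summation over time periods, equality, logarithm) to one hypothetical edge. The event data, the function names, and the use of `log1p` rather than a plain logarithm (to stay defined at zero) are assumptions made for illustration.

```python
import math


def sum_by_period(events, period):
    """Sum event counts into consecutive time buckets of `period` seconds."""
    buckets = {}
    for timestamp, count in events:
        key = timestamp // period
        buckets[key] = buckets.get(key, 0) + count
    return [buckets[k] for k in sorted(buckets)]


def equal(x, y):
    """Equality operator: 1.0 if the two attribute values match, else 0.0."""
    return 1.0 if x == y else 0.0


def log_scale(x):
    """Logarithm operator; log1p is an assumed variant that tolerates x == 0."""
    return math.log1p(x)


# Hypothetical (timestamp in seconds, request count) events from node a to node b.
requests_ab = [(0, 3), (20, 2), (65, 4), (130, 1)]

per_minute = sum_by_period(requests_ab, 60)
total = float(sum(per_minute))

# One possible feature vector for the attribute edge between a and b.
edge_vector = [total, equal("10.0.1", "10.0.1"), log_scale(total)]
```

Each entry of `edge_vector` comes from a different operator, so the vector length grows with the number of attribute and operator pairs chosen for the edge.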
S103: Calculate different metrics of each node to obtain a set of feature vectors for each node.
A node may be connected to multiple attribute edges. Different metrics of each node are calculated from the feature vectors of its attribute edges and serve as the base representation vectors of the node. That is, the attributes of the different nodes also need to be converted into a numerical representation.
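A minimal sketch of this aggregation step, assuming the simplest possible metrics: the node degree and the component-wise sum of the feature vectors of its incident edges. The function name, the data, and the choice of metrics are hypothetical; this application does not prescribe these particular metrics at this step.

```python
def node_base_vectors(edge_vectors):
    """Aggregate incident edge vectors into [degree, component-wise sums...]."""
    result = {}
    for (u, v), vec in edge_vectors.items():
        for node in (u, v):
            # First slot counts incident edges; the rest accumulate edge features.
            entry = result.setdefault(node, [0] + [0.0] * len(vec))
            entry[0] += 1
            for i, x in enumerate(vec):
                entry[i + 1] += x
    return result


# Hypothetical edge feature vectors keyed by node pair.
edge_vectors = {("a", "b"): [10.0, 1.0], ("a", "c"): [2.0, 0.0]}
base = node_base_vectors(edge_vectors)
```

Node `a` touches both edges, so its vector combines them, while `b` and `c` each inherit the features of a single edge.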
S104: Train on the feature vectors of each node with a predetermined training algorithm to obtain a set of feature representations for each node.
The simplest approach in deep learning exploits the properties of an artificial neural network (ANN), which is a layered system. Given a neural network, suppose its output is required to equal its input; training then adjusts its parameters to obtain the weights of each layer. This naturally yields several different representations of the input I (each layer is one representation), and these representations are the features. Deep learning can achieve very high recognition accuracy by using very deep networks.
S105: Calculate a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and determine from the abnormal offset value whether the node is abnormal.
An autoencoder is a neural network that reproduces its input signal as closely as possible, and can be understood as a system that attempts to restore its original input. Training minimizes the reconstruction error (defined as the mean squared error between the model output and the original input), so the deep network can be trained without labels (in practice, the input data itself serves as the supervisory signal).
Reconstruction refers to recovering the original data from the transformed data. Specifically, the input data is multiplied by a weight matrix to obtain a dimension-reduced result, and the dimension-reduced data is then multiplied by the transpose of that weight matrix to recover an approximation of the original input. In this process, the more similar the output layer is to the input layer, the better. If the similarity is poor, an offset has occurred; this yields the abnormal offset value, from which it is determined whether the node is abnormal.
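The reconstruction step described above (multiply by a weight matrix, then by its transpose, and measure the mean squared error) can be sketched in a few lines of Python. The weight matrix here is fixed by hand purely for illustration; in the described method the weights would come from training, and a deep autoencoder would also include nonlinear layers. All names are hypothetical.

```python
def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def reconstruction_error(weights, x):
    """Encode x with `weights`, decode with the transpose, return the mean squared error."""
    code = matvec(weights, x)                  # dimension-reduced representation
    x_hat = matvec(transpose(weights), code)   # approximate reconstruction
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Assumed, untrained weights: keep the first two of three input dimensions.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

typical = [1.0, 2.0, 0.0]   # lies in the subspace the encoder preserves
unusual = [1.0, 2.0, 3.0]   # has a component the encoder discards

err_typical = reconstruction_error(W, typical)
err_unusual = reconstruction_error(W, unusual)
```

An input that the encoder can represent reconstructs with zero error, while the discarded third component of `unusual` shows up directly as a larger offset.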
In one embodiment, the undirected graph is a multi-level graph structure in which the feature vectors of different levels are treated as different feature granularities. After the abnormal offset value of the set of feature vectors of each node is obtained, the method further includes: connecting and jointly training the encodings of all levels to obtain an overall encoding model, and calculating a reconstruction error with the overall encoding model to obtain the overall offset of each node. Specifically, the undirected graph includes a node set, an edge set, sub-graph structures, an overall graph structure, and so on. The edge set, the sub-graph structures, and the overall graph structure belong to different levels: the overall graph structure is at a higher level than the sub-graph structures, and the sub-graph structures are at a higher level than the edge set. That is, the graph structure is multi-level.
Specifically, the feature representations of different granularities are connected and trained to obtain an overall encoding, where the connection may be similar to a residual connection in a deep residual network. In this embodiment, features of different granularities are organized into graph structures at different levels; feature representations and abnormal values are extracted at each level separately; and the feature representations of the levels are then connected and trained jointly to obtain an overall feature representation and an overall abnormal value, so that nodes with abnormal behavior can be detected quickly and efficiently in every feature dimension.
In one embodiment, the overall offset is compared with a predetermined threshold, and if the overall offset is greater than the predetermined threshold, the node is determined to be abnormal. The predetermined threshold may be any value from 0.1 to 1.0 and is set according to the tolerance for node anomalies.
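The threshold comparison can be sketched as follows. The offsets and the default threshold of 0.5 are made-up values within the stated 0.1 to 1.0 range, and the function name is hypothetical.

```python
def flag_abnormal(overall_offsets, threshold=0.5):
    """Return the nodes whose overall offset is strictly greater than the threshold."""
    return sorted(node for node, offset in overall_offsets.items() if offset > threshold)


# Hypothetical overall offsets per node.
offsets = {"a": 0.12, "b": 0.87, "c": 0.5}
abnormal = flag_abnormal(offsets)
strict = flag_abnormal(offsets, threshold=0.1)
```

A lower threshold tolerates less deviation and therefore flags more nodes; a node whose offset equals the threshold is not flagged, matching the "greater than" wording above.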
In one embodiment, the network cluster includes a plurality of servers, each server is taken as a node, and acquiring the attribute features of each node in the network cluster within the predetermined time period includes: acquiring physical hardware fingerprint data, network environment data, node log running-state data, or inter-node interaction data of each server. The physical hardware fingerprint data covers, for example, whether servers have the same server version or chip model; the network environment data covers, for example, the IP segment in which a server is located; the node log running-state data covers, for example, the operating state of a node; and the inter-node interaction data covers, for example, network requests between nodes and task allocation between nodes. A multi-attribute dynamic undirected graph is then formed from these attribute features.
In one embodiment, feature relation operators are applied in the undirected graph of each level to convert the different attribute features of the edges into numerical representations. Examples of feature relation operators are: summing an attribute feature over time periods, testing attribute features for equality, or taking the logarithm of an attribute feature. The attribute edges are multi-attribute edges, and computing the attribute features with feature relation operators to obtain the feature vectors of the attribute edges includes: computing each attribute feature of an attribute edge under its respective feature relation operator, and combining the computation results with the attribute features to form the feature vector of the attribute edge.
In one embodiment, a graph-related metric algorithm is used to calculate the different metrics of each node. For example, various graph-related node metrics may be used, such as edge weights, sub-graph structure metrics such as egonet, and overall graph structure metrics such as community membership, and these serve as the base representation vectors of each node.
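As one example of a sub-graph structure metric, the sketch below computes two simple egonet statistics for a node: the number of nodes in its egonet (the node plus its direct neighbors) and the number of edges among them. The adjacency data and the function name are hypothetical, and which egonet statistics the method actually uses is not specified in this description.

```python
def egonet_stats(adjacency, node):
    """Return (node count, edge count) of the egonet of `node`."""
    members = {node} | adjacency[node]
    # Each edge inside the egonet is seen from both endpoints, hence the halving.
    edge_count = sum(len(adjacency[u] & members) for u in members) // 2
    return len(members), edge_count


# Hypothetical undirected graph as adjacency sets: a triangle a-b-c plus a pendant node d.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

stats_a = egonet_stats(adjacency, "a")
stats_c = egonet_stats(adjacency, "c")
stats_d = egonet_stats(adjacency, "d")
```

Nodes whose egonet is unusually dense or unusually sparse relative to its size stand out in such statistics, which is why they are plausible inputs to the base representation vector.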
In one embodiment, training on the feature vectors of each node with a predetermined training algorithm includes: training on the feature vectors of each node with a deep graph node embedding (Deep Graph Embedding) training algorithm to obtain a set of feature representations for each node.
In deep learning, the models most often used for reconstruction are the autoencoder (Autoencoder) and the restricted Boltzmann machine (RBM). Both are trained by minimizing a reconstruction error: the former uses value-based reconstruction error minimization, while the latter uses distribution-based reconstruction error minimization. In one embodiment, the reconstruction error is calculated with a deep autoencoder model to obtain the abnormal offset value of the set of feature vectors of each node.
Referring to FIG. 2, FIG. 2 is a schematic flow chart of a second embodiment of the graph-algorithm-based node anomaly detection method of this application. In this embodiment, the method uses a multi-attribute, multi-level dynamic graph algorithm for node anomaly detection. First, attribute features are acquired. The feature data are then organized into a graph structure according to the dependency or connection relationships implied by the related requests. The graph structure is next divided into corresponding sub-graph structures according to node attributes or similar connection properties (for example, with a matrix decomposition algorithm). Finally, various statistical metrics of each node (for example, its k-core number) are calculated from the node's feature attributes, the sub-graph structure to which it belongs, and the original overall graph structure; the node's metrics are compared with those of its neighbor nodes, of the other nodes in its sub-graph structure, and of the other nodes in the overall graph structure; and an offset is calculated to obtain a measure of the abnormal value.
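To make the last step concrete, the following sketch computes k-core numbers with the standard peeling procedure (repeatedly remove a node of minimum remaining degree) and then measures how far a node's metric lies from the mean of its neighbors' metrics. The graph, the function names, and the use of an absolute difference from the neighbor mean as the offset are illustrative assumptions; the description does not fix a particular offset formula.

```python
def core_numbers(adjacency):
    """Compute the k-core number of every node by repeated minimum-degree removal."""
    degree = {n: len(nbrs) for n, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Ties are broken by node name so the result is deterministic.
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def neighbor_offset(metric, adjacency, node):
    """Absolute difference between a node's metric and its neighbors' mean (needs >= 1 neighbor)."""
    nbrs = adjacency[node]
    mean = sum(metric[n] for n in nbrs) / len(nbrs)
    return abs(metric[node] - mean)


# Hypothetical undirected graph: a triangle a-b-c plus a pendant node d attached to c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

core = core_numbers(adjacency)
offset_a = neighbor_offset(core, adjacency, "a")
offset_d = neighbor_offset(core, adjacency, "d")
```

The same comparison could be repeated against the other nodes of the sub-graph and of the whole graph to obtain offsets at each level. This simple implementation scans all remaining nodes at every step, which is fine for a sketch but quadratic in the number of nodes.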
In one application scenario, nodes a and b and the attribute edge connecting them are taken as an example.
S201: Acquire the attribute features of each node, and form a graph structure according to the connection properties.
The attribute features of each node are acquired separately at each level. For example, if node a initiates a network request to node b, then a and b are taken as nodes and the network request action as an attribute, establishing nodes a and b and the attribute edge $e_{ab}$ in the attribute graph. An attribute edge may be multi-attribute, carrying several attribute features, such as a task allocation action between nodes a and b. When there are more nodes and more attribute edges, they are connected according to their connection relationships to form the graph structure. FIG. 2 shows a flow with two levels (S201 to S204 for level 1 and S201' to S204' for level 2); other embodiments are not limited to two levels and may use any number of levels.
S202: Extract the edge attribute graph features of the nodes.
Feature relation operators are applied in the undirected graph of each level to convert the different attribute features of the edges into numerical representations. The feature relation operators may be summation over time periods, equality, logarithm, and so on. Taking the action attribute edge for requests from server node a to node b as an example, the results of applying the respective feature relation operators to the network request action, the task allocation action, and so on between nodes a and b together form the feature vector representation $(v_1, v_2, \ldots, v_n)$ of the attribute edge.
S203: Calculate the feature attributes of the nodes and their related statistical metrics.
In each level, various statistical metrics of a node are calculated from the node's feature attributes, the sub-graph structure to which it belongs, and the original overall graph structure. Specifically, for the nodes of each level, various graph-related metrics, such as edge weights, sub-graph structure metrics such as egonet, and overall graph structure metrics such as community membership, serve as the base representation vectors of each node. Taking attribute edge $e_{ab}$ as an example, different metrics of the nodes are calculated from the feature vector $(v_1, v_2, \ldots, v_n)$ of attribute edge $e_{ab}$ to obtain a set of feature vectors of node a (or node b)
Figure BDA0001902766820000081
That is, one node corresponds to a set of multiple feature vectors.
S204: training the node representation to obtain an abnormal offset value of the node characteristic vector.
And representing different feature vectors of the graph nodes of each level as different feature channels respectively, and using the different feature vectors as a training algorithm for embedding (Deep Graph Embedding) the depth graph nodes for training. Such as by feature vectors
Figure BDA0001902766820000082
Training for feature channels to obtain a feature representation of the node>
Figure BDA0001902766820000083
Training other feature vectors to obtain a group of feature representations of a node
Figure BDA0001902766820000084
Then the depth self-coding model (Deep AutoEncoder) is used to calculate the reconstructed error as the offset of the feature representation
Figure BDA0001902766820000085
I.e. the outlier offset of the set of feature vectors.
S205: Connect and jointly train the encodings of all levels to obtain an overall feature representation and offset value.
The feature vector representations of the levels are regarded as different feature granularities, and the encodings of the levels are connected and trained into one overall encoding model. For example, the offset of the first level
Figure BDA0001902766820000086
and the offset of the second level
Figure BDA0001902766820000087
are connected, and the reconstruction error obtained by the overall training is taken as the overall offset.
The overall offset is compared with a predetermined threshold; if the overall offset is greater than the predetermined threshold, the node is determined to be abnormal.
In this scheme, features of different granularities are organized into graph structures at different levels; feature representations and abnormal values are extracted at each level separately; and the feature representations of the levels are then connected and trained jointly to obtain an overall feature representation and an overall abnormal value. Nodes with abnormal behavior can thus be detected quickly and efficiently in every feature dimension, which safeguards the performance and security of the cluster.
Based on the above method, this application further provides a graph-algorithm-based node anomaly detection device. Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a first embodiment of the graph-algorithm-based node anomaly detection device of this application. In this embodiment, the node anomaly detection device 30 includes a processor 301. The processor 301 is configured to acquire attribute features of each node in the network cluster within a predetermined time period, and to establish edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph; the processor 301 is further configured to compute the attribute features with feature relation operators to obtain the feature vector of each attribute edge; the processor 301 is further configured to calculate different metrics of each node to obtain a set of feature vectors for each node; the processor 301 is further configured to treat the feature vectors of each node as different feature channels and to train on them with a predetermined training algorithm to obtain a set of feature representations for each node; and the processor 301 is further configured to calculate a reconstruction error with a predetermined autoencoder model to obtain an abnormal offset value for the set of feature vectors of each node, and to determine from the abnormal offset value whether the node is abnormal.
In one embodiment, the undirected graph is a multi-level graph structure, and the feature vectors of different levels are used as different feature granularities, and the processor 301 is further configured to perform connection training on the codes of each level to obtain an overall coding model, and calculate the reconstruction error by using the overall coding model to obtain the overall offset of each node.
In one embodiment, the processor 301 is further configured to compare the overall offset with a predetermined threshold, and to determine that the node is abnormal if the overall offset is greater than the predetermined threshold.
The node anomaly detection device 30 can be used to execute the graph-algorithm-based node anomaly detection method described above and has the corresponding beneficial effects. The detailed process is described in the above embodiments and is not repeated here. The device may be a stand-alone device independent of the server, or may be a module or a processing unit in the server.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a second embodiment of the graph-algorithm-based node anomaly detection device of this application. In this embodiment, the node anomaly detection device 40 is a module in a server, and specifically includes an acquisition module 401, a first calculation module 402, a second calculation module 403, a training module 404, and an offset calculation module 405.
The acquisition module 401 is configured to acquire attribute features of each node in the network cluster within a predetermined time period, and to establish edges according to a similarity measure of the attribute features so that the nodes are connected into an undirected graph.
The first calculation module 402 is configured to calculate the attribute feature by using the feature relation operator, so as to obtain a feature vector of the attribute edge.
The second calculation module 403 is configured to calculate different metrics of each node, so as to obtain a set of feature vectors of each node.
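As an illustration of per-node metrics, the sketch below computes one metric from each family named in the claims: a weighted attribute-edge metric (weighted degree), a sub-graph structure metric (local clustering coefficient), and an overall graph structure metric (degree centrality). The choice of these three particular metrics, and the sample graph, are assumptions; the patent does not fix them.

```python
from itertools import combinations

def node_metrics(nodes, weighted_edges):
    """Return {node: [weighted_degree, clustering, degree_centrality]}."""
    adj = {n: {} for n in nodes}
    for (u, v), w in weighted_edges.items():
        adj[u][v] = w
        adj[v][u] = w  # undirected graph: store both directions

    n_nodes = len(nodes)
    vectors = {}
    for node in nodes:
        nbrs = adj[node]
        degree = len(nbrs)
        weighted_degree = sum(nbrs.values())
        # Count edges among the neighbours (closed triangles through this node).
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        possible = degree * (degree - 1) / 2
        clustering = links / possible if possible else 0.0
        centrality = degree / (n_nodes - 1) if n_nodes > 1 else 0.0
        vectors[node] = [weighted_degree, clustering, centrality]
    return vectors

# Hypothetical weighted undirected graph over four nodes.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("a", "c"): 0.5, ("b", "c"): 0.5, ("c", "d"): 2.0}
vectors = node_metrics(nodes, edges)
for name, vec in vectors.items():
    print(name, vec)
```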
The training module 404 is configured to use the feature vectors of each node as different feature channels and train them with a predetermined training algorithm to obtain a set of feature representations for each node.
The offset calculation module 405 is configured to calculate a reconstruction error by using a predetermined self-coding model, obtain an abnormal offset value of a set of feature vectors of each node, and determine whether the node has an abnormality according to the abnormal offset value.
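The reconstruction-error idea can be sketched with a deliberately small stand-in for the predetermined self-coding model. The code below is not the patent's model: it assumes a linear autoencoder with a single hidden unit, whose squared-error optimum is the first principal direction of the data, and finds that direction by power iteration. The node vectors and the threshold of 0.9 are hypothetical.

```python
import math

def fit_linear_autoencoder(rows, iters=200):
    """Stand-in model: the mean plus the first principal direction of the rows."""
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    w = [1.0 / math.sqrt(dim)] * dim  # unit-length starting direction
    for _ in range(iters):
        # Power iteration step: w <- X^T (X w), then renormalise.
        proj = [sum(c[j] * w[j] for j in range(dim)) for c in centered]
        w_new = [sum(p * c[j] for p, c in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w_new))
        if norm == 0.0:
            break
        w = [x / norm for x in w_new]
    return mean, w

def reconstruction_error(row, mean, w):
    """Encode to one number, decode, and return the Euclidean reconstruction error."""
    centered = [x - m for x, m in zip(row, mean)]
    code = sum(c * wi for c, wi in zip(centered, w))
    return math.sqrt(sum((c - code * wi) ** 2 for c, wi in zip(centered, w)))

# Hypothetical node feature vectors; n5 departs from the common pattern.
node_vectors = {
    "n1": [1.0, 1.0], "n2": [2.0, 2.0], "n3": [3.0, 3.0],
    "n4": [4.0, 4.0], "n5": [2.5, 0.0],
}
mean, w = fit_linear_autoencoder(list(node_vectors.values()))
offsets = {n: reconstruction_error(v, mean, w) for n, v in node_vectors.items()}
abnormal = [n for n, off in offsets.items() if off > 0.9]  # assumed threshold
print(abnormal)
```

A deep self-coding model would replace `fit_linear_autoencoder` with a trained nonlinear encoder and decoder; the offset would still be the distance between a node's vector and its reconstruction.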
In one embodiment, the undirected graph is a multi-level graph structure, and feature vectors of different levels are used as different feature granularities. The node anomaly detection device further includes an overall offset calculation module, configured to perform connection training on the codes of all levels to obtain an overall coding model, and to calculate a reconstruction error using the overall coding model to obtain the overall offset of each node.
In one embodiment, the node anomaly detection device further includes a comparison module, configured to compare the overall offset with a preset threshold and determine that the node is abnormal if the overall offset is greater than the preset threshold. The node anomaly detection device 40 can be used to execute the node anomaly detection method based on the graph algorithm described above and has the corresponding beneficial effects; the specific process is described in the above embodiments and is not repeated here.
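The multi-level step and the threshold comparison can be illustrated roughly as follows. The sketch assumes that the codes of each level are joined by concatenation, and it replaces the overall coding model with a trivial stand-in that reconstructs every node as the population mean, so the overall offset is simply the distance to that mean. The level names, code values and threshold are hypothetical.

```python
import math

def concatenate_levels(level_codes):
    """Join each node's per-level codes into one vector (levels in dict order)."""
    nodes = sorted(next(iter(level_codes.values())))
    return {n: [x for level in level_codes.values() for x in level[n]]
            for n in nodes}

def overall_offsets(joint):
    """Stand-in overall model: reconstruct every node as the mean joint vector."""
    rows = list(joint.values())
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return {n: math.sqrt(sum((x - m) ** 2 for x, m in zip(v, mean)))
            for n, v in joint.items()}

def flag_abnormal(offsets, threshold):
    """A node is judged abnormal when its overall offset exceeds the threshold."""
    return [n for n, off in offsets.items() if off > threshold]

# Hypothetical codes produced at two levels of the graph structure.
level_codes = {
    "hardware": {"n1": [0.0], "n2": [0.0], "n3": [3.0]},
    "network": {"n1": [1.0, 1.0], "n2": [1.0, 1.0], "n3": [1.0, 5.0]},
}
joint = concatenate_levels(level_codes)
offsets = overall_offsets(joint)
print(flag_abnormal(offsets, threshold=2.0))  # ['n3']
```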
Referring to fig. 5, fig. 5 is a schematic structural diagram of a first embodiment of the device having a storage function. In this embodiment, the storage device 50 stores a program 501, and the program 501, when executed, implements the node abnormality detection method based on the graph algorithm described above. The working process is the same as in the above method embodiments and is not repeated here; refer to the description of the corresponding method steps above. The device having a storage function may be a portable storage medium such as a USB flash drive, an optical disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), or a magnetic disk, or may be a terminal, a server, or the like.
The above scheme provides a node abnormality detection method and device based on a graph algorithm, and a storage device. Features of different granularities form graph structures of different levels; a feature representation and an abnormal value are extracted at each level; and the feature representations of all levels are connected to train an overall feature representation and an overall abnormal value. In this way, nodes with abnormal behavior can be detected quickly and efficiently in each feature dimension, which helps ensure the performance and security of the cluster.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application.
The foregoing describes only embodiments of the present application and is not intended to limit the scope of the patent application. Any equivalent structure or equivalent process made using the description and contents of the present application, or applied in other related technical fields, is likewise included in the scope of the patent application.

Claims (14)

1. A graph algorithm-based node anomaly detection method, wherein the method comprises the steps of:
acquiring attribute characteristics of network cluster nodes in a preset time period, and establishing edge connection by using similarity measurement of the attribute characteristics to form an undirected graph;
calculating the attribute characteristics by utilizing a characteristic relation operator to obtain a characteristic vector of an attribute edge;
calculating the statistical measure of the node to obtain the feature vector of the node; the statistical measure of the node comprises a characteristic vector of the attribute edge;
training the feature vector of the node by utilizing a deep graph node embedding training algorithm to obtain the feature representation of the node;
and calculating a reconstruction error by using a preset self-coding model to obtain an abnormal offset value of the node characteristic vector, and judging whether the node has an abnormality or not according to the abnormal offset value.
2. The method for detecting node anomaly based on graph algorithm according to claim 1, wherein the undirected graph is a multi-level graph structure, feature vectors of different levels are used as different feature granularities, and the obtaining of the anomaly offset value of the node feature vector further comprises:
and performing connection training on codes of each level to obtain an overall coding model, and calculating a reconstruction error by using the overall coding model to obtain the overall offset of the node.
3. The graph algorithm-based node anomaly detection method according to claim 2, wherein the obtaining the overall offset of the node further comprises:
and comparing the overall offset with a preset threshold, and if the overall offset is larger than the preset threshold, judging that the node is abnormal.
4. The graph algorithm-based node anomaly detection method of claim 1, wherein the attribute edge is a multi-attribute edge, the calculating the attribute feature using a feature relation operator, obtaining a feature vector of the attribute edge comprises:
and respectively calculating different attribute characteristics of the attribute edges under respective characteristic relation operators, and forming a calculation result and the attribute characteristics into characteristic vectors of the attribute edges.
5. The graph algorithm-based node anomaly detection method of claim 1, wherein the feature relation operator comprises: summing the attribute features by time segment, evaluating equality of the attribute features, or taking the logarithm of the attribute features.
6. The graph algorithm-based node anomaly detection method according to claim 1, wherein the network cluster includes a plurality of servers, each server is used as a node, and the acquiring the attribute characteristics of the network cluster node in the predetermined period of time includes:
and acquiring physical hardware fingerprint data, network environment data, node log running state data or interaction action data among nodes of the server.
7. The graph algorithm-based node anomaly detection method of claim 1, wherein the calculating the statistical metric of the node, obtaining the feature vector of the node, comprises:
and calculating the statistical measure of the node by using a graph correlation measure algorithm to obtain the feature vector of the node.
8. The graph-algorithm-based node anomaly detection method of claim 7, wherein the graph-correlation metric algorithm comprises: a weighted metric of attribute edges, a sub-graph structure metric, or an overall graph structure metric.
9. The graph algorithm-based node anomaly detection method of claim 1, wherein the calculating a reconstruction error using a predetermined self-coding model comprises:
and calculating a reconstruction error by using a deep self-coding model to obtain an abnormal offset value of the node characteristic vector.
10. A graph algorithm-based node anomaly detection device, wherein the device comprises a processor, and the processor is used for acquiring attribute characteristics of network cluster nodes in a preset time period, and establishing edge connection by using similarity measurement of the attribute characteristics to form an undirected graph;
the processor is also used for calculating the attribute characteristics by utilizing a characteristic relation operator to obtain characteristic vectors of attribute edges;
the processor is further configured to calculate a statistical metric of the node, to obtain a feature vector of the node, where the statistical metric of the node includes a metric of the feature vector of the attribute edge;
the processor is further used for respectively taking the feature vectors of the nodes as different feature channels, and training the feature vectors of the nodes by utilizing a deep graph node embedding training algorithm to obtain the feature representation of the nodes;
the processor is further used for calculating a reconstruction error by using a preset self-coding model to obtain an abnormal offset value of the node characteristic vector, and judging whether the node is abnormal or not according to the abnormal offset value.
11. The node anomaly detection device based on graph algorithm of claim 10, wherein the undirected graph is a multi-level graph structure, feature vectors of different levels are used as different feature granularities, the processor is further used for performing connection training on codes of all levels to obtain an overall coding model, and the overall coding model is used for calculating reconstruction errors to obtain the overall offset of the node.
12. The graph algorithm-based node anomaly detection apparatus of claim 11, wherein the processor is further configured to compare the overall offset to a preset threshold, and determine that the node is anomalous if the overall offset is greater than the preset threshold.
13. A graph algorithm-based node anomaly detection apparatus, wherein the apparatus comprises:
the acquisition module is used for acquiring attribute characteristics of network cluster nodes in a preset time period, and establishing edge connection by using similarity measurement of the attribute characteristics to form an undirected graph;
the first calculation module is used for calculating the attribute characteristics by utilizing a characteristic relation operator to obtain the characteristic vectors of the attribute edges;
the second calculation module is used for calculating the statistical measure of the node to obtain the characteristic vector of the node;
the training module is used for respectively taking the characteristic vectors of the nodes as different characteristic channels, and training the characteristic vectors of the nodes by utilizing a deep graph node embedding training algorithm to obtain the characteristic representation of the nodes;
and the offset calculation module is used for calculating a reconstruction error by utilizing a preset self-coding model to obtain an abnormal offset value of the node characteristic vector.
14. An apparatus having a storage function, wherein the apparatus stores a program which, when executed, implements the graph algorithm-based node abnormality detection method according to any one of claims 1 to 9.
CN201880002427.1A 2018-08-29 2018-08-29 Node abnormality detection method and device based on graph algorithm and storage device Active CN109844749B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/103052 WO2020042024A1 (en) 2018-08-29 2018-08-29 Node abnormality detection method and device based on graph algorithm and storage device

Publications (2)

Publication Number Publication Date
CN109844749A (en) 2019-06-04
CN109844749B (en) 2023-06-20

Family

ID=66883766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880002427.1A Active CN109844749B (en) 2018-08-29 2018-08-29 Node abnormality detection method and device based on graph algorithm and storage device

Country Status (2)

Country Link
CN (1) CN109844749B (en)
WO (1) WO2020042024A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473083B (en) * 2019-07-08 2023-07-14 创新先进技术有限公司 Tree risk account identification method, device, server and storage medium
CN110826914A (en) * 2019-11-07 2020-02-21 陕西师范大学 Learning group grouping method based on difference
CN110933105B (en) * 2019-12-13 2021-10-22 中国电子科技网络信息安全有限公司 Web attack detection method, system, medium and equipment
CN111107107B (en) * 2019-12-31 2022-03-29 奇安信科技集团股份有限公司 Network behavior detection method and device, computer equipment and storage medium
CN111612300B (en) * 2020-04-16 2023-10-27 国网甘肃省电力公司信息通信公司 Scene anomaly perception index calculation method and system based on depth hybrid cloud model
CN111770047B (en) * 2020-05-07 2022-09-23 拉扎斯网络科技(上海)有限公司 Abnormal group detection method, device and equipment
CN111885000B (en) * 2020-06-22 2022-06-21 网宿科技股份有限公司 Network attack detection method, system and device based on graph neural network
CN111953535B (en) * 2020-07-31 2023-06-09 鹏城实验室 Network fault positioning method, terminal and storage medium
CN112202630A (en) * 2020-09-16 2021-01-08 中盈优创资讯科技有限公司 Network quality abnormity detection method and device based on unsupervised model
US20220116782A1 (en) * 2020-10-08 2022-04-14 Qatar Foundation For Education, Science And Community Development Compromised mobile device detection system and method
CN112837078B (en) * 2021-03-03 2023-11-03 万商云集(成都)科技股份有限公司 Method for detecting abnormal behavior of user based on clusters
CN113190790B (en) * 2021-03-30 2023-05-30 桂林电子科技大学 Time-varying graph signal reconstruction method based on multiple shift operators
US11640388B2 (en) 2021-04-30 2023-05-02 International Business Machines Corporation Cognitive data outlier pre-check based on data lineage
CN114445639A (en) * 2022-01-06 2022-05-06 深圳市检验检疫科学研究院 Dual self-attention-based dynamic graph anomaly detection method
CN114401136B (en) * 2022-01-14 2023-05-05 天津大学 Rapid anomaly detection method for multiple attribute networks
WO2023178467A1 (en) * 2022-03-21 2023-09-28 Qualcomm Incorporated Energy-efficient anomaly detection and inference on embedded systems
CN115278687B (en) * 2022-07-27 2023-08-15 联通(山东)产业互联网有限公司 Telephone number fraud detection method based on space-time network and graph algorithm
CN115908574B (en) * 2023-02-28 2023-05-09 深圳联和智慧科技有限公司 River dike encroaching, positioning and pushing method and system based on unmanned aerial vehicle monitoring
CN116760583B (en) * 2023-06-02 2024-02-13 四川大学 Enhanced graph node behavior characterization and abnormal graph node detection method

Citations (5)

Publication number Priority date Publication date Assignee Title
CN103713628A (en) * 2013-12-31 2014-04-09 上海交通大学 Fault diagnosis method based on signed directed graph and data constitution
CN103888304A (en) * 2012-12-19 2014-06-25 华为技术有限公司 Abnormity detection method of multi-node application and related apparatus
CN106254175A (en) * 2016-07-26 2016-12-21 北京蓝海讯通科技股份有限公司 A kind of cluster detection of anomaly node method, apply and calculating equipment
CN107340456A (en) * 2017-05-25 2017-11-10 国家电网公司 Power distribution network operating mode intelligent identification Method based on multiple features analysis
CN108345901A (en) * 2018-01-17 2018-07-31 同济大学 A kind of graphical diagram node-classification method based on own coding neural network

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN102158372B (en) * 2011-04-14 2013-06-05 哈尔滨工程大学 Distributed system abnormity detection method
JP6545728B2 (en) * 2017-01-11 2019-07-17 株式会社東芝 ABNORMALITY DETECTING APPARATUS, ABNORMALITY DETECTING METHOD, AND ABNORMALITY DETECTING PROGRAM
CN107786388B (en) * 2017-09-26 2020-02-14 西安交通大学 Anomaly detection system based on large-scale network flow data

Also Published As

Publication number Publication date
WO2020042024A1 (en) 2020-03-05
CN109844749A (en) 2019-06-04

Similar Documents

Publication Publication Date Title
CN109844749B (en) Node abnormality detection method and device based on graph algorithm and storage device
JP7010641B2 (en) Abnormality diagnosis method and abnormality diagnosis device
CN109145516B (en) Analog circuit fault identification method based on improved extreme learning machine
CN110263538A (en) A kind of malicious code detecting method based on system action sequence
CN109040027B (en) Active prediction method of network vulnerability node based on gray model
CN112052404B (en) Group discovery method, system, equipment and medium of multi-source heterogeneous relation network
CN111107072A (en) Authentication graph embedding-based abnormal login behavior detection method and system
CN107528734A (en) A kind of abnormal host group's detection method based on Dynamic Graph
CN111767472A (en) Method and system for detecting abnormal account of social network
Manganiello et al. Multistep attack detection and alert correlation in intrusion detection systems
CN115168443A (en) Anomaly detection method and system based on GCN-LSTM and attention mechanism
CN105228185A (en) A kind of method for Fuzzy Redundancy node identities in identification communication network
Han et al. Accurate differentially private deep learning on the edge
Wang et al. Incremental causal graph learning for online root cause analysis
CN116628554B (en) Industrial Internet data anomaly detection method, system and equipment
CN111401412B (en) Distributed soft clustering method based on average consensus algorithm in Internet of things environment
Enikeeva et al. Change-point detection in dynamic networks with missing links
CN105721467A (en) Social network Sybil group detection method
CN113886765B (en) Method and device for detecting error data injection attack
WO2018142694A1 (en) Feature amount generation device, feature amount generation method, and program
CN111209567B (en) Method and device for judging perceptibility of improving robustness of detection model
CN114968750A (en) Test case generation method, device, equipment and medium based on artificial intelligence
Gupta et al. Comparative Analysis of Supervised Learning Techniques of Machine Learning for Software Defect Prediction
JP7325557B2 (en) Abnormality diagnosis method and abnormality diagnosis device
Lu et al. Using hessian locally linear embedding for autonomic failure prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant