CN116760583A - Enhanced graph node behavior characterization and abnormal graph node detection method
- Publication number
- CN116760583A (application CN202310652286.0A)
- Authority
- CN
- China
- Prior art keywords
- graph
- node
- graph node
- abnormal
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
Abstract
The invention discloses an enhanced graph node behavior characterization and abnormal graph node detection method, relating to the technical field of network security. The method comprises the following steps: constructing and training an abnormal graph node detection model based on graph node behavior characterization to obtain a trained abnormal graph node detection model; inputting the attribute list of all nodes of the graph structure and the adjacency matrix representing the graph structure into the trained model to obtain the anomaly score of each node to be detected in the graph; if the anomaly score of a node to be detected is greater than a threshold, judging the node to be an abnormal graph node; otherwise, judging the node to be a normal graph node. Through double-random node behavior expression enhancement, the method enriches the feature expression of graph node behavior, achieves a robust and effective representation of graph node behavior, and improves the ability of the feature extraction network to characterize graph node behavior. It also makes full use of the differences between normal and abnormal graph nodes, ensuring an excellent anomaly detection effect.
Description
Technical Field
The invention relates to the technical field of network security, and in particular to an enhanced graph node behavior characterization and abnormal graph node detection method.
Background
Abnormal graph node detection on attribute graphs is an important research topic in the field of network security. Graph-structured data is widespread in Internet of Things (IoT) systems, where the abnormal graph nodes of a graph correspond to hosts exhibiting abnormal behavior. Abnormal graph nodes in an attribute graph can be detected directly from the attribute features of the nodes, or by extracting deeper features from the associations between a node and the other graph nodes. Because the nodes of an attribute graph usually have high-dimensional attributes and complex intrinsic behavior patterns, a machine learning model is usually required in practice to perform abnormal graph node detection, so that abnormal behavior is discovered and handled in time to reduce or avoid losses. Existing systems are typically built on either supervised or unsupervised methods. Systems based on supervised methods generally require many anomaly labels. In practical scenarios only a small number of labeled abnormal graph nodes are available, so the model easily overfits them during learning, system performance drops sharply, and the detection effect is unsatisfactory. Systems based on unsupervised methods learn only from normal graph nodes and detect abnormal graph nodes from the differences between the features of the nodes under test and those of normal graph nodes. They do not make full use of the known labeled abnormal samples and, lacking the corresponding label information, perform poorly on real anomaly detection datasets.
Disclosure of Invention
To address the above shortcomings of the prior art, the enhanced graph node behavior characterization and abnormal graph node detection method provided by the invention solves the problems that the prior art does not make full use of a very small amount of labeled abnormal graph node data together with a large amount of unlabeled graph node data, and that the resulting detection effect is unsatisfactory.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
The enhanced graph node behavior characterization and abnormal graph node detection method comprises the following steps:
S1, constructing and training an abnormal graph node detection model based on graph node behavior characterization to obtain the trained abnormal graph node detection model;
S2, inputting the attribute list of all nodes of the graph structure and the adjacency matrix representing the graph structure into the trained abnormal graph node detection model to obtain the anomaly score of each node to be detected in the graph;
S3, if the anomaly score of a node to be detected in the graph is greater than a threshold, judging the node to be an abnormal graph node; otherwise, judging the node to be a normal graph node.
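The decision rule of step S3 can be sketched in a few lines. This is only an illustration: the function name `classify_nodes`, the example scores and the threshold value 1.0 are hypothetical, since the text does not specify how the threshold is chosen.

```python
def classify_nodes(scores, threshold):
    """Step S3: a node is abnormal only if its score is strictly greater than the threshold."""
    return ["abnormal" if s > threshold else "normal" for s in scores]


# Hypothetical anomaly scores for four nodes and a hypothetical threshold.
scores = [0.12, 3.4, 0.9, 5.1]
labels = classify_nodes(scores, threshold=1.0)
```

A score exactly equal to the threshold is treated as normal, matching the "greater than" wording of step S3.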
Further, the abnormal graph node detection model in step S1 comprises a feature extraction network and an anomaly score calculation network. The feature extraction network comprises a node behavior expression enhancement module, a feature information and position information precoding module, and a graph-autoencoder-based feature extraction module. The node behavior expression enhancement module comprises a random node selection operator, a random attribute selection operator and a perturbation adding operator. The feature information and position information precoding module comprises a feature information precoder and a position information precoder, each of which comprises a fully connected neural network. The graph-autoencoder-based feature extraction module comprises a graph-convolution-based encoder and a graph-convolution-based decoder, each of which comprises a multi-layer graph convolution structure; each graph convolution structure comprises a fully connected layer and a matrix multiplier. The anomaly score calculation network comprises a fully connected neural network with an input layer, an output layer and a plurality of hidden layers.
Further, the specific operations for training the abnormal graph node detection model in step S1 are as follows:
S1-1, taking the attribute list X of all nodes in the graph structure and the adjacency matrix A formed among the graph nodes as training data, and inputting the training data into the node behavior expression enhancement module; randomly sampling a% of the normal graph nodes in the attribute list X, wherein the selection probability of each graph node follows a uniform distribution; for each selected graph node, randomly sampling b% of its attributes, wherein the selection probability of each attribute follows a uniform distribution; computing the mean of each selected attribute over all input normal unlabeled graph nodes, adding to the corresponding attribute value of the selected graph node a random perturbation that is based on this statistical mean and follows a normal distribution with mean μ and standard deviation σ, and creating an indication vector whose dimension equals the attribute dimension of the graph node, thereby obtaining the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V; wherein a defaults to 20, b defaults to 20, μ defaults to 0, and σ defaults to 0.1;
S1-2, inputting the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V into the feature information and position information precoding module, and computing each of them by forward propagation to obtain the corresponding feature information precoding $H_{feat}$ and position information precoding $H_{pos}$; according to the formula:
$$H_0 = \mathrm{Concat}(H_{feat}, H_{pos})$$
obtaining the spliced precoding result $H_0$ output by the feature information and position information precoding module; wherein Concat(·) represents the vector splicing operator;
S1-3, inputting the spliced precoding result $H_0$ and the adjacency matrix A formed among the graph nodes into the graph-autoencoder-based feature extraction module for reconstruction, to obtain the reconstructed graph node attribute list $\hat{X}$, the feature vector H of the graph nodes in the hidden space, the reconstruction error vector R, and the feature code r obtained by splicing the L1 norm and the L2 norm of the reconstruction error vector R;
S1-4, inputting the feature vector H of the graph nodes in the hidden space and the feature code r into the anomaly score calculation network, and according to the formula:
$$H_1^{l+1} = \mathrm{ReLU}\left(\mathrm{Concat}(H_1^{l}, r)\, W_1^{l} + b_1^{l}\right)$$
obtaining the output feature vector $H_1^{l+1}$ of the l-th hidden layer of the anomaly score calculation network; wherein $H_1^{l}$ represents the input feature vector of the l-th hidden layer of the anomaly score calculation network, $W_1^{l}$ represents the weight of the l-th hidden layer, $b_1^{l}$ represents the bias of the l-th hidden layer, Concat(·) represents the vector splicing operator, and ReLU(·) represents the nonlinear activation function;
S1-5, according to the formula:
$$res = H_1 \times W + b$$
obtaining the anomaly score res of the graph node; wherein $H_1$ represents the output feature vector of the last hidden layer of the anomaly score calculation network, W represents the weight of the output layer of the anomaly score calculation network, and b represents the bias of the output layer of the anomaly score calculation network;
S1-6, constructing a first loss function from the graph node attribute list X and the reconstructed graph node attribute list $\hat{X}$; constructing a second loss function from the anomaly score res of the graph node and the actual label of the graph node; adding the first loss function and the second loss function according to their preset weights to obtain a third loss function, and training the abnormal graph node detection model with the third loss function to obtain the trained abnormal graph node detection model; wherein the actual label of a normal graph node is 0 and the actual label of an abnormal graph node is 1.
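The double-random enhancement of step S1-1 can be illustrated with a minimal sketch. It rests on several assumptions that the text leaves open: the perturbation is taken to be the attribute mean multiplied by a draw from N(μ, σ), the indication vector is taken to be a 0/1 mask marking the perturbed attributes, and the names `enhance`, `normal_idx` and `seed` are hypothetical.

```python
import random


def enhance(X, normal_idx, a=20, b=20, mu=0.0, sigma=0.1, seed=0):
    """Perturb b% of the attributes of a% of the normal nodes; return (X', V)."""
    rng = random.Random(seed)
    n_attr = len(X[0])
    # Mean of every attribute over the normal (unlabeled) nodes.
    means = [sum(X[i][j] for i in normal_idx) / len(normal_idx)
             for j in range(n_attr)]
    X_enh = [row[:] for row in X]        # copy so the input list stays unchanged
    V = [[0] * n_attr for _ in X]        # assumed 0/1 indication vectors
    n_nodes = max(1, round(len(normal_idx) * a / 100))
    n_sel = max(1, round(n_attr * b / 100))
    for i in rng.sample(normal_idx, n_nodes):        # uniform node selection
        for j in rng.sample(range(n_attr), n_sel):   # uniform attribute selection
            # Assumed form of the perturbation: mean-scaled Gaussian noise.
            X_enh[i][j] += means[j] * rng.gauss(mu, sigma)
            V[i][j] = 1
    return X_enh, V


# Toy graph: 10 nodes with 5 attributes each, all treated as normal.
X = [[float(i + j + 1) for j in range(5)] for i in range(10)]
X_enh, V = enhance(X, list(range(10)))
```

With the default a = b = 20, two of the ten toy nodes are selected and one of their five attributes is perturbed, so V contains exactly two ones.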
Further, the specific operations of step S1-3 are as follows:
S1-3-1, mapping the spliced precoding result $H_0$ to a low-dimensional hidden space, according to the formula:
[encoder formula not preserved in the source text]
obtaining the feature vector H of the graph nodes in the hidden space; wherein $H_0'$ represents the spliced precoding result $H_0$ or the output of the previous fully connected layer, $f_{ce}(\cdot)$ represents the fully connected layer function of the encoder together with the encoder parameters of the fully connected layer, $H_0''$ represents the output of the fully connected layer, $H_2$ represents the output of the last fully connected layer, and MM(·) represents the matrix multiplier;
S1-3-2, according to the formula:
[decoder formulas not preserved in the source text]
$$r = \mathrm{Concat}(\lVert R \rVert_1, \lVert R \rVert_2)$$
obtaining the reconstructed graph node attribute list $\hat{X}$, the reconstruction error vector R, and the feature code r obtained by splicing the L1 norm and the L2 norm of the reconstruction error vector; wherein $H'$ represents the feature vector H of the graph nodes in the hidden space or the feature vector of the previous fully connected layer, $H''$ represents the feature vector of the fully connected layer, $f_{cd}(\cdot)$ represents the fully connected layer function of the decoder together with the decoder parameters of the fully connected layer, $H_{out}$ represents the feature vector of the last fully connected layer, $\lVert R \rVert_1$ represents the L1 norm of the reconstruction error vector R, and $\lVert R \rVert_2$ represents the L2 norm of the reconstruction error vector R.
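The feature code r of step S1-3-2 can be worked through on a single node. The sketch assumes that the reconstruction error vector is the element-wise difference between the node's attributes and their reconstruction, which the surviving text does not state explicitly; the function name `feature_code` and the numbers are hypothetical.

```python
import math


def feature_code(x, x_hat):
    """Return (R, r): the reconstruction error vector and r = Concat(||R||_1, ||R||_2)."""
    R = [a - b for a, b in zip(x, x_hat)]      # assumed definition: R = x - x_hat
    l1 = sum(abs(e) for e in R)                # L1 norm
    l2 = math.sqrt(sum(e * e for e in R))      # L2 norm
    return R, [l1, l2]


# One node with three attributes and a hypothetical reconstruction.
R, r = feature_code([1.0, 2.0, 3.0], [1.0, -1.0, 7.0])
```

Here R is (0, 3, -4), so the L1 norm is 7 and the L2 norm is 5. The code compresses the per-attribute error into two numbers regardless of the attribute dimension.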
Further, the specific operations of step S1-6 are as follows:
S1-6-1, taking the minimization of the difference between the output of the feature extraction network and the input graph node attributes as the optimization target, and according to the formula:
$$\mathcal{L}_1 = \mathrm{MSE}(X, \hat{X})$$
obtaining the first loss function $\mathcal{L}_1$; wherein MSE(·) represents the mean square error;
S1-6-2, performing end-to-end joint optimization of the feature extraction network and the anomaly score calculation network by minimizing a combined loss based on the reconstruction error and the anomaly score calculation error, according to the formula:
$$loss_c(res, y; t) = (1 - y)\,\lvert res \rvert + y \max(0,\ t - res)$$
obtaining the third loss function as the weighted sum of the first loss function and the second loss function; wherein $loss_c(res, y; t)$ represents the second loss function, y represents the actual label, t represents the preset scaling factor, α represents the constant used as the weight in the sum, |·| represents the absolute value, and max(·) represents the maximum;
S1-6-3, updating the parameters of the abnormal graph node detection model through the third loss function.
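The losses of step S1-6 can be sketched as follows. The second loss follows the stated formula directly. The combination is an assumption: the text says only that the two losses are added with preset weights, so this sketch applies α to the second loss, averages that loss over nodes, and uses hypothetical values t = 5.0 and α = 1.0.

```python
def mse(X, X_hat):
    """Mean square error over all attribute values (first loss function)."""
    n = sum(len(row) for row in X)
    return sum((a - b) ** 2
               for rx, rh in zip(X, X_hat)
               for a, b in zip(rx, rh)) / n


def loss_c(res, y, t):
    """Second loss: pull normal scores (y=0) to 0, push abnormal scores (y=1) above t."""
    return (1 - y) * abs(res) + y * max(0.0, t - res)


def total_loss(X, X_hat, scores, labels, t=5.0, alpha=1.0):
    """Assumed combination: reconstruction loss + alpha * mean score loss."""
    score_loss = sum(loss_c(s, y, t) for s, y in zip(scores, labels)) / len(scores)
    return mse(X, X_hat) + alpha * score_loss


# A normal node scoring 0.5 and an abnormal node scoring 2.0 (hypothetical values).
example = total_loss([[1.0, 2.0]], [[1.0, 4.0]], [0.5, 2.0], [0, 1])
```

For a normal node the loss is the magnitude of its score; for an abnormal node the loss vanishes once the score reaches t, so t acts as the target gap between the two classes.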
The beneficial effects of the invention are as follows:
1. Through double-random node behavior expression enhancement, the invention enriches the feature expression of graph node behavior in the normal mode and, based on the trained feature extraction network, constructs a feature hidden space in which normal and abnormal graph nodes are easy to distinguish, thereby achieving a robust and effective representation of graph node behavior.
2. The invention makes full use of the label information of abnormal graph nodes to learn the differences between normal and abnormal graph nodes in both attribute features and connection behavior features, ensuring an excellent anomaly detection effect.
3. Through the feature information and position information precoding module, the invention obtains prompt information about how the graph node behavior expression was enhanced, which helps to obtain a robust and effective graph node behavior characterization and improves the ability of the feature extraction network to characterize graph node behavior.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of the self-supervised network abnormal graph node detection model based on node behavior characterization of the present invention;
FIG. 3 is a structural diagram of the anomaly score calculation network of the present invention.
Detailed Description
The following description of the embodiments is provided to help those skilled in the art understand the present invention. It should be understood, however, that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, all variations that fall within the spirit and scope of the invention as defined in the appended claims, and all inventions that make use of the inventive concept, are protected.
As shown in FIG. 1, the enhanced graph node behavior characterization and abnormal graph node detection method comprises the following steps:
S1, constructing and training an abnormal graph node detection model based on graph node behavior characterization to obtain the trained abnormal graph node detection model;
S2, inputting the attribute list of all nodes of the graph structure and the adjacency matrix representing the graph structure into the trained abnormal graph node detection model to obtain the anomaly score of each node to be detected in the graph;
S3, if the anomaly score of a node to be detected in the graph is greater than a threshold, judging the node to be an abnormal graph node; otherwise, judging the node to be a normal graph node.
As shown in FIG. 2, the abnormal graph node detection model in step S1 comprises a feature extraction network and an anomaly score calculation network. The feature extraction network comprises a node behavior expression enhancement module, a feature information and position information precoding module, and a graph-autoencoder-based feature extraction module. The node behavior expression enhancement module comprises a random node selection operator, a random attribute selection operator and a perturbation adding operator. The feature information and position information precoding module comprises a feature information precoder and a position information precoder, each of which comprises a fully connected neural network. The graph-autoencoder-based feature extraction module comprises a graph-convolution-based encoder and a graph-convolution-based decoder, each of which comprises a multi-layer graph convolution structure; each graph convolution structure comprises a fully connected layer and a matrix multiplier.
As shown in FIG. 3, the anomaly score calculation network comprises a fully connected neural network with an input layer, an output layer and a plurality of hidden layers; the feature vector H of the graph nodes in the hidden space is fed in through the input layer, and the feature code r is fed directly into each hidden layer.
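The forward pass of the anomaly score calculation network in FIG. 3 can be sketched for a single node. The sketch assumes that "fed directly into each hidden layer" means r is concatenated to each hidden layer's input before the affine map, and it treats H as the input of the first hidden layer. Layer sizes, the random initialization and all function names are hypothetical.

```python
import random


def linear(x, W, b):
    """Affine map x W + b for one node; W has len(x) rows and len(b) columns."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]


def relu(v):
    return [max(0.0, e) for e in v]


def init_layer(n_in, n_out, rng):
    """Hypothetical random initialization, for illustration only."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return W, [0.0] * n_out


def anomaly_score(h, r, hidden_layers, output_layer):
    """Concatenate r to the input of every hidden layer, then apply the output layer."""
    z = list(h)
    for W, b in hidden_layers:
        z = relu(linear(z + list(r), W, b))   # Concat(z, r), affine map, ReLU
    W_out, b_out = output_layer
    return linear(z, W_out, b_out)[0]         # res = H_1 x W + b


rng = random.Random(0)
h, r = [0.3, -1.2, 0.8, 0.1], [2.5, 1.4]      # hidden feature (dim 4), feature code (dim 2)
hidden = [init_layer(4 + 2, 8, rng), init_layer(8 + 2, 4, rng)]
res = anomaly_score(h, r, hidden, init_layer(4, 1, rng))
```

Because r is appended at every layer, each hidden layer's weight matrix has two more input rows than the width of the layer before it.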
In step S1, the specific operations for training the abnormal graph node detection model are as follows:
S1-1, taking the attribute list X of all nodes in the graph structure and the adjacency matrix A formed among the graph nodes as training data, and inputting the training data into the node behavior expression enhancement module; randomly sampling a% of the normal graph nodes in the attribute list X, wherein the selection probability of each graph node follows a uniform distribution; for each selected graph node, randomly sampling b% of its attributes, wherein the selection probability of each attribute follows a uniform distribution; computing the mean of each selected attribute over all input normal unlabeled graph nodes, adding to the corresponding attribute value of the selected graph node a random perturbation that is based on this statistical mean and follows a normal distribution with mean μ and standard deviation σ, and creating an indication vector whose dimension equals the attribute dimension of the graph node, thereby obtaining the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V; wherein a defaults to 20, b defaults to 20, μ defaults to 0, and σ defaults to 0.1;
S1-2, inputting the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V into the feature information and position information precoding module, and computing each of them by forward propagation to obtain the corresponding feature information precoding $H_{feat}$ and position information precoding $H_{pos}$; according to the formula:
$$H_0 = \mathrm{Concat}(H_{feat}, H_{pos})$$
obtaining the spliced precoding result $H_0$ output by the feature information and position information precoding module; wherein Concat(·) represents the vector splicing operator;
S1-3, inputting the spliced precoding result $H_0$ and the adjacency matrix A formed among the graph nodes into the graph-autoencoder-based feature extraction module for reconstruction, to obtain the reconstructed graph node attribute list $\hat{X}$, the feature vector H of the graph nodes in the hidden space, the reconstruction error vector R, and the feature code r obtained by splicing the L1 norm and the L2 norm of the reconstruction error vector R;
S1-4, inputting the feature vector H of the graph nodes in the hidden space and the feature code r into the anomaly score calculation network, and according to the formula:
$$H_1^{l+1} = \mathrm{ReLU}\left(\mathrm{Concat}(H_1^{l}, r)\, W_1^{l} + b_1^{l}\right)$$
obtaining the output feature vector $H_1^{l+1}$ of the l-th hidden layer of the anomaly score calculation network; wherein $H_1^{l}$ represents the input feature vector of the l-th hidden layer of the anomaly score calculation network, $W_1^{l}$ represents the weight of the l-th hidden layer, $b_1^{l}$ represents the bias of the l-th hidden layer, Concat(·) represents the vector splicing operator, and ReLU(·) represents the nonlinear activation function;
S1-5, according to the formula:
$$res = H_1 \times W + b$$
obtaining the anomaly score res of the graph node; wherein $H_1$ represents the output feature vector of the last hidden layer of the anomaly score calculation network, W represents the weight of the output layer of the anomaly score calculation network, and b represents the bias of the output layer of the anomaly score calculation network;
S1-6, constructing a first loss function from the graph node attribute list X and the reconstructed graph node attribute list $\hat{X}$; constructing a second loss function from the anomaly score res of the graph node and the actual label of the graph node; adding the first loss function and the second loss function according to their preset weights to obtain a third loss function, and training the abnormal graph node detection model with the third loss function to obtain the trained abnormal graph node detection model; wherein the actual label of a normal graph node is 0 and the actual label of an abnormal graph node is 1.
The specific operations of step S1-3 are as follows:
S1-3-1, mapping the spliced precoding result $H_0$ to a low-dimensional hidden space, according to the formula:
[encoder formula not preserved in the source text]
obtaining the feature vector H of the graph nodes in the hidden space; wherein $H_0'$ represents the spliced precoding result $H_0$ or the output of the previous fully connected layer, $f_{ce}(\cdot)$ represents the fully connected layer function of the encoder together with the encoder parameters of the fully connected layer, $H_0''$ represents the output of the fully connected layer, $H_2$ represents the output of the last fully connected layer, and MM(·) represents the matrix multiplier;
S1-3-2, according to the formula:
[decoder formulas not preserved in the source text]
$$r = \mathrm{Concat}(\lVert R \rVert_1, \lVert R \rVert_2)$$
obtaining the reconstructed graph node attribute list $\hat{X}$, the reconstruction error vector R, and the feature code r obtained by splicing the L1 norm and the L2 norm of the reconstruction error vector; wherein $H'$ represents the feature vector H of the graph nodes in the hidden space or the feature vector of the previous fully connected layer, $H''$ represents the feature vector of the fully connected layer, $f_{cd}(\cdot)$ represents the fully connected layer function of the decoder together with the decoder parameters of the fully connected layer, $H_{out}$ represents the feature vector of the last fully connected layer, $\lVert R \rVert_1$ represents the L1 norm of the reconstruction error vector R, and $\lVert R \rVert_2$ represents the L2 norm of the reconstruction error vector R.
The specific operations of step S1-6 are as follows:
S1-6-1, taking the minimization of the difference between the output of the feature extraction network and the input graph node attributes as the optimization target, and according to the formula:
$$\mathcal{L}_1 = \mathrm{MSE}(X, \hat{X})$$
obtaining the first loss function $\mathcal{L}_1$; wherein MSE(·) represents the mean square error;
S1-6-2, performing end-to-end joint optimization of the feature extraction network and the anomaly score calculation network by minimizing a combined loss based on the reconstruction error and the anomaly score calculation error, according to the formula:
$$loss_c(res, y; t) = (1 - y)\,\lvert res \rvert + y \max(0,\ t - res)$$
obtaining the third loss function as the weighted sum of the first loss function and the second loss function; wherein $loss_c(res, y; t)$ represents the second loss function, y represents the actual label, t represents the preset scaling factor, α represents the constant used as the weight in the sum, |·| represents the absolute value, and max(·) represents the maximum;
S1-6-3, updating the parameters of the abnormal graph node detection model through the third loss function.
In one embodiment of the invention, an anomaly graph node detection model for graph node behavior characterization is constructed and trained, the model comprising a feature extraction network and an anomaly score computation network. The feature extraction network comprises a node behavior expression enhancement module, a feature information and position information precoding module and a feature extraction module based on a graph self-encoder; the node behavior expression enhancement module comprises a random node selection operator, a random attribute selection operator and a disturbance adding operator; the characteristic information and position information precoding module comprises a characteristic information precoder and a position information precoder; the anomaly score computation network comprises a fully connected neural network.
The node attribute list of the graph to be tested and the adjacency matrix representing the graph structure are fed into the trained self-supervised network abnormal graph node detection model based on node behavior characterization. In the node behavior expression enhancement module, the random node selection operator and the random attribute selection operator randomly sample the node attribute list of the graph to be tested in sequence; the disturbance adding operator adds a normally distributed random perturbation, based on the statistical mean, to the values of the selected graph node attributes and creates the corresponding indication vectors, yielding a behavior-expression-enhanced graph node attribute list and its corresponding indication vector list.
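The enhancement step can be sketched as follows, using only the Python standard library. Two points are assumptions rather than statements from the text: the perturbation is taken to be the attribute's statistical mean multiplied by a draw from N(μ, σ), which is one possible reading of "perturbation based on the statistical mean", and the indication vector is taken to hold 1 at perturbed positions and 0 elsewhere. The function name `enhance` and the `seed` parameter are hypothetical.

```python
import random


def enhance(x, a=20.0, b=20.0, mu=0.0, sigma=0.1, seed=None):
    """Return (x_enh, v): perturbed attribute list and indication vectors.

    x is a list of per-node attribute lists, assumed to hold normal nodes only.
    a and b are the percentages of nodes and of attributes to select.
    """
    rng = random.Random(seed)
    n, d = len(x), len(x[0])
    # Statistical mean of every attribute over all input nodes.
    means = [sum(row[j] for row in x) / n for j in range(d)]
    x_enh = [list(row) for row in x]      # copy, so the input stays unchanged
    v = [[0] * d for _ in range(n)]       # indication vectors, same dimension as attributes
    n_nodes = max(1, round(n * a / 100))
    n_attrs = max(1, round(d * b / 100))
    for i in rng.sample(range(n), n_nodes):        # uniform node selection
        for j in rng.sample(range(d), n_attrs):    # uniform attribute selection
            # Assumption: noise scaled by the attribute mean, drawn from N(mu, sigma).
            x_enh[i][j] += means[j] * rng.gauss(mu, sigma)
            v[i][j] = 1                            # mark the perturbed position
    return x_enh, v
```

With the defaults a = 20 and b = 20 from the text, a graph of 10 nodes with 5 attributes each would have 2 nodes perturbed in 1 attribute each.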
In the feature information and position information precoding module, the feature information precoder maps the behavior-expression-enhanced graph node attributes to a lower dimension to obtain preliminary attribute features; the position information precoder maps the indication vectors to a lower dimension, equal to the dimension of the preliminary attribute features, to obtain preliminary indication vectors; the preliminary attribute features and the preliminary indication vectors are concatenated to obtain the concatenated precoding result, which is the output of the feature information and position information precoding module.
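A minimal sketch of the precoding module is given below. It assumes each precoder is a single fully connected layer without activation; the text only says that each precoder comprises a fully connected neural network, so the depth and activation are not specified. The names `linear`, `precode`, and the weight and bias arguments are hypothetical.

```python
def linear(rows, weight, bias):
    """Fully connected layer without activation: rows @ weight + bias."""
    return [
        [sum(r[k] * weight[k][j] for k in range(len(r))) + bias[j]
         for j in range(len(bias))]
        for r in rows
    ]


def precode(x_enh, v, w_feat, b_feat, w_pos, b_pos):
    """Concatenate the feature precoding and the position precoding per node."""
    h_feat = linear(x_enh, w_feat, b_feat)   # preliminary attribute features
    h_pos = linear(v, w_pos, b_pos)          # preliminary indication vectors
    # Concat(.) along the feature axis gives the concatenated precoding result H_0.
    return [f + p for f, p in zip(h_feat, h_pos)]
```

For example, one node with three attributes mapped to two dimensions by each precoder yields a four-dimensional row of H_0:

```python
h0 = precode(
    [[1.0, 2.0, 3.0]], [[0, 1, 0]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0],
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]], [0.5, 0.5],
)
```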
In the feature extraction module, the encoder maps the concatenated precoding result to a low-dimensional hidden space to obtain its low-dimensional representation; the decoder then maps this low-dimensional representation back to the original graph node attribute space. The module outputs the reconstructed graph node attribute list, the feature vectors of the graph nodes in the hidden space, the reconstruction error vector, and the feature code obtained by concatenating the 1-norm and the 2-norm of the reconstruction error vector. The encoder and the decoder each comprise a multi-layer graph convolution structure, and each graph convolution layer comprises a fully connected layer and a matrix multiplier.
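The following sketch shows one way to read the graph convolution structure (a fully connected layer followed by a matrix multiplication with the adjacency matrix) and the outputs of the feature extraction module. Several details are assumptions: biases are omitted, ReLU is used between layers, the adjacency matrix is used without normalization, and the reconstruction error is taken as R = X − X̂ per node. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def graph_conv(adj, h, weight, relu=True):
    """One graph convolution structure: fully connected layer, then matrix multiplier."""
    out = matmul(adj, matmul(h, weight))
    if relu:
        out = [[max(0.0, value) for value in row] for row in out]
    return out


def autoencode(adj, h0, x, enc_weights, dec_weights):
    """Return (x_hat, h, r_vec, r_code) for precoded input h0 and attributes x."""
    h = h0
    for w in enc_weights:                       # encoder: map to the hidden space
        h = graph_conv(adj, h, w)
    x_hat = h
    for i, w in enumerate(dec_weights):         # decoder: map back to attribute space
        x_hat = graph_conv(adj, x_hat, w, relu=(i < len(dec_weights) - 1))
    # Assumed reconstruction error: element-wise difference per node.
    r_vec = [[xv - xh for xv, xh in zip(xr, xhr)] for xr, xhr in zip(x, x_hat)]
    # Feature code r = Concat(||R||_1, ||R||_2), one pair per node.
    r_code = [[sum(abs(e) for e in row), math.sqrt(sum(e * e for e in row))]
              for row in r_vec]
    return x_hat, h, r_vec, r_code
```

In practice the weights would be learned jointly with the rest of the model; here they are passed in explicitly so the data flow stays visible.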
In the anomaly score calculation network, the feature vectors of the graph nodes in the hidden space, the reconstruction error vector, and the feature code obtained by concatenating the 1-norm and the 2-norm of the reconstruction error vector are processed to obtain the anomaly score of each graph node. The anomaly score of each graph node is then compared with a set threshold: when the anomaly score is greater than the threshold, the graph node is an abnormal graph node; when the anomaly score is smaller than the threshold, the graph node is a normal graph node. This completes the detection of abnormal graph nodes in the network.
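The scoring and thresholding steps can be sketched as a small fully connected network over Concat(H, r), followed by a comparison with the threshold. The sketch follows the variant in which only the hidden-space feature vector H and the feature code r are fed to the network; the layer sizes, the parameter layout, and the function names are assumptions for illustration.

```python
def score_nodes(h, r_code, hidden, out_w, out_b):
    """Compute one anomaly score per node.

    hidden is a list of (weight, bias) pairs for the hidden layers;
    out_w and out_b are the output-layer weight vector and scalar bias.
    """
    scores = []
    for h_i, r_i in zip(h, r_code):
        z = list(h_i) + list(r_i)               # Concat(H, r) for this node
        for w, b in hidden:                     # hidden layers: ReLU(z W + b)
            z = [max(0.0, sum(z[k] * w[k][j] for k in range(len(z))) + b[j])
                 for j in range(len(b))]
        # Output layer: res = H_1 x W + b
        scores.append(sum(z[k] * out_w[k] for k in range(len(z))) + out_b)
    return scores


def detect(scores, threshold):
    """True marks an abnormal graph node (score strictly greater than threshold)."""
    return [s > threshold for s in scores]
```

A node whose score is exactly equal to the threshold is treated as normal in this sketch, which matches the "otherwise" branch of claim 1.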
In summary, the invention enriches the behavior feature expression of graph nodes in the normal mode by means of dual-random node behavior expression enhancement, and uses the trained feature extraction network to construct a hidden feature space in which normal graph nodes and abnormal graph nodes are easy to distinguish. This yields a robust and effective expression of graph node behavior and improves the ability of the feature extraction network to characterize it. By making full use of the labeling information of abnormal graph nodes, the method also learns the differences between normal and abnormal graph nodes in both attribute features and connection behavior features, which supports the anomaly detection performance.
Claims (5)
1. A method for enhanced graph node behavior characterization and abnormal graph node detection, characterized by comprising the following steps:
S1, constructing and training an abnormal graph node detection model based on graph node behavior characterization, to obtain a trained abnormal graph node detection model;
S2, inputting the attribute list of all nodes of the graph structure and the adjacency matrix representing the graph structure into the trained abnormal graph node detection model, to obtain the anomaly score calculation result of each node to be detected in the graph;
S3, if the anomaly score of a node to be detected in the graph is greater than a threshold, determining that the node is an abnormal graph node; otherwise, determining that the node is a normal graph node.
2. The method for enhanced graph node behavior characterization and abnormal graph node detection according to claim 1, characterized in that: the abnormal graph node detection model based on graph node behavior characterization in step S1 comprises a feature extraction network and an anomaly score calculation network; the feature extraction network comprises a node behavior expression enhancement module, a feature information and position information precoding module, and a feature extraction module based on a graph autoencoder; the node behavior expression enhancement module comprises a random node selection operator, a random attribute selection operator and a disturbance adding operator; the feature information and position information precoding module comprises a feature information precoder and a position information precoder; the feature information precoder comprises a fully connected neural network; the position information precoder comprises a fully connected neural network; the feature extraction module based on the graph autoencoder comprises a graph convolution-based encoder and a graph convolution-based decoder; the graph convolution-based encoder comprises a multi-layer graph convolution structure; the graph convolution-based decoder comprises a multi-layer graph convolution structure; the graph convolution structure comprises a fully connected layer and a matrix multiplier; the anomaly score calculation network comprises a fully connected neural network; the fully connected neural network comprises an input layer, an output layer and a plurality of hidden layers.
3. The method for enhanced graph node behavior characterization and abnormal graph node detection according to claim 2, characterized in that: the specific operation of training the abnormal graph node detection model based on graph node behavior characterization in step S1 is as follows:
S1-1, taking the attribute list X of all nodes in the graph structure and the adjacency matrix A formed among the nodes as training data; inputting the training data into the node behavior expression enhancement module; randomly sampling the normal graph nodes in the attribute list X to obtain a% of the graph nodes, where the selection probability of each graph node follows a uniform distribution; randomly sampling the attributes of each selected graph node to obtain b% of the attributes, where the selection probability of each attribute follows a uniform distribution; computing the mean of each selected attribute over all input normal unlabeled graph nodes, adding, to the value of each selected attribute of each selected graph node, a random perturbation based on the statistical mean that follows a normal distribution with mean μ and standard deviation σ, and creating an indication vector whose dimension equals the attribute dimension of the graph node, to obtain the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V; where a defaults to 20, b defaults to 20, μ defaults to 0, and σ defaults to 0.1;
S1-2, inputting the behavior-expression-enhanced graph node attribute list X' and the corresponding indication vector list V into the feature information and position information precoding module, and computing each of them by a forward propagation algorithm to obtain the corresponding feature information precoding and position information precoding; according to the formula:
obtaining the concatenated precoding result $H_0$ output by the feature information and position information precoding module; where Concat(·) denotes the vector concatenation operator;
S1-3, inputting the concatenated precoding result $H_0$ and the adjacency matrix A formed among the graph nodes into the feature extraction module based on the graph autoencoder for reconstruction, to obtain the reconstructed graph node attribute list, the feature vector H of the graph nodes in the hidden space, the reconstruction error vector R, and the feature code r obtained by concatenating the 1-norm and the 2-norm of the reconstruction error vector R;
S1-4, inputting the feature vector H of the graph nodes in the hidden space and the feature code r into the anomaly score calculation network, and according to the formula:
obtaining the output feature vector $H_1^{l+1}$ of the l-th hidden layer of the anomaly score calculation network; where $H_1^{l}$ denotes the input feature vector of the l-th hidden layer of the anomaly score calculation network, the weight and bias terms are those of the l-th hidden layer of the anomaly score calculation network, Concat(·) denotes the vector concatenation operator, and ReLU(·) denotes the nonlinear activation function;
S1-5, according to the formula:
$res = H_1 \times W + b$
obtaining the anomaly score calculation result res of the graph node; where $H_1$ denotes the output feature vector of the last hidden layer of the anomaly score calculation network, W denotes the weight of the output layer of the anomaly score calculation network, and b denotes the bias of the output layer of the anomaly score calculation network;
S1-6, constructing a first loss function according to the graph node attribute list X and the reconstructed graph node attribute list; constructing a second loss function according to the anomaly score calculation result res of the graph node and the actual label of the graph node; adding the first loss function and the second loss function according to their corresponding preset weights to obtain a third loss function, and training the abnormal graph node detection model based on graph node behavior characterization with the third loss function, to obtain the trained abnormal graph node detection model; where the actual label of a normal graph node is 0 and the actual label of an abnormal graph node is 1.
4. The method for enhanced graph node behavior characterization and abnormal graph node detection according to claim 3, characterized in that: the specific operation of step S1-3 is as follows:
S1-3-1, mapping the concatenated precoding result $H_0$ to a low-dimensional hidden space, according to the formula:
obtaining the feature vector H of the graph nodes in the hidden space; where $H_0'$ denotes either the concatenated precoding result $H_0$ or the output of the previous fully connected layer, the encoder parameter term denotes the parameters of the fully connected layer, $f_{ce}(\cdot)$ denotes the fully connected layer function of the encoder, $H_0''$ denotes the output of the fully connected layer, $H_2$ denotes the output of the last fully connected layer, and MM(·) denotes the matrix multiplier;
S1-3-2, according to the formula:
$H'' = f_{cd}(H'; W_{fcd})$
$r = \mathrm{Concat}(\|R\|_1, \|R\|_2)$
obtaining the reconstructed graph node attribute list, the reconstruction error vector R, and the feature code r obtained by concatenating the 1-norm and the 2-norm of the reconstruction error vector; where $H'$ denotes either the feature vector H of the graph node in the hidden space or the feature vector output by the previous fully connected layer, $H''$ denotes the feature vector output by the fully connected layer, $W_{fcd}$ denotes the decoder parameters of the fully connected layer, $f_{cd}(\cdot)$ denotes the fully connected layer function of the decoder, $H_{out}$ denotes the feature vector of the last fully connected layer, $\|R\|_1$ denotes the 1-norm of the reconstruction error vector R, and $\|R\|_2$ denotes the 2-norm of the reconstruction error vector R.
5. The method for enhanced graph node behavior characterization and abnormal graph node detection according to claim 3, characterized in that: the specific operation of step S1-6 is as follows:
S1-6-1, taking the minimization of the difference between the output of the feature extraction network and the input graph node attributes as the optimization target, according to the formula:
obtaining the first loss function, where MSE(·) denotes the mean square error;
S1-6-2, performing end-to-end joint optimization of the feature extraction network and the anomaly score calculation network by minimizing a combined loss based on the reconstruction error and the anomaly score calculation error, according to the formula:
$\mathrm{loss}_c(res, y; t) = (1 - y)\,|res| + y \cdot \max(0,\ t - res)$
obtaining the third loss function, where $\mathrm{loss}_c(res, y; t)$ denotes the second loss function, y denotes the actual label, t denotes a preset scaling factor, α denotes a constant, |·| denotes the absolute value, and max(·) denotes the maximum;
S1-6-3, updating the parameters of the abnormal graph node detection model based on graph node behavior characterization by means of the third loss function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310652286.0A CN116760583B (en) | 2023-06-02 | 2023-06-02 | Enhanced graph node behavior characterization and abnormal graph node detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116760583A (en) | 2023-09-15 |
CN116760583B (en) | 2024-02-13 |
Family
ID=87954440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310652286.0A Active CN116760583B (en) | 2023-06-02 | 2023-06-02 | Enhanced graph node behavior characterization and abnormal graph node detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116760583B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020042024A1 (en) * | 2018-08-29 | 2020-03-05 | 区链通网络有限公司 | Node abnormality detection method and device based on graph algorithm and storage device |
CN111669373A (en) * | 2020-05-25 | 2020-09-15 | 山东理工大学 | Network anomaly detection method and system based on space-time convolutional network and topology perception |
CN116192477A (en) * | 2023-02-06 | 2023-05-30 | 复旦大学 | APT attack detection method and device based on mask pattern self-encoder |
Non-Patent Citations (1)
Title |
---|
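Liu Jie, Li Xiwang: "Anomaly detection algorithm for industrial control networks based on graph neural networks", Computer Systems & Applications (计算机系统应用), vol. 29, no. 12, pages 234 - 238 * |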
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117312350A (en) * | 2023-11-28 | 2023-12-29 | 本溪钢铁(集团)信息自动化有限责任公司 | Steel industry carbon emission data management method and device |
CN117312350B (en) * | 2023-11-28 | 2024-02-27 | 本溪钢铁(集团)信息自动化有限责任公司 | Steel industry carbon emission data management method and device |
CN117407697A (en) * | 2023-12-14 | 2024-01-16 | 南昌科晨电力试验研究有限公司 | Graph anomaly detection method and system based on automatic encoder and attention mechanism |
CN117407697B (en) * | 2023-12-14 | 2024-04-02 | 南昌科晨电力试验研究有限公司 | Graph anomaly detection method and system based on automatic encoder and attention mechanism |
Also Published As
Publication number | Publication date |
---|---|
CN116760583B (en) | 2024-02-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||