CN115664970A - Network abnormal point detection method based on hyperbolic space - Google Patents

Network abnormal point detection method based on hyperbolic space

Info

Publication number: CN115664970A
Application number: CN202211404826.5A
Authority: CN (China)
Prior art keywords: hyperbolic, network, node, space, attribute
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 王文俊 (Wang Wenjun), 马志涛 (Ma Zhitao), 邵明来 (Shao Minglai), 孙越恒 (Sun Yueheng), 武南南 (Wu Nannan)
Current Assignee: Tianjin University
Original Assignee: Tianjin University
Priority and filing date: 2022-11-10
Publication date: 2023-01-31
Application filed by: Tianjin University
Priority to: CN202211404826.5A

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a network anomaly point detection method based on hyperbolic space, which learns node representations of an attribute network through a hyperbolic graph neural network and trains a generative adversarial network for detecting abnormal nodes in the input network embedding. The method mainly comprises: constructing an attribute network; estimating the hyperbolic geometric curvature parameter of the input attribute network; mapping the input attribute network into a low-dimensional vector representation in hyperbolic space through the hyperbolic graph neural network, taken as its output; training the whole neural network by back propagation; and finally realizing anomaly point identification with a discriminator module. The invention uses a hyperbolic graph neural network to extend the node-feature aggregation of graph neural networks into hyperbolic space, effectively fuses node features with the hierarchical structure, obtains high-level node representations of the graph, and exploits rich hierarchical information to perform the anomaly detection task in hyperbolic space. The detection accuracy is significantly improved and the anomaly detection time is shortened.

Description

Network abnormal point detection method based on hyperbolic space
Technical Field
The invention belongs to the field of network analysis, and particularly relates to a network anomaly detection method.
Background
An attribute network is a network structure in which nodes or edges carry one or more attributes or labels. For example, in a social network, users may have different ages, interests, places of residence, work backgrounds, and educational backgrounds, and their connections may differ in time, type, frequency, and so on.
Before the rise of deep learning, traditional non-deep-learning techniques were widely applied to many real networks to identify abnormal individuals [1]. A key idea of these methods is to convert graph anomaly detection into a traditional anomaly detection problem: because network data rich in structural information cannot be processed directly by traditional detection techniques, many works seek to detect abnormal nodes using node-related statistical features such as in-degree and out-degree [1].
Network representation techniques are widely used to obtain more valuable information from the network structure for anomaly detection. These techniques encode the structure of the network into an embedding vector space and identify anomalous nodes through further analysis. To date, many network representation approaches, such as DeepWalk, Node2Vec, and LINE, have shown their effectiveness in generating node representations and have been used in anomaly detection tasks [3].
In addition to structural information, real-world networks contain rich attribute information associated with nodes. These attributes, together with the network structure, describe real objects, so that more anomalies can be detected. The graph convolutional network (GCN) [4] has achieved great success in many graph data mining tasks such as link prediction, node classification, and anomaly detection, owing to its ability to capture comprehensive information from both the graph structure and the node attributes.
The prior art has the following problems: existing anomaly detection techniques based on network representations typically embed the network into Euclidean space; however, many types of complex data exhibit highly non-Euclidean characteristics [5]. In such cases, Euclidean space does not provide the most powerful or meaningful geometric representation, and classifiers in Euclidean space cannot accurately identify outliers because they cannot use the hierarchical structure information of the data [6].
References
[1] G. Pang, C. Shen, L. Cao, and A. V. D. Hengel, "Deep learning for anomaly detection," ACM Comput. Surv., vol. 54, no. 2, pp. 1–38, 2021.
[2] R. Chalapathy and S. Chawla, "Deep learning for anomaly detection: A survey," CoRR, abs/1901.03407, 2019.
[3] L. Akoglu, M. McGlohon, and C. Faloutsos, "OddBall: Spotting anomalies in weighted graphs," in Advances in Knowledge Discovery and Data Mining, 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21–24, 2010, Proceedings, Part II, 2010, pp. 410–421.
[4] F. Wu, A. H. Souza Jr., T. Zhang, et al., "Simplifying graph convolutional networks," in Proceedings of the 36th International Conference on Machine Learning, ICML 2019, June 9–15, 2019, Long Beach, California, USA, 2019, pp. 6861–6871.
[5] M. Gromov, "Hyperbolic groups," in Essays in Group Theory, Mathematical Sciences Research Institute Publications, vol. 8, 1987.
[6] Q. Liu, M. Nickel, and D. Kiela, "Hyperbolic graph neural networks," CoRR, abs/1910.12892, 2019.
Disclosure of Invention
Aiming at the prior art, the invention provides a hyperbolic space-based network anomaly point detection method to solve the problems that existing network anomaly point detection methods cannot use the hierarchical structure information of the data and have low accuracy. The method mainly learns the node representation of an attribute network through a hyperbolic graph neural network; trains a generative adversarial network for detecting abnormal nodes in the input network embedding, where the generator G aims to generate representations of potential abnormal nodes and the discriminator D attempts to learn a decision boundary that separates potential abnormal data from normal data; and detects abnormal points in the input network embedding with the trained generative adversarial network. The invention is an inductive method and has strong anomaly detection capability on newly added nodes. The method comprises the following specific steps:
Step 1, constructing an attribute network: the attribute network is a static unweighted undirected graph, nodes in the attribute network have attribute features in vector form, and the attribute network is defined as:
G_s = (V_s, E_s, H_s)   (1)
where V_s is the set of nodes, E_s is the set of edges, H_s is the set of node attributes, and H(v) denotes the attribute vector of node v;
Step 2, estimating the hyperbolic geometric curvature parameter δ of the input attribute network. The parameter δ is defined by taking four points x, y, u, v ∈ V_s such that
d(x,y) + d(u,v) \ge d(x,u) + d(y,v) \ge d(x,v) + d(y,u)   (2)
where, in equation (2), d(x,y) denotes the shortest-path distance between the two points x and y; then
\delta(x,y,u,v) = \tfrac{1}{2}\left[(d(x,y) + d(u,v)) - (d(x,u) + d(y,v))\right]   (3)
Three intermediate variables are defined, denoted S1, S2 and S3:
S_1 = d(x,y) + d(u,v)
S_2 = d(x,u) + d(y,v)
S_3 = d(x,v) + d(y,u)
S1, S2 and S3 are sorted, and 1/2 of the absolute value of the difference between the two largest values is taken as the local hyperbolic geometric curvature parameter; the whole attribute network is sampled 1000–10000 times, and the maximum of the local values is taken as the hyperbolic geometric curvature parameter δ;
Step 3, based on the hyperbolic geometric curvature parameter δ estimated in step 2, mapping the input attribute network into a low-dimensional vector representation in hyperbolic space through a hyperbolic graph neural network (HGNN), taken as the output of the HGNN, which comprises the following sub-steps:
3-1) converting the node attribute vector of the attribute network existing in the Euclidean space into a hyperbolic space through exponential mapping;
3-2) extracting the features of the hyperbolic space attribute vector obtained in the step 3-1) through hyperbolic transformation;
3-3) performing hyperbolic neighborhood aggregation operation on the features of the hyperbolic space attribute vector obtained in the step 3-2) and the input topological structure of the attribute network;
3-4) taking the result obtained by the aggregation operation in step 3-3) as the hyperbolic-space attribute vector of the attribute network, and repeating the hyperbolic neighborhood aggregation operation once more according to steps 3-2) and 3-3); the result of the final aggregation operation is the low-dimensional representation of the network in hyperbolic space;
Step 4, using the hyperbolic graph neural network as the encoder part of an autoencoder and training the autoencoder so as to update the parameters of the hyperbolic graph neural network;
Step 5, based on the hyperbolic geometric curvature parameter δ estimated in step 2, mapping the input attribute network into a low-dimensional vector representation in hyperbolic space through the hyperbolic graph neural network (HGNN) trained in step 4, taken as the output of the trained HGNN;
Step 6, training a generative adversarial network using the low-dimensional network representation in hyperbolic space obtained in step 5;
Step 7, inputting the low-dimensional vector representation in hyperbolic space obtained in step 5 into the discriminator D of the generative adversarial network trained in step 6, calculating the anomaly scores of all nodes from the output of the discriminator D, and detecting the abnormal nodes of the network according to these anomaly scores.
Further, the method for detecting network outliers based on hyperbolic space of the present invention comprises:
In step 3-1), the node attribute vector of the attribute network in Euclidean space, h^{k,E} ∈ R^k, is mapped to the node attribute vector of the attribute network in hyperbolic space, h^{k,β} ∈ H^{k,β}, as follows. First, a 0 element is prepended as the first dimension of the original node attribute vector so that it satisfies the tangent-space constraint
\langle (0, h^{k,E}), o \rangle_{\mathcal{L}} = 0   (4)
where o = (\sqrt{\beta}, 0, \ldots, 0) is the origin of the hyperbolic space and \langle \cdot,\cdot \rangle_{\mathcal{L}} is the Lorentz inner product. Then the input node attribute vector is converted into hyperbolic space through the exponential map:
h^{k,\beta} = \exp_o^{\beta}\big((0, h^{k,E})\big)
and the input node attribute vector h^{k,E} is defined to lie in the tangent space at the origin o of the hyperbolic space H^{k,β}, denoted T_o H^{k,β}.
The specific process of step 3-2) is as follows: first, the node attribute vector in hyperbolic space is mapped into the corresponding tangent space of the hyperbolic space through the logarithmic map:
\log_o^{\beta}(h^{m,\beta}) \in T_o H^{m,\beta}
Then the neural-network linear-layer mapping M is applied in the tangent space:
M \log_o^{\beta}(h^{m,\beta})
Finally, the transformed features are mapped back into hyperbolic space through the exponential map; the above process is expressed as:
h^{n,\beta} = \exp_o^{\beta}\big(M \log_o^{\beta}(h^{m,\beta})\big)   (5)
In equation (5), h^{n,\beta} ∈ H^{n,β} lies in hyperbolic space, and the matrix M has size (m + 1) × (n + 1) so as to satisfy the tangent-space constraint described by equation (4).
The specific process of step 3-3) is as follows: according to the distances between the nodes of the input attribute network in hyperbolic space, the attribute information of each node's neighbors is aggregated to the central node to obtain new node feature information. For a node feature h_i^{d,β} ∈ H^{d,β}, the neighborhood node set of node i is N(i), and for each node j in N(i) there is a corresponding aggregation weight w_ij based on the hyperbolic distance; the aggregation result c^{d,β} is derived from the following equation:
c_i^{d,\beta} = \exp_o^{\beta}\Big(\sum_{j \in N(i)} w_{ij}\, \log_o^{\beta}(h_j^{d,\beta})\Big)
The squared distance between two points in hyperbolic space is defined as:
d^{\beta}(x_i, x_j)^2 = \beta \left[\operatorname{arcosh}\!\left(-\langle x_i, x_j\rangle_{\mathcal{L}} / \beta\right)\right]^2
For a central node i and a neighbor node j there is an aggregation weight w_ij, which is learned using a self-attention mechanism. For two node features h_i^{d,β}, h_j^{d,β}, μ_ij denotes the attention coefficient representing the importance of node i to node j, expressed as:
\mu_{ij} = \mathrm{ATT}\big(h_i^{d,\beta}, h_j^{d,\beta}\big)   (8)
In equation (8), ATT(·) denotes the function used to compute the attention coefficient; the similarity of nodes i and j is positively correlated with the attention coefficient, and ATT(·) is defined based on the squared hyperbolic distance, ATT(h_i^{d,β}, h_j^{d,β}) = -d^{\beta}(h_i^{d,\beta}, h_j^{d,\beta})^2. For node i and all of its neighbors N(i), normalization is performed using the Softmax function to compute the aggregation weight w_ij:
w_{ij} = \operatorname{softmax}_{j \in N(i)}(\mu_{ij}) = \frac{\exp(\mu_{ij})}{\sum_{k \in N(i)} \exp(\mu_{ik})}
The specific process of step 4 is as follows: first, the attribute network G_s = (V_s, E_s, H_s) is taken as the input of the graph convolutional autoencoder, and the encoder is used to learn the latent representation Z:
Z^{(l+1)} = f(Z^{(l)}, E \mid W^{(l)})   (11)
In equation (11), Z^{(l)} is the input of the convolution and Z^{(l+1)} is its output; Z^{(0)} = H_s ∈ R^{n×m}, with n nodes and m-dimensional feature vectors; W^{(l)} is the parameter matrix learned in the neural network. Then a decoder is used to reconstruct the attribute network G_s from the latent representation Z; the decoder reconstructs the graph structure E_s of the attribute network G_s by predicting the probability of a connection between two adjacent nodes i, j:
\hat{A}_{ij} = p(\hat{E}_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}(z_i^{\top} z_j)
where z_i, z_j are the latent representations of nodes i, j and sigmoid is the nonlinear activation function. The predicted probability \hat{A}_{ij} is compared with the true connections to train the decoder; the autoencoder is trained with a loss function so as to update the parameters of the hyperbolic graph neural network, the loss function being the reconstruction error between the predicted connection probabilities \hat{A} and the true graph structure E_s.
the process of step 6 is as follows: training a generation countermeasure network by using the low-dimensional vector representation in the hyperbolic space obtained in the step 5, wherein the generation countermeasure network consists of a generator G and a discriminator D; taking noise sampled from the prior normal distribution pz as the input of the generator G to generate information potential abnormity; and representing the output of the generator G and the low-dimensional vector in the hyperbolic space as the input of the discriminator D, wherein the output of the discriminator D is the abnormal score of each input node.
In step 6, the generative adversarial network is trained by minimizing the total loss:
\min_G \max_D V(D,G) = \mathbb{E}_{z \sim p_{\mathrm{data}}}[\log D(z)] + \mathbb{E}_{\tilde z \sim p_z}[\log(1 - D(G(\tilde z)))]   (15)
In equation (15), D denotes the discriminator and G denotes the generator, and the two parts are trained alternately. First, the parameters of the generator G are fixed and the parameters θ_D of the discriminator D are updated by ascending the stochastic gradient according to equation (16):
\nabla_{\theta_D} \frac{1}{N} \sum_{i=1}^{N} \big[\log D(z_i) + \log(1 - D(G(\tilde z_i)))\big]   (16)
Then the parameters of the discriminator D are fixed and the parameters θ_G of the generator G are updated by descending the stochastic gradient according to equation (17):
\nabla_{\theta_G} \frac{1}{N} \sum_{i=1}^{N} \log(1 - D(G(\tilde z_i)))   (17)
In step 7, the anomaly score of node i is calculated from the output of the discriminator D by equation (18):
\mathrm{score}(x'_i) = 1 - D(z'_i)   (18)
compared with the prior art, the invention has the beneficial effects that:
the method for detecting the network abnormal points based on the hyperbolic space comprises the steps of using a hyperbolic neural network (HGNN) to aggregate and expand node characteristics of the graph neural network into the hyperbolic space, embedding nodes through the hyperbolic neural network, learning node hierarchical structure information and global structure information which are difficult to obtain by the traditional graph neural network, effectively fusing the node characteristics and the hierarchical structure, obtaining high-level node representation of a graph, using rich hierarchical information to perform abnormal detection in the hyperbolic space, and using a generation countermeasure network in the hyperbolic space to distinguish positive and negative samples so as to complete an abnormal detection task. In the invention, the nodes in the network are embedded into the hyperbolic space, the hierarchical structure information of the network is reserved, two points far away from each other in the topological network are still far away from each other in the embedding space, and thus the problem of embedding distortion in the Euclidean space is solved. The invention obviously improves the detection accuracy and shortens the abnormity detection time.
Drawings
FIG. 1 is a block diagram of the hyperbolic space-based network anomaly detection method of the present invention;
FIG. 2 is a flow chart of the method of the present invention for training the generative adversarial network.
Detailed Description
The design idea of the hyperbolic space-based network anomaly point detection method provided by the invention is to learn node representations of an attribute network using a hyperbolic graph neural network and to train a generative adversarial network for detecting abnormal nodes in the input network embedding, where the generator G is used to generate potential abnormal node representations and the discriminator D is used to separate potential abnormal data from normal data.
The invention will be further described with reference to the following figures and specific examples, which are not intended to limit the invention in any way.
Step 1, constructing an attribute network: the attribute network is a static unweighted undirected graph; the Cora dataset used in this example is an unweighted undirected graph, as shown by the input network in FIG. 1. Nodes in the attribute network have attribute features in vector form, and the attribute network is defined as:
G_s = (V_s, E_s, H_s)   (1)
where V_s is the set of nodes, E_s is the set of edges, H_s is the set of node attributes, and H(v) denotes the attribute vector of node v.
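The following is a minimal, illustrative sketch (not part of the patented method) of constructing such an attribute network G_s = (V_s, E_s, H_s) as an unweighted undirected graph whose nodes carry attribute vectors; the random graph and random attributes stand in for a real dataset such as Cora, and all names are hypothetical.

```python
# Illustrative sketch: build a small unweighted, undirected attribute network
# G_s = (V_s, E_s, H_s) whose nodes carry attribute vectors.
import networkx as nx
import numpy as np

num_nodes, attr_dim = 100, 16

# E_s: undirected, unweighted edges from a random graph generator (placeholder for Cora)
G = nx.erdos_renyi_graph(n=num_nodes, p=0.05, seed=0)

# H_s: one attribute vector H(v) per node v in V_s
H = np.random.rand(num_nodes, attr_dim).astype(np.float32)
for v in G.nodes:
    G.nodes[v]["attr"] = H[v]

print(G.number_of_nodes(), G.number_of_edges(), H.shape)
```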
Step 2, estimating the hyperbolic geometric curvature parameter δ of the input attribute network. The hyperbolic geometric curvature parameter δ (the Gromov δ-hyperbolicity) is defined by taking four points x, y, u, v ∈ V_s such that
d(x,y) + d(u,v) \ge d(x,u) + d(y,v) \ge d(x,v) + d(y,u)   (2)
where d(x,y) denotes the shortest-path distance between the two points x and y; then
\delta(x,y,u,v) = \tfrac{1}{2}\left[(d(x,y) + d(u,v)) - (d(x,u) + d(y,v))\right]   (3)
Three intermediate variables are defined, denoted S1, S2 and S3:
S_1 = d(x,y) + d(u,v)
S_2 = d(x,u) + d(y,v)
S_3 = d(x,v) + d(y,u)
S1, S2 and S3 are sorted, and 1/2 of the absolute value of the difference between the two largest values is taken as the local hyperbolic geometric curvature parameter. The whole attribute network is sampled 1000–10000 times (5000 times in this example), and the maximum of the local values is taken as the hyperbolic geometric curvature parameter δ.
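A minimal sketch of this sampling-based δ estimate, assuming shortest-path distances on the graph as stated above; function and variable names are illustrative, and quadruples that span disconnected components are simply skipped (an implementation detail not specified in the patent).

```python
# Sketch: estimate the Gromov delta-hyperbolicity by sampling quadruples,
# forming the three pairwise distance sums S1, S2, S3, and keeping half of
# the gap between the two largest sums; the maximum over samples approximates delta.
import random
import networkx as nx

def estimate_delta(G: nx.Graph, num_samples: int = 5000, seed: int = 0) -> float:
    rng = random.Random(seed)
    dist = dict(nx.all_pairs_shortest_path_length(G))   # shortest-path distances d(x, y)
    nodes = list(G.nodes)
    delta = 0.0
    for _ in range(num_samples):
        x, y, u, v = rng.sample(nodes, 4)
        try:
            s1 = dist[x][y] + dist[u][v]
            s2 = dist[x][u] + dist[y][v]
            s3 = dist[x][v] + dist[y][u]
        except KeyError:   # quadruple spans disconnected components; skip it
            continue
        s1, s2, s3 = sorted((s1, s2, s3), reverse=True)
        delta = max(delta, (s1 - s2) / 2.0)   # half the gap between the two largest sums
    return delta
```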
Step 3, based on the hyperbolic geometric curvature parameter δ estimated in step 2, the input attribute network is mapped into a low-dimensional vector representation in hyperbolic space through a hyperbolic graph neural network (HGNN), which serves as the output of the HGNN. Specifically:
1) The node attribute vectors originally in Euclidean space are converted into hyperbolic space through the exponential map.
The input features in this example are the node attribute vectors of the attribute network, h^{k,E} ∈ R^k (the notation means that the input vector is k-dimensional and lies in Euclidean space). Each needs to be mapped to a vector in hyperbolic space, h^{k,β} ∈ H^{k,β} (the notation means that the vector is k-dimensional and lies in the hyperbolic space with curvature β). First, a 0 element is prepended as the first dimension of the original node attribute vector so that it satisfies the tangent-space constraint
\langle (0, h^{k,E}), o \rangle_{\mathcal{L}} = 0   (4)
where o = (\sqrt{\beta}, 0, \ldots, 0) is the origin of the hyperbolic space and \langle \cdot,\cdot \rangle_{\mathcal{L}} is the Lorentz inner product; after this, the input features are converted into hyperbolic space through the exponential map:
h^{k,\beta} = \exp_o^{\beta}\big((0, h^{k,E})\big)
and the input node attribute vector h^{k,E} is defined to lie in the tangent space at the origin o of the hyperbolic space H^{k,β}, denoted T_o H^{k,β}.
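A minimal sketch of step 3-1) under the assumption that the Lorentz (hyperboloid) model with curvature parameter β is used: a Euclidean attribute vector is prepended with a 0 (placing it in the tangent space at the origin o = (√β, 0, …, 0)) and pushed onto the manifold with the exponential map at the origin. The explicit formula below follows the standard hyperboloid construction; the patent itself does not spell it out.

```python
# Sketch (assumed Lorentz model): map Euclidean features onto the hyperboloid
# via the exponential map at the origin o = (sqrt(beta), 0, ..., 0).
import torch

def exp_map_at_origin(x_euclidean: torch.Tensor, beta: float) -> torch.Tensor:
    """x_euclidean: (N, k) Euclidean features -> (N, k+1) points on H^{k, beta}."""
    sqrt_beta = beta ** 0.5
    v = torch.cat([torch.zeros_like(x_euclidean[:, :1]), x_euclidean], dim=-1)  # prepend 0
    norm = v[:, 1:].norm(dim=-1, keepdim=True).clamp_min(1e-7)       # ||v|| (spatial part)
    time = sqrt_beta * torch.cosh(norm / sqrt_beta)                  # time-like coordinate
    space = sqrt_beta * torch.sinh(norm / sqrt_beta) * v[:, 1:] / norm
    return torch.cat([time, space], dim=-1)
```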
2) The features of the hyperbolic-space attribute vectors obtained in step 1) are extracted through the hyperbolic transformation.
The hyperbolic transformation process in this example is as follows: first, the node attribute vectors in hyperbolic space are mapped into the corresponding tangent space of the hyperbolic space through the logarithmic map:
\log_o^{\beta}(h^{m,\beta}) \in T_o H^{m,\beta}
then the neural-network linear-layer mapping M is applied in the tangent space:
M \log_o^{\beta}(h^{m,\beta})
followed by an exponential map that maps the transformed features back into hyperbolic space. The whole process can be written as
h^{n,\beta} = \exp_o^{\beta}\big(M \log_o^{\beta}(h^{m,\beta})\big)   (5)
In equation (5), h^{n,\beta} ∈ H^{n,β} lies in hyperbolic space, and the matrix M has size (m + 1) × (n + 1) so as to satisfy the tangent-space constraint described by equation (4).
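A sketch of the hyperbolic feature transform of step 3-2) under the same hyperboloid-model assumption: points are pulled back to the tangent space at the origin with the logarithmic map, an ordinary linear layer is applied there, and the result is mapped back with the exponential map (reusing exp_map_at_origin from the previous sketch). The patent's own parameterisation may differ in details such as bias handling.

```python
# Sketch (assumed Lorentz model): log map -> Euclidean linear layer -> exp map.
import torch
import torch.nn as nn

def log_map_at_origin(x: torch.Tensor, beta: float) -> torch.Tensor:
    """Inverse of exp_map_at_origin: (N, k+1) hyperboloid points -> tangent vectors at o."""
    sqrt_beta = beta ** 0.5
    space = x[:, 1:]
    norm = space.norm(dim=-1, keepdim=True).clamp_min(1e-7)
    # geodesic distance from the origin, recovered from the time coordinate
    dist = sqrt_beta * torch.acosh((x[:, :1] / sqrt_beta).clamp_min(1.0 + 1e-7))
    return torch.cat([torch.zeros_like(x[:, :1]), dist * space / norm], dim=-1)

class HyperbolicLinear(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, beta: float):
        super().__init__()
        self.beta = beta
        # acts on the tangent-space representation, with the leading 0 stripped off
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x_hyp: torch.Tensor) -> torch.Tensor:
        v = log_map_at_origin(x_hyp, self.beta)[:, 1:]   # drop the leading 0 component
        u = self.linear(v)                               # linear map in the tangent space
        return exp_map_at_origin(u, self.beta)           # defined in the previous sketch
```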
3) Based on the features of the hyperbolic-space attribute vectors obtained in step 2) and the topological structure of the input attribute network, the hyperbolic-space embedding of each node is obtained using the hyperbolic neighborhood aggregation operation.
The hyperbolic neighborhood aggregation process in this example is as follows: according to the distances between the nodes of the input attribute network in hyperbolic space, the attribute information of each node's neighbors is aggregated to the central node to obtain new node attribute information. Specifically, for a node feature h_i^{d,β} ∈ H^{d,β}, its neighborhood node set is N(i), and for each node j in N(i) there is a corresponding aggregation weight w_ij based on the hyperbolic distance; the aggregation result c^{d,β} is derived from the following equation:
c_i^{d,\beta} = \exp_o^{\beta}\Big(\sum_{j \in N(i)} w_{ij}\, \log_o^{\beta}(h_j^{d,\beta})\Big)
The squared distance between two points in hyperbolic space is defined as:
d^{\beta}(x_i, x_j)^2 = \beta \left[\operatorname{arcosh}\!\left(-\langle x_i, x_j\rangle_{\mathcal{L}} / \beta\right)\right]^2
For a central node i and a neighbor node j there is an aggregation weight w_ij, which represents the importance of the neighbor to the central node. In the present invention, a self-attention mechanism is used to learn the aggregation weight w_ij. For two node features h_i^{d,β}, h_j^{d,β}, μ_ij denotes the attention coefficient representing the importance of node i to node j, which can be expressed as:
\mu_{ij} = \mathrm{ATT}\big(h_i^{d,\beta}, h_j^{d,\beta}\big)   (8)
where ATT(·) denotes the function used to compute the attention coefficient. The similarity of nodes i and j is positively correlated with the attention coefficient: the larger the attention coefficient, the higher the similarity of nodes i and j. ATT(·) is defined based on the squared hyperbolic distance, ATT(h_i^{d,β}, h_j^{d,β}) = -d^{\beta}(h_i^{d,\beta}, h_j^{d,\beta})^2. For all neighbors N(i) of node i (including the node itself), normalization is performed using the Softmax function to compute the aggregation weight w_ij:
w_{ij} = \operatorname{softmax}_{j \in N(i)}(\mu_{ij}) = \frac{\exp(\mu_{ij})}{\sum_{k \in N(i)} \exp(\mu_{ik})}
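A sketch of the neighborhood aggregation of step 3-3), assuming the attention score is the negative squared hyperbolic distance (consistent with "similarity positively correlated with the attention coefficient") and that the weighted combination is formed in the tangent space at the origin before mapping back to the manifold; both choices are assumptions, and the helper functions from the previous sketches are reused.

```python
# Sketch: distance-based attention aggregation over hyperbolic node embeddings.
import torch

def lorentz_inner(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Lorentz inner product <x, y>_L = -x0*y0 + sum_i xi*yi (last dim)."""
    prod = x * y
    return -prod[..., :1] + prod[..., 1:].sum(dim=-1, keepdim=True)

def squared_hyperbolic_distance(x, y, beta: float) -> torch.Tensor:
    arg = (-lorentz_inner(x, y) / beta).clamp_min(1.0 + 1e-7)
    return beta * torch.acosh(arg) ** 2

def aggregate(h_hyp: torch.Tensor, adj: torch.Tensor, beta: float) -> torch.Tensor:
    """h_hyp: (N, d+1) hyperboloid embeddings, adj: (N, N) 0/1 adjacency with self-loops."""
    d2 = squared_hyperbolic_distance(h_hyp.unsqueeze(1), h_hyp.unsqueeze(0), beta).squeeze(-1)
    scores = (-d2).masked_fill(adj == 0, float("-inf"))   # attend only over N(i)
    w = torch.softmax(scores, dim=-1)                     # aggregation weights w_ij
    # combine neighbours in the tangent space at the origin, then map back
    tangent = log_map_at_origin(h_hyp, beta)              # from the earlier sketches
    return exp_map_at_origin((w @ tangent)[:, 1:], beta)
```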
4) The result of the aggregation operation in step 3) is taken as the hyperbolic-space attribute vector of the attribute network, and the hyperbolic neighborhood aggregation operation is repeated once more according to steps 2) and 3); the result of the final aggregation operation is the final low-dimensional representation of the network in hyperbolic space.
Step 4, the hyperbolic graph neural network is used as the encoder part of an autoencoder, and the autoencoder is trained so as to update the parameters of the hyperbolic graph neural network. The process is as follows:
First, the attribute network G_s = (V_s, E_s, H_s) is taken as the input of the graph convolutional autoencoder, and the encoder part learns the latent representation Z:
Z^{(l+1)} = f(Z^{(l)}, E \mid W^{(l)})   (11)
In equation (11), Z^{(l)} is the input of the convolution and Z^{(l+1)} is its output; Z^{(0)} = H_s ∈ R^{n×m}, with n nodes and m-dimensional feature vectors; W^{(l)} is the parameter matrix learned in the neural network.
Then a decoder is used to reconstruct the attribute network G_s from the latent representation Z. The decoder reconstructs the graph structure E_s of the attribute network G_s by predicting the probability of a connection between two adjacent nodes i, j:
\hat{A}_{ij} = p(\hat{E}_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}(z_i^{\top} z_j)
where z_i, z_j are the latent representations of nodes i, j and sigmoid is the nonlinear activation function. The predicted probability \hat{A}_{ij} is compared with the true connections to train the decoder, and the autoencoder is trained with a loss function so as to update the parameters of the hyperbolic graph neural network; the loss function is the reconstruction error between the predicted connection probabilities \hat{A} and the true graph structure E_s.
the self-encoder in the present invention is shown as encoder, potential representation and decoder in fig. 1.
Step 5, based on the hyperbolic geometric curvature parameter δ estimated in step 2, the input attribute network is mapped into a low-dimensional vector representation in hyperbolic space through the hyperbolic graph neural network (HGNN) trained in step 4, which is taken as the output of the trained HGNN. The process of this step is essentially the same as step 3; the difference is that the original hyperbolic graph neural network is replaced by the hyperbolic graph neural network trained in step 4.
Step 6, a generative adversarial network is trained using the low-dimensional network representation in hyperbolic space obtained in step 5. The generative adversarial network consists of a generator G and a discriminator D. Noise sampled from the prior normal distribution p_z is used as the input of the generator G to generate potential anomaly representations; the output of the generator G and the low-dimensional vector representation in hyperbolic space are taken as the inputs of the discriminator D, whose output is the anomaly score of each input node.
The generative adversarial network is trained by minimizing the total loss:
\min_G \max_D V(D,G) = \mathbb{E}_{z \sim p_{\mathrm{data}}}[\log D(z)] + \mathbb{E}_{\tilde z \sim p_z}[\log(1 - D(G(\tilde z)))]   (15)
In equation (15), D denotes the discriminator and G denotes the generator, and the two parts are trained alternately. First, the parameters of the generator G are fixed and the parameters θ_D of the discriminator D are updated by ascending the stochastic gradient according to equation (16):
\nabla_{\theta_D} \frac{1}{N} \sum_{i=1}^{N} \big[\log D(z_i) + \log(1 - D(G(\tilde z_i)))\big]   (16)
Then the parameters of the discriminator D are fixed and the parameters θ_G of the generator G are updated by descending the stochastic gradient according to equation (17):
\nabla_{\theta_G} \frac{1}{N} \sum_{i=1}^{N} \log(1 - D(G(\tilde z_i)))   (17)
The generative adversarial network is shown as the generator and the discriminator in FIG. 1, and the training steps are shown in FIG. 2.
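A sketch of the alternating adversarial training of step 6. The discriminator is updated on real hyperbolic embeddings versus samples that the generator produces from prior Gaussian noise, then the generator is updated with the discriminator fixed; the non-saturating generator loss is used here in place of the literal form of equation (17), the discriminator is assumed to end in a sigmoid, and architectures, optimizers and hyper-parameters are placeholders.

```python
# Sketch: alternating GAN training on real hyperbolic embeddings z_real.
import torch
import torch.nn as nn

def train_gan(z_real: torch.Tensor, G: nn.Module, D: nn.Module,
              epochs: int = 100, noise_dim: int = 32, lr: float = 1e-3):
    bce = nn.BCELoss()                                   # assumes D outputs probabilities
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    n = z_real.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)
    for _ in range(epochs):
        # 1) update D with G fixed (real -> 1, generated -> 0), cf. equation (16)
        z_fake = G(torch.randn(n, noise_dim)).detach()
        loss_d = bce(D(z_real), ones) + bce(D(z_fake), zeros)
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # 2) update G with D fixed (non-saturating variant of equation (17))
        z_fake = G(torch.randn(n, noise_dim))
        loss_g = bce(D(z_fake), ones)
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G, D
```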
Step 7, after the training stage is finished, the low-dimensional vector representation in hyperbolic space obtained in step 5 is input into the discriminator D of the generative adversarial network trained in step 6, the anomaly scores of all nodes are calculated from the output of the discriminator D according to equation (18), and the abnormal nodes of the network are detected according to these anomaly scores.
\mathrm{score}(x'_i) = 1 - D(z'_i)   (18)
In addition, the present invention is able to process data outside of the training set without retraining the model. To obtain the anomaly scores of such nodes, the parameters of the previously trained model are retained and the new (sub)network G' = (A', X') is fed directly into it. The invention learns the embedded representation of each newly added node in a feed-forward manner; the anomaly score of node i can likewise be calculated from the output of the discriminator D, and abnormal nodes are detected accordingly, as shown by the discriminator and the anomaly score list in FIG. 1.
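A sketch of the scoring step: embeddings of (possibly newly added) nodes are passed through the trained discriminator, the anomaly score of node i is 1 − D(z_i) as in equation (18), and the highest-scoring nodes are reported; the discriminator is assumed to output probabilities in [0, 1], and names are illustrative.

```python
# Sketch: compute anomaly scores from the trained discriminator and rank nodes.
import torch

@torch.no_grad()
def anomaly_scores(z: torch.Tensor, D: torch.nn.Module) -> torch.Tensor:
    return 1.0 - D(z).squeeze(-1)   # equation (18): score(x_i) = 1 - D(z_i)

# ranking: nodes with the largest scores are flagged as anomalous
#   scores = anomaly_scores(z, D)
#   top_k = torch.topk(scores, k=10).indices
```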
While the present invention has been described with reference to the accompanying drawings, it is not limited to the above-described embodiments, which are illustrative rather than restrictive; those skilled in the art may make various modifications without departing from the spirit of the present invention, and such modifications are intended to be covered by the claims of the present invention.

Claims (8)

1. A network anomaly point detection method based on hyperbolic space, characterized in that node representations of an attribute network are learned through a hyperbolic graph neural network; a generative adversarial network is trained for detecting abnormal nodes in the input network embedding, wherein in the generative adversarial network a generator G is used to generate potential abnormal node representations and a discriminator D is used to separate potential abnormal data from normal data; the method comprises the following specific steps:
Step 1, constructing an attribute network
The attribute network is a static unweighted undirected graph, nodes in the attribute network have attribute features in vector form, and the attribute network is defined as:
G_s = (V_s, E_s, H_s)   (1)
where V_s is the set of nodes, E_s is the set of edges, H_s is the set of node attributes, and H(v) denotes the attribute vector of node v;
Step 2, estimating the hyperbolic geometric curvature parameter δ of the input attribute network
The hyperbolic geometric curvature parameter δ is defined by taking four points x, y, u, v ∈ V_s such that
d(x,y) + d(u,v) \ge d(x,u) + d(y,v) \ge d(x,v) + d(y,u)   (2)
where, in equation (2), d(x,y) denotes the shortest-path distance between the two points x and y; then
\delta(x,y,u,v) = \tfrac{1}{2}\left[(d(x,y) + d(u,v)) - (d(x,u) + d(y,v))\right]   (3)
Three intermediate variables are defined, denoted S1, S2 and S3:
S_1 = d(x,y) + d(u,v)
S_2 = d(x,u) + d(y,v)
S_3 = d(x,v) + d(y,u)
S1, S2 and S3 are sorted, and 1/2 of the absolute value of the difference between the two largest values is taken as the local hyperbolic geometric curvature parameter; the whole attribute network is sampled 1000–10000 times, and the maximum of the local values is taken as the hyperbolic geometric curvature parameter δ;
Step 3, based on the hyperbolic geometric curvature parameter δ estimated in step 2, mapping the input attribute network into a low-dimensional vector representation in hyperbolic space through a hyperbolic graph neural network (HGNN), taken as the output of the HGNN, which comprises the following sub-steps:
3-1) converting the node attribute vector of the attribute network existing in the Euclidean space into a hyperbolic space through exponential mapping;
3-2) extracting the features of the hyperbolic space attribute vector obtained in the step 3-1) through hyperbolic transformation;
3-3) performing hyperbolic neighborhood aggregation operation on the features of the hyperbolic space attribute vector obtained in the step 3-2) and the input topological structure of the attribute network;
3-4) taking the result obtained by the aggregation operation in step 3-3) as the hyperbolic-space attribute vector of the attribute network, and repeating the hyperbolic neighborhood aggregation operation once more according to steps 3-2) and 3-3); the result of the final aggregation operation is the low-dimensional representation of the network in hyperbolic space;
Step 4, using the hyperbolic graph neural network as the encoder part of an autoencoder and training the autoencoder so as to update the parameters of the hyperbolic graph neural network;
Step 5, based on the hyperbolic geometric curvature parameter δ estimated in step 2, mapping the input attribute network into a low-dimensional vector representation in hyperbolic space through the hyperbolic graph neural network (HGNN) trained in step 4, taken as the output of the trained HGNN;
Step 6, training a generative adversarial network using the low-dimensional network representation in hyperbolic space obtained in step 5;
Step 7, inputting the low-dimensional vector representation in hyperbolic space obtained in step 5 into the discriminator D of the generative adversarial network trained in step 6, calculating the anomaly scores of all nodes from the output of the discriminator D, and detecting the abnormal nodes of the network according to these anomaly scores.
2. The hyperbolic space-based network anomaly point detection method according to claim 1, wherein in step 3-1) the node attribute vector of the attribute network in Euclidean space, h^{k,E} ∈ R^k, is mapped to the node attribute vector of the attribute network in hyperbolic space, h^{k,β} ∈ H^{k,β}, as follows:
first, a 0 element is prepended as the first dimension of the original node attribute vector so that it satisfies the tangent-space constraint
\langle (0, h^{k,E}), o \rangle_{\mathcal{L}} = 0   (4)
where o = (\sqrt{\beta}, 0, \ldots, 0) is the origin of the hyperbolic space and \langle \cdot,\cdot \rangle_{\mathcal{L}} is the Lorentz inner product; then the input node attribute vector is converted into hyperbolic space through the exponential map:
h^{k,\beta} = \exp_o^{\beta}\big((0, h^{k,E})\big)
and the input node attribute vector h^{k,E} is defined to lie in the tangent space at the origin o of the hyperbolic space H^{k,β}, denoted T_o H^{k,β}.
3. The hyperbolic space-based network anomaly point detection method according to claim 2, wherein the specific process of step 3-2) is:
first, the node attribute vector in hyperbolic space is mapped into the corresponding tangent space of the hyperbolic space through the logarithmic map:
\log_o^{\beta}(h^{m,\beta}) \in T_o H^{m,\beta}
then the neural-network linear-layer mapping M is applied in the tangent space:
M \log_o^{\beta}(h^{m,\beta})
finally, the transformed features are mapped back into hyperbolic space through the exponential map; the above process is expressed as:
h^{n,\beta} = \exp_o^{\beta}\big(M \log_o^{\beta}(h^{m,\beta})\big)   (5)
In equation (5), h^{n,\beta} ∈ H^{n,β} lies in hyperbolic space, and the matrix M has size (m + 1) × (n + 1) so as to satisfy the tangent-space constraint described by equation (4).
4. The hyperbolic space-based network outlier detection method according to claim 3, wherein the specific process of step 3-3) is:
according to the distances between the nodes of the input attribute network in hyperbolic space, the attribute information of each node's neighbors is aggregated to the central node to obtain new node feature information; for a node feature h_i^{d,β} ∈ H^{d,β}, the neighborhood node set of node i is N(i), and for each node j in N(i) there is a corresponding aggregation weight w_ij based on the hyperbolic distance; the aggregation result c^{d,β} is derived from the following equation:
c_i^{d,\beta} = \exp_o^{\beta}\Big(\sum_{j \in N(i)} w_{ij}\, \log_o^{\beta}(h_j^{d,\beta})\Big)
the squared distance between two points in hyperbolic space is defined as:
d^{\beta}(x_i, x_j)^2 = \beta \left[\operatorname{arcosh}\!\left(-\langle x_i, x_j\rangle_{\mathcal{L}} / \beta\right)\right]^2
for a central node i and a neighbor node j there is an aggregation weight w_ij, which is learned using a self-attention mechanism; for two node features h_i^{d,β}, h_j^{d,β}, μ_ij denotes the attention coefficient representing the importance of node i to node j, expressed as:
\mu_{ij} = \mathrm{ATT}\big(h_i^{d,\beta}, h_j^{d,\beta}\big)   (8)
in equation (8), ATT(·) denotes the function used to compute the attention coefficient; the similarity of nodes i and j is positively correlated with the attention coefficient, and ATT(·) is defined based on the squared hyperbolic distance, ATT(h_i^{d,β}, h_j^{d,β}) = -d^{\beta}(h_i^{d,\beta}, h_j^{d,\beta})^2; for node i and all of its neighbors N(i), normalization is performed using the Softmax function to compute the aggregation weight w_ij:
w_{ij} = \operatorname{softmax}_{j \in N(i)}(\mu_{ij}) = \frac{\exp(\mu_{ij})}{\sum_{k \in N(i)} \exp(\mu_{ik})}
5. The hyperbolic space-based network anomaly point detection method according to claim 1, wherein the specific process of step 4 is as follows:
first, the attribute network G_s = (V_s, E_s, H_s) is taken as the input of the graph convolutional autoencoder, and the encoder is used to learn the latent representation Z:
Z^{(l+1)} = f(Z^{(l)}, E \mid W^{(l)})   (11)
in equation (11), Z^{(l)} is the input of the convolution and Z^{(l+1)} is its output; Z^{(0)} = H_s ∈ R^{n×m}, with n nodes and m-dimensional feature vectors; W^{(l)} is the parameter matrix learned in the neural network;
then a decoder is used to reconstruct the attribute network G_s from the latent representation Z; the decoder reconstructs the graph structure E_s of the attribute network G_s by predicting the probability of a connection between two adjacent nodes i, j:
\hat{A}_{ij} = p(\hat{E}_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}(z_i^{\top} z_j)
where z_i, z_j are the latent representations of nodes i, j and sigmoid is the nonlinear activation function; the predicted probability \hat{A}_{ij} is compared with the true connections to train the decoder; the autoencoder is trained with a loss function so as to update the parameters of the hyperbolic graph neural network, the loss function being the reconstruction error between the predicted connection probabilities \hat{A} and the true graph structure E_s.
6. The hyperbolic space-based network anomaly point detection method according to claim 1, wherein the process of step 6 is as follows:
a generative adversarial network is trained using the low-dimensional vector representation in hyperbolic space obtained in step 5, the generative adversarial network consisting of a generator G and a discriminator D; noise sampled from the prior normal distribution p_z is used as the input of the generator G to generate potential anomaly representations; the output of the generator G and the low-dimensional vector representation in hyperbolic space are taken as the inputs of the discriminator D, whose output is the anomaly score of each input node.
7. The hyperbolic space-based network outlier detection method according to claim 6, wherein in step 6 the generative adversarial network is trained by minimizing the total loss:
\min_G \max_D V(D,G) = \mathbb{E}_{z \sim p_{\mathrm{data}}}[\log D(z)] + \mathbb{E}_{\tilde z \sim p_z}[\log(1 - D(G(\tilde z)))]   (15)
in equation (15), D denotes the discriminator and G denotes the generator, and the alternate training is performed according to the following procedure:
first, the parameters of the generator G are fixed, and the parameters θ_D of the discriminator D are updated by ascending the stochastic gradient according to equation (16):
\nabla_{\theta_D} \frac{1}{N} \sum_{i=1}^{N} \big[\log D(z_i) + \log(1 - D(G(\tilde z_i)))\big]   (16)
then the parameters of the discriminator D are fixed, and the parameters θ_G of the generator G are updated by descending the stochastic gradient according to equation (17):
\nabla_{\theta_G} \frac{1}{N} \sum_{i=1}^{N} \log(1 - D(G(\tilde z_i)))   (17)
8. The hyperbolic space-based network anomaly point detection method according to claim 7, wherein in step 7 the anomaly score of node i is calculated from the output of the discriminator D using equation (18):
\mathrm{score}(x'_i) = 1 - D(z'_i)   (18).
CN202211404826.5A 2022-11-10 2022-11-10 Network abnormal point detection method based on hyperbolic space Pending CN115664970A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211404826.5A | 2022-11-10 | 2022-11-10 | Network abnormal point detection method based on hyperbolic space (CN115664970A)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211404826.5A | 2022-11-10 | 2022-11-10 | Network abnormal point detection method based on hyperbolic space (CN115664970A)

Publications (1)

Publication Number | Publication Date
CN115664970A | 2023-01-31

Family

ID=85020672

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211404826.5A (CN115664970A, pending) | Network abnormal point detection method based on hyperbolic space | 2022-11-10 | 2022-11-10

Country Status (1)

Country Link
CN (1) CN115664970A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237141A (en) * 2023-11-16 2023-12-15 长春大学 Community detection method of hyperbolic graph convolution network based on self-adaptive curvature


Similar Documents

Publication Publication Date Title
CN110084296B (en) Graph representation learning framework based on specific semantics and multi-label classification method thereof
CN109858390B (en) Human skeleton behavior identification method based on end-to-end space-time diagram learning neural network
CN109934261B (en) Knowledge-driven parameter propagation model and few-sample learning method thereof
Gao et al. SAR image change detection based on multiscale capsule network
CN110490128B (en) Handwriting recognition method based on encryption neural network
CN114332649B (en) Cross-scene remote sensing image depth countermeasure migration method based on double-channel attention
Du et al. GAN-based anomaly detection for multivariate time series using polluted training set
CN114528755A (en) Power equipment fault detection model based on attention mechanism combined with GRU
CN111291827A (en) Image clustering method, device, equipment and storage medium
CN116628212B (en) Uncertainty knowledge graph modeling method oriented to national economy and social development investigation field
CN114863091A (en) Target detection training method based on pseudo label
CN111178543B (en) Probability domain generalization learning method based on meta learning
Terefe et al. Time series averaging using multi-tasking autoencoder
Wang et al. Deep generative mixture model for robust imbalance classification
CN115664970A (en) Network abnormal point detection method based on hyperbolic space
CN117524353B (en) Molecular large model based on multidimensional molecular information, construction method and application
CN114897085A (en) Clustering method based on closed subgraph link prediction and computer equipment
CN112529025A (en) Data processing method and device
Moholkar et al. Lioness adapted GWO-based deep belief network enabled with multiple features for a novel question answering system
CN116434347B (en) Skeleton sequence identification method and system based on mask pattern self-encoder
CN112529057A (en) Graph similarity calculation method and device based on graph convolution network
CN115661539A (en) Less-sample image identification method embedded with uncertainty information
CN110647917A (en) Model multiplexing method and system
CN116246102A (en) Image classification method and system based on self-encoder and decision tree
Zha et al. Recognizing plans by learning embeddings from observed action distributions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination