CN114970736A

CN114970736A - Network node depth anomaly detection method based on density estimation

Info

Publication number: CN114970736A
Application number: CN202210651604.7A
Authority: CN
Inventors: 尹美娟; 段顺然; 刘粉林; 焦隆隆; 于岚岚
Original assignee: Information Engineering University of PLA Strategic Support Force
Current assignee: Information Engineering University of PLA Strategic Support Force
Priority date: 2022-06-10
Filing date: 2022-06-10
Publication date: 2022-08-30

Abstract

The invention belongs to the technical field of communication network detection, and in particular relates to a deep anomaly detection method for network nodes based on density estimation. The method comprises: obtaining the node attribute matrix and the structure matrix of an attribute network, feeding both matrices as input to a graph convolutional neural network, and using that network to obtain joint embedding vectors of the node structure information and attribute information in the attribute network; reconstructing, from the joint embedding vectors, the node attribute matrix and the structure matrix that were input to the graph convolutional neural network, and obtaining the node reconstruction errors; and detecting node anomalies in the attribute network by applying a probability distribution model to the joint embedding vectors and the node reconstruction errors. Following the idea of detecting anomalous nodes through reconstruction error, the method reconstructs the structure information and the attribute information of the attribute network separately and uses density estimation over each node's reconstruction error and embedding vector to detect anomalous nodes; it is more robust and is convenient to apply in practical scenarios.

Description

Network node depth anomaly detection method based on density estimation

Technical Field

The invention belongs to the technical field of communication network detection, and particularly relates to a network node depth anomaly detection method based on density estimation.

Background

Attribute networks are found in many real-world domains, from academic networks to social networks and from protein interaction networks to healthcare systems. A plain network can only represent the dependencies between nodes, whereas an attribute network describes those dependencies and also carries rich feature information on each node. For example, in an academic network, an attribute network can reflect the collaboration relationships among scholars and also describe each scholar's research direction, number of publications, citation record and similar information; in a social network, it can describe both the friendship or communication relationships between users and the users' feature information, such as their interests, posting activity and likes. The information carried by the nodes plays an important role in knowledge mining on network data. Anomaly detection on network data is an active problem in network science with wide application in practice, such as intrusion detection in cyberspace security, abnormal account detection on social networks, and fraud detection in finance. In practical anomaly detection, both the dependencies between network nodes and the feature information of the nodes must be considered, and how to process this multi-modal information effectively is the key to anomaly detection on attribute networks.
Attribute networks are well suited to modeling data that contains information of different modalities, so in recent years many methods for anomaly detection on attribute network data have emerged. An anomalous node in an attribute network can be described as a node whose pattern deviates significantly from that of most other nodes, and this deviation shows up mainly in two ways: first, the connection structure around the node is abnormal; second, the attributes of the node are abnormal.

According to the information considered when detecting anomalies, existing anomaly detection methods can be divided into three categories: anomaly detection based on node features, anomaly detection based on ego networks or community partitioning, and anomaly detection based on network embedding. Methods based on node features must first select key features that can accurately expose anomalous nodes and then perform anomaly detection with statistical methods. Methods based on ego networks or community partitioning use, for each node, a specific function model to characterize the dependency between some aspect of the node's features and the features of the other nodes in its ego network or community; if the dependency function of a node deviates markedly from those of most other nodes, the node is regarded as anomalous. Methods based on network embedding first embed the network nodes into a low-dimensional vector space with a specific network embedding method, so that each node corresponds to a point in that space, called the node's embedding vector, and then perform anomaly detection on the embedding vectors.

Although existing methods work well under specific conditions, they still face the following challenges when processing attribute network data. (1) Data sparsity. Sparsity arises mainly for two reasons. First, the data itself is sparse: as the well-known "Dunbar's number" in social network research indicates, "generally, a human individual can maintain a stable social relationship with 150 others at most", so social networks inevitably become sparse as they grow. Second, data acquisition is incomplete: because of privacy protection and the cost of data collection, it is difficult to guarantee that all data of a given network is obtained. The first two categories of methods depend heavily on the data, so sparse network data poses a great challenge to them. (2) Complex modal interaction. The connections between nodes in an attribute network and the feature information of the nodes usually come from different information sources, yet they are not independent of one another: the feature information of a node and the topology around it influence each other to some extent. How to characterize the interaction between these different kinds of information and integrate them into the same feature space is a difficulty in processing attribute networks. (3) Confounding of structural and attribute anomalies. In practice, structural anomalies and attribute anomalies are often not equivalent: a node with a structural anomaly does not necessarily have anomalous attribute features, and vice versa. If attribute anomalies and structural anomalies can be detected separately, the classification of anomalies can be refined; for nodes with structural anomalies the interference of attribute features with the anomaly judgment is reduced, and for nodes with attribute anomalies the interference of structural features is reduced.

Disclosure of Invention

To this end, the invention provides a deep anomaly detection method for network nodes based on density estimation. Following the idea of detecting anomalous nodes through reconstruction error, it reconstructs the structure information and the attribute information of an attribute network separately and uses density estimation over each node's reconstruction error and its own embedding vector to detect anomalous nodes; it is more robust and is convenient to apply in practical scenarios.

According to the design scheme provided by the invention, a deep anomaly detection method for network nodes based on density estimation is provided, which comprises the following:

obtaining the node attribute matrix and the structure matrix of an attribute network, feeding the attribute matrix and the structure matrix as input to a graph convolutional neural network, and using the graph convolutional neural network to obtain joint embedding vectors of the node structure information and attribute information in the attribute network;

reconstructing, from the joint embedding vectors, the node attribute matrix and the structure matrix that were input to the graph convolutional neural network, and obtaining the node reconstruction errors;

and detecting node anomalies in the attribute network by applying a probability distribution model to the joint embedding vectors and the node reconstruction errors.

Further, in the density-estimation-based deep anomaly detection method for network nodes, the attribute network is represented as $G = \langle V, E, X \rangle$, where V denotes the node set of the attribute network G, E denotes the edge set of the attribute network, X denotes the node attribute matrix of the attribute network, and A denotes the structure matrix of the attribute network.

Further, in the density-estimation-based deep anomaly detection method for network nodes, when the joint embedding vectors are obtained, node embedding vectors containing neighbor information are obtained by iterative computation with a graph convolutional neural network formed by stacking k neural network layers.

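Further, in the density-estimation-based deep anomaly detection method for network nodes, a graph convolutional neural network with three stacked neural network layers is adopted, and the iterative computation is expressed as follows:

$$H^{(1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X W^{(0)}\right)$$

$$H^{(2)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(1)} W^{(1)}\right)$$

$$Z = H^{(3)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(2)} W^{(2)}\right)$$

where $\tilde{A} = A + I$ is the structure matrix plus the identity matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $\sigma(\cdot)$ is a nonlinear activation function, and $W^{(0)} \in \mathbb{R}^{h_0 \times h_1}$, in which $h_0$ is the feature dimension of the attribute matrix X and $h_1$ is the feature dimension of the hidden layer;

$H^{(i)}$ is the node feature matrix computed by the i-th layer, $i \in \{1, 2, 3\}$; Z is the node embedding vector finally obtained; $h_3$ is the dimension of the node embedding vector; and the feature vectors in $H^{(3)}$ contain the third-order neighbor information of the node.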

Further, in the density-estimation-based deep anomaly detection method for network nodes, when the node reconstruction errors are obtained, the activation function tanh(·) or sigmoid(·) is selected as the prediction function of the structure matrix, and the attribute matrix is predicted by a graph convolutional neural network formed by stacking three neural network layers; the reconstruction errors of the structure information and the attribute information are then obtained from the structure matrix and the attribute matrix together with their respective prediction results.

Further, the similarity between node vectors is used to measure the reconstruction errors between the node structure matrix and the attribute matrix on the one hand and their corresponding prediction results on the other.

Further, a Gaussian mixture model is adopted as the probability distribution model for detecting node anomalies in the attribute network, and the density estimation of the attribute information and the structure information in the attribute network is completed by solving the Gaussian mixture model; anomalous nodes are obtained according to the probability distribution in the density estimation result.

Further, when the Gaussian mixture model is solved, the joint embedding vector of a node is taken as input, the probability that the node belongs to each sub-distribution of the Gaussian mixture model is taken as output, and a multi-layer perceptron is used to solve the model.

Further, in the density-estimation-based deep anomaly detection method for network nodes, the density estimation process of the node attribute information is expressed as follows:

the density estimation process of the node structure information is expressed as follows:

where $z_x$ and $z_a$ denote the representation vectors of the attribute information and the structure information of an attribute network node, respectively, $z_{xr}$ and $z_{ar}$ denote the reconstruction errors of the attribute information and the structure information, respectively, and MLP(·) denotes the multi-layer perceptron used for the solution.
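As a rough illustration of the solution step just described, the following sketch passes one node's concatenated vector through a small multi-layer perceptron and applies a softmax, so that the output can be read as the probability of the node belonging to each GMM sub-distribution. The layer sizes, the tanh hidden activation, the untrained random weights and all function names are assumptions made for the example, not details taken from the patent; plain Python lists are used only to keep the sketch dependency-free.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(vec, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, biases)]

def mlp_membership(z, params):
    """Hypothetical two-layer MLP: tanh hidden layer, softmax over K sub-distributions."""
    W1, b1, W2, b2 = params
    hidden = [math.tanh(v) for v in dense(z, W1, b1)]
    return softmax(dense(hidden, W2, b2))

def random_params(d_in, d_hidden, K, seed=0):
    """Untrained placeholder weights; in the method these would be learned."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
    W2 = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(K)]
    return W1, [0.0] * d_hidden, W2, [0.0] * K

z_x = [0.3, -0.1, 0.7]  # assumed attribute representation vector of one node
z_xr = [0.25, 0.9]      # assumed node-level reconstruction errors
z = z_x + z_xr          # concatenation used as the MLP input
gamma = mlp_membership(z, random_params(len(z), d_hidden=4, K=3))
print(gamma)
```

The same function could be applied to the concatenation of $z_a$ and $z_{ar}$ for the structure information; only the input vector changes.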

Further, the method constructs a deep anomaly detection model framework, uses it to obtain the joint embedding vectors and the node reconstruction errors, and performs network node anomaly detection. The deep anomaly detection model framework comprises: a graph convolutional neural network for obtaining the joint embedding vectors, a probability distribution model for detecting node anomalies, and an information reconstruction network for obtaining the node reconstruction errors. The objective loss function of the model framework is expressed as:

where N denotes the number of samples; $\theta_e$, $\theta_d$, $\theta_m$ denote the network parameters of the graph convolutional neural network, the probability distribution model and the information reconstruction network, respectively; $R_{A/X}$ denotes the structure information reconstruction error $R_A$ or the attribute information reconstruction error $R_X$, i.e. it takes the value $R_X$ or $R_A$; $\lambda_1$ and $\lambda_2$ are preset hyperparameters; and $E(z_i)$ denotes the energy of sample $z_i$,

where K denotes the number of Gaussian components in the Gaussian mixture model, d denotes the dimension of the node representation vector, and the remaining symbol denotes the j-th element on the main diagonal of the covariance matrix of the k-th Gaussian component.
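To make the roles of the sample energy and the covariance diagonal more concrete, the sketch below estimates GMM parameters from soft membership probabilities, evaluates an energy for each sample as the negative log of its mixture density, and computes a penalty that grows when diagonal covariance entries shrink towards zero. It is an illustration under stated assumptions, not the patent's formula: the covariance is taken to be diagonal to keep the arithmetic simple, the membership values are set by hand rather than produced by a trained network, and every function name is hypothetical.

```python
import math

def gmm_parameters(Z, gamma):
    """Estimate mixing weights, means and diagonal variances from soft memberships."""
    N, d, K = len(Z), len(Z[0]), len(gamma[0])
    phi, mu, var = [], [], []
    for k in range(K):
        w = [gamma[i][k] for i in range(N)]
        total = sum(w)
        phi.append(total / N)
        m = [sum(w[i] * Z[i][j] for i in range(N)) / total for j in range(d)]
        # small constant keeps every variance strictly positive
        v = [sum(w[i] * (Z[i][j] - m[j]) ** 2 for i in range(N)) / total + 1e-6
             for j in range(d)]
        mu.append(m)
        var.append(v)
    return phi, mu, var

def sample_energy(z, phi, mu, var):
    """Energy = negative log of the mixture density at z (diagonal covariance assumed)."""
    density = 0.0
    for p, m, v in zip(phi, mu, var):
        log_pdf = -0.5 * sum((a - b) ** 2 / s + math.log(2 * math.pi * s)
                             for a, b, s in zip(z, m, v))
        density += p * math.exp(log_pdf)
    return -math.log(density + 1e-12)

def covariance_penalty(var):
    """Sum of reciprocals of the diagonal covariance entries over all components."""
    return sum(1.0 / s for v in var for s in v)

# Two hand-made clusters plus one point far from both (all values are made up).
Z = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
     [3.0, 3.0], [3.1, 2.9], [2.9, 3.2],
     [10.0, -5.0]]
gamma = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3 + [[0.5, 0.5]]

phi, mu, var = gmm_parameters(Z, gamma)
energies = [sample_energy(z, phi, mu, var) for z in Z]
print(energies, covariance_penalty(var))
```

In this toy data the isolated point receives the highest energy, which is the behaviour a density-based detector relies on: low density under the fitted mixture means high energy.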

The invention has the beneficial effects that:

The invention performs structural anomaly node detection and attribute anomaly node detection separately on an attribute network. It uses a GCN to jointly embed the structure information and attribute information of the attribute network, reconstructs the network structure information by means of a hyperbolic tangent function, and reconstructs the network attribute information by means of a GCN. Each node's reconstruction error is concatenated with the node's own embedding vector to form the final vector representation of the node, on which a GMM performs density estimation to discover anomalous nodes. The method takes the relationship between structure information and attribute information in the attribute network into account and can clearly distinguish the type of an anomalous node, i.e. attribute anomaly or structural anomaly. Experimental verification on commonly used data sets further shows that the method has good robustness and practicability.

Description of the drawings:

FIG. 1 is a schematic diagram of a network node deep anomaly detection process based on density estimation in an embodiment;

FIG. 2 is a schematic diagram of a depth anomaly detection model framework in an embodiment;

FIG. 3 is a graph showing the sigmoid(·) function and the tanh(·) function in the embodiment;

FIG. 4 is a graph showing the variation of AIC with the GMM parameter K for different data sets in the embodiment.

The specific implementation mode is as follows:

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and technical solutions.

Research on anomalous node detection in attribute networks has wide application in practice, for example in social networks, cyberspace security and finance. Most existing methods for detecting anomalous nodes in attribute networks ignore the relation between the structure information and the attribute information of the network; some methods do consider this relation, but cannot clearly distinguish the type of an anomalous node, i.e. attribute anomaly or structural anomaly. An embodiment of the present invention, as shown in FIG. 1, provides a deep anomaly detection method for network nodes based on density estimation, which comprises the following steps:

S101, obtaining the node attribute matrix and the structure matrix of an attribute network, feeding the attribute matrix and the structure matrix as input to a graph convolutional neural network, and using the graph convolutional neural network to obtain joint embedding vectors of the node structure information and attribute information in the attribute network;

S102, reconstructing, from the joint embedding vectors, the node attribute matrix and the structure matrix that were input to the graph convolutional neural network, and obtaining the node reconstruction errors;

S103, detecting node anomalies in the attribute network by applying a probability distribution model to the joint embedding vectors and the node reconstruction errors.

The method performs structural anomaly node detection and attribute anomaly node detection separately on an attribute network. It uses a GCN to jointly embed the structure information and attribute information of the attribute network, reconstructs the structure information and the attribute information separately following the idea of detecting anomalous nodes through reconstruction error, and detects anomalous nodes by density estimation over each node's reconstruction error and its own embedding vector, so that the method is more robust and convenient to apply in practical scenarios.

For convenience of description, the attribute network is denoted $G = \langle V, E, X \rangle$, where $V = \{v_1, v_2, v_3, \ldots, v_n\}$ denotes the node set of the network G, $v_i$ denotes the i-th node, and $n = |V|$ denotes the total number of nodes in the network G. E denotes the set of network edges; if there is an edge between $v_i$ and $v_j$, it is denoted $e_{i,j}$. X denotes the node attribute matrix. The adjacency matrix of the network G is denoted A, where $A_{i,j} = 1$ indicates that there is an edge between $v_i$ and $v_j$, and $A_{i,j} = 0$ otherwise.
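The notation above can be made concrete with a small sketch that builds the adjacency matrix A from an edge list and pairs it with an attribute matrix X. The four-node network, its attribute values and the function name are invented for illustration, and the edges are assumed to be undirected, which the text does not state explicitly.

```python
def adjacency_matrix(n, edges):
    """Return the n x n adjacency matrix A of an undirected network.

    A[i][j] = 1 if nodes i and j share an edge, otherwise 0.
    """
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1  # assumption: edges are undirected, so A is symmetric
    return A

# Hypothetical attribute network with n = 4 nodes and 2 attributes per node.
edges = [(0, 1), (1, 2), (2, 3)]      # e_{0,1}, e_{1,2}, e_{2,3}
X = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0],
     [0.2, 0.8]]                      # one row of attributes per node
A = adjacency_matrix(len(X), edges)

for row in A:
    print(row)
```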

A deep anomaly detection model framework, DADDE, is constructed as shown in FIG. 2 and is used to obtain the joint embedding vectors and the node reconstruction errors and to perform network node anomaly detection. The framework comprises: a graph convolutional neural network for obtaining the joint embedding vectors, a probability distribution model for detecting node anomalies, and an information reconstruction network for obtaining the node reconstruction errors. Within the framework, node embedding vectors containing neighbor information are obtained by iterative computation with a graph convolutional neural network formed by stacking k neural network layers. The structure matrix A and the attribute matrix X of the attribute network are taken as input, and a GCN jointly embeds the structure information and the attribute information. The attribute information of a node serves as its initial feature representation; in each round of GCN computation, which relies on the structure matrix A, the new node features absorb information from the node's first-order neighbors, so after n rounds the node features contain information from the node's n-th-order neighbors. Once the node embedding vectors are obtained, the input structure matrix A and attribute matrix X are each reconstructed from them. The two matrices have different data characteristics: the values in the attribute matrix X are continuous, so a GCN network is chosen to reconstruct the attribute information, whereas the values in the structure matrix A can only be 0 or 1, so a sigmoid function is chosen to reconstruct the structure information.
Because the information contained in a node's own representation vector helps in judging whether the node is anomalous, in the embodiment of the present invention the node's reconstruction error is concatenated with its representation vector to form the final representation vector of the node. Based on this final representation vector, anomalous nodes can be detected with a density estimation method: the data points are first assumed to follow a specific probability distribution model, the best-fitting parameters of the model are obtained by maximizing the likelihood function, and the nodes that deviate markedly from the model, i.e. the anomalous nodes, are then identified.

An attribute network contains both network structure information and node attribute information. In most cases these two kinds of information are not isolated from each other: the structure information of the network and the attribute information of the nodes influence one another. For example, in a social network, attributes such as a user's hobbies, gender and age influence the user's online friend-making behavior and thereby the evolution of the network structure; conversely, a user's friends in the social network also influence attributes such as the user's hobbies to some extent. Therefore, the relation between the two must not be severed during network embedding: embedding each kind of information separately yields embedding vectors that preserve only one side of the information, which easily leads to overfitting, ignores the other side, and does not match the actual situation of attribute networks. For this reason, the embodiment uses a GCN to jointly embed the structure information and the attribute information of the attribute network.

From a mathematical point of view, the GCN extends the convolution operation to the spectral domain of network data and computes hierarchical latent representation vectors of the network by means of a spectral convolution function. From the perspective of a neural network, the GCN is formed by stacking k neural network layers, where k determines the order of neighbor information contained in the final embedding vector when the GCN performs network embedding. The propagation between layers in the GCN can be represented by the following formula:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right) \tag{1}$$

where $H^{(l)}$ is the input matrix of the l-th convolutional layer and $H^{(0)} = X$, in which $H^{(0)}$ denotes the input matrix of the input layer and X denotes the node attribute matrix of the attribute network; $\tilde{A} = A + I$, where A is the adjacency matrix of the network and I is the identity matrix; the matrix $\tilde{D}$ is the degree matrix of $\tilde{A}$, i.e. $\tilde{D}_{i,i} = \sum_j \tilde{A}_{i,j}$; $W^{(l)}$ is the parameter matrix learned by the l-th convolutional layer, which determines how strongly neighbor node information influences the node's own information; and $\sigma(\cdot)$ is a nonlinear activation function, for which the ReLU function, $\mathrm{ReLU}(x) = \max(0, x)$, can be used.

The computation of one GCN layer can be divided into two stages: information aggregation and information activation. In formula (1), the operation of the information aggregation stage is $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}$. At this stage, the attribute information of a node and that of its direct neighbors are aggregated, and the parameter matrix $W^{(l)}$ compresses (or expands) the feature dimension; because this stage is implemented by matrix multiplication, it is linear. The information activation stage applies the nonlinear activation function $\sigma(\cdot)$ to the result of information aggregation, and the nonlinear transformation introduced in this way is the key to the GCN's ability to learn nonlinear dependencies in the data.

Two points about formula (1) should be noted. First, for a given attribute network the structure matrix A and the attribute matrix X are fixed, so the term $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ in the formula can be computed once before model training, which reduces computation time during training. Second, the structure matrix A in the GCN is shared by all layers, and the parameter matrix $W^{(l)}$ is shared by all nodes when the l-th layer aggregates neighbor information. In the embodiment of the present disclosure, the GCN used may be formed by stacking three neural network layers, so that the final node representation vector takes into account information from neighbors within the third order of the node; the computation can be expressed as:

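$$H^{(1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X W^{(0)}\right)$$

$$H^{(2)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(1)} W^{(1)}\right)$$

$$Z = H^{(3)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(2)} W^{(2)}\right)$$

where $\tilde{A} = A + I$ and $\tilde{D}$ is its degree matrix, as in formula (1); $W^{(0)} \in \mathbb{R}^{h_0 \times h_1}$, in which $h_0$ is the feature dimension of the attribute matrix X and $h_1$ is the feature dimension of the first hidden layer; similarly, $W^{(1)} \in \mathbb{R}^{h_1 \times h_2}$ and $W^{(2)} \in \mathbb{R}^{h_2 \times h_3}$;

$H^{(i)}$, $i \in \{1, 2, 3\}$, is the node feature matrix after the i-th layer of computation; $H^{(3)}$ is the final embedding vector Z; $h_3$ is the dimension of the final embedding vector; and the feature vectors in $H^{(3)}$ contain information from the third-order neighbors of the node.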
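The three-layer embedding described above can be sketched as follows. The normalized adjacency matrix (adjacency plus identity, scaled by the inverse square roots of the degrees) is computed once and reused by every layer, and each layer aggregates neighbor features before applying ReLU. The weights here are random and untrained, the layer widths are arbitrary, and the helper names are assumptions for illustration; plain Python lists stand in for an array library to keep the sketch dependency-free.

```python
import math
import random

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def normalized_adjacency(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2), where D~ is the degree matrix of A + I."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def relu(M):
    return [[max(0.0, v) for v in row] for row in M]

def gcn_layer(A_norm, H, W):
    """One layer: linear aggregation (A_norm @ H @ W) followed by ReLU activation."""
    return relu(matmul(matmul(A_norm, H), W))

def gcn_embed(A, X, dims, seed=0):
    """Stack len(dims) - 1 layers; dims = [h0, h1, h2, h3] gives three layers."""
    rng = random.Random(seed)
    A_norm = normalized_adjacency(A)  # fixed for a given network, so computed once
    H = X
    for h_in, h_out in zip(dims[:-1], dims[1:]):
        # untrained placeholder weights; training would learn these
        W = [[rng.uniform(-0.5, 0.5) for _ in range(h_out)] for _ in range(h_in)]
        H = gcn_layer(A_norm, H, W)
    return H

A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
Z = gcn_embed(A, X, dims=[2, 4, 4, 3])
print(Z)
```

With three layers, each row of Z depends on nodes up to three hops away, which is the sense in which the embedding carries third-order neighbor information.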

When the GCN performs network embedding, the influence of a node's higher-order neighbors is taken into account, so even if the attribute information of some nodes is missing it can be inferred from the attribute information of their neighbors, which alleviates the challenge of data sparsity to some extent. In addition, the GCN applies the nonlinear ReLU activation function when aggregating neighbor information, so it can learn nonlinear dependencies in the data, and aggregating attribute information on the basis of the structure matrix A takes the dependency between the structure information and the attribute information of the attribute network into account. Joint embedding of the structure information and attribute information of an attribute network can therefore be achieved with a GCN.

Further, in the embodiment of the present disclosure, when the node reconstruction errors are obtained, the activation function tanh(·) or sigmoid(·) is selected as the prediction function of the structure matrix, and the attribute matrix is predicted by a graph convolutional neural network formed by stacking three neural network layers; the reconstruction errors of the structure information and the attribute information are obtained from the structure matrix and the attribute matrix together with their respective prediction results. The similarity between node vectors is used to measure the reconstruction errors between the node structure matrix and the attribute matrix and their corresponding prediction results.

The difference between the original data and the estimated data, also called the "reconstruction error", is an important indicator of how anomalous a data point is. For convenience of description, the predicted structure information is denoted $\hat{A}$, the predicted attribute information is denoted $\hat{X}$, the structure information reconstruction error is denoted $R_A$, and the attribute information reconstruction error is denoted $R_X$.

The problem of predicting network structure information can be summarized as follows: given the embedding vectors $z_i$ and $z_j$ of nodes $v_i$ and $v_j$, determine whether there is an edge between $v_i$ and $v_j$, i.e. whether $A_{i,j}$ equals 1. Inspired by the work of Ding et al. [7], an activation function $f(\cdot)$ is used here as the prediction function of the structure information A, giving the predicted structure matrix $\hat{A}$, namely:

$$\hat{A} = f\left(Z Z^{\mathsf{T}}\right)$$

Instead of the sigmoid(·) activation function, the embodiment of the present invention may use the hyperbolic tangent activation function tanh(·); the final form of the structure information prediction function is shown in formula (6):

$$\hat{A} = \tanh\left(Z Z^{\mathsf{T}}\right) \tag{6}$$

Referring to FIG. 3, the choice of tanh(·) or sigmoid(·) as the prediction function of the structure information A is determined by a special property of the matrix A: its elements $A_{i,j}$ take values in {0, 1}, while |tanh(·)| takes values in [0, 1) and sigmoid(·) takes values in (0, 1), so the absolute value of either function always lies within the interval [0, 1]. The prediction of the element values in A can therefore be reduced to finding a suitable threshold $p_{th}$ such that the predicted element is 1 when $\left| f\left(z_i^{\mathsf{T}} z_j\right) \right| \geq p_{th}$, and 0 otherwise.

The most important difference between the sigmoid(·) and tanh(·) functions is that tanh(·) is symmetric about the origin of coordinates. Over the real numbers, the dot product of two vectors is equally likely to be positive or negative, i.e. it is distributed symmetrically about 0, so using the absolute value of the origin-symmetric tanh(·) function as the probability of an edge between two nodes is more consistent with the actual situation.
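The thresholding idea described above can be sketched as follows: the dot product of two embedding vectors is passed through tanh (or sigmoid), and an edge is predicted when the absolute value of the result reaches a threshold. The threshold value 0.5, the use of "greater than or equal", the exclusion of self-loops, the toy embeddings and the function names are all assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_structure(Z, p_th=0.5, activation=math.tanh):
    """Predict a 0/1 structure matrix from embedding vectors Z.

    An edge (i, j) is predicted when |activation(z_i . z_j)| >= p_th.
    Self-loops are skipped, which is a choice made for this sketch.
    """
    n = len(Z)
    A_pred = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            score = sum(a * b for a, b in zip(Z[i], Z[j]))  # dot product z_i . z_j
            if abs(activation(score)) >= p_th:
                A_pred[i][j] = 1
    return A_pred

# Toy embeddings: nodes 0 and 1 point in similar directions, node 2 is nearly zero.
Z = [[1.0, 0.0],
     [0.9, 0.1],
     [0.0, 0.05]]
A_pred = predict_structure(Z)
for row in A_pred:
    print(row)
```

Note one consequence of the two ranges: with sigmoid a dot product of 0 maps to 0.5, whereas with tanh it maps to 0, so the same threshold behaves differently under the two functions.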

Network embedding with the GCN can be regarded as reducing the dimension of each node's attribute information with the help of the adjacency matrix A; predicting node attribute information from the final node embedding vectors can accordingly be regarded as expanding the dimension of each node embedding vector. On this basis, the embodiment of the present application predicts the node attribute information with a GCN. As in the attribute network embedding process, the attribute information prediction network is realized with a three-layer GCN network, but the input layer of the embedding process serves as the output layer of the prediction network, and the output layer of the embedding process serves as the input layer of the prediction network. The computation of the attribute information prediction network is as follows:

Based on the prediction matrix $\hat{X}$ of the attribute information and the prediction matrix $\hat{A}$ of the structure information, the structure information reconstruction error $R_A$ can be expressed through the Frobenius norm of $A - \hat{A}$, and the attribute information reconstruction error $R_X$ through the Frobenius norm of $X - \hat{X}$, where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix.

It should be noted that $R_A$ and $R_X$ refer to the sum of the reconstruction errors of the structure information and of the attribute information, respectively, over all nodes of the whole network. They measure reconstruction error at the network level and can be used as part of the loss function for training the GCN network, so that a more accurate embedding vector Z is obtained. When the reconstruction error is concatenated with the embedding vector Z to generate the final representation vector of a node, the reconstruction error meant is the node-level one, measured by a similarity between vectors, such as Euclidean distance, Manhattan distance, cosine similarity or Jaccard similarity. In this embodiment, $z_{xr}$ and $z_{ar}$ denote the node-level reconstruction errors of the attribute information and the structure information, respectively; when several different vector similarity measures are used, $z_{xr}$ and $z_{ar}$ are multi-dimensional.
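The distinction between network-level and node-level reconstruction error can be illustrated with a short sketch. It computes the Frobenius norm of the difference between a matrix and its prediction, then a two-dimensional node-level error (Euclidean distance and cosine similarity per row), and finally concatenates that error onto each node's embedding. The choice of these two measures, the toy matrices and the function names are assumptions for illustration; whether the network-level error is squared is not recoverable from the text, so the plain norm is shown.

```python
import math

def frobenius_error(M, M_pred):
    """Network-level error: Frobenius norm of M - M_pred."""
    return math.sqrt(sum((a - b) ** 2
                         for row, row_pred in zip(M, M_pred)
                         for a, b in zip(row, row_pred)))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)

def node_errors(M, M_pred):
    """Node-level error: one [euclidean, cosine] pair per row (i.e. per node)."""
    return [[euclidean(u, v), cosine_similarity(u, v)] for u, v in zip(M, M_pred)]

def final_representation(Z, errors):
    """Concatenate each node's embedding with its node-level reconstruction error."""
    return [z + e for z, e in zip(Z, errors)]

X = [[1.0, 0.0], [0.0, 1.0]]        # toy attribute matrix
X_pred = [[1.0, 0.0], [1.0, 0.0]]   # toy prediction: node 1 is reconstructed badly
Z = [[0.1, 0.2], [0.3, 0.4]]        # toy embeddings

R_X = frobenius_error(X, X_pred)
z_xr = node_errors(X, X_pred)
final = final_representation(Z, z_xr)
print(R_X, z_xr, final)
```

Because two measures are used, each node's error vector has two components, which is the multi-dimensional case mentioned in the text.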

Further, in the embodiment of the scheme, a Gaussian mixture model is adopted as a probability distribution model for detecting node abnormality in the attribute network, and the density estimation of attribute information and structural information in the attribute network is completed by solving the Gaussian mixture model; and acquiring abnormal nodes according to the probability distribution in the density estimation result.

Much existing work detects abnormal nodes by examining reconstruction errors and has achieved good results. However, when the representation learning model is highly complex, or the data contain noise with a complex structure, a large number of abnormal samples may exhibit normal-level reconstruction errors; this challenge can be addressed by considering both the embedding vector and the reconstruction error of each node. Accordingly, in this embodiment the node's embedding vector concatenated with its reconstruction error is used as the final node representation vector. Density estimation is a common approach to finding abnormal nodes from low-dimensional embedding vectors. Density estimation means: given a finite number of sample points $x_1, x_2, x_3, \ldots, x_n$, model the probability distribution $p(x)$ of the variable $x$; outliers are the sample points in the set that have a low probability of occurring.

In this embodiment, the probability distribution model used to model the node embedding vectors may be a Gaussian Mixture Model (GMM), a common model for the joint distribution of multiple random variables. A GMM is a combination of K Gaussian distributions whose probability distribution has the form $p(x \mid \Theta) = \sum_{k=1}^{K} \alpha_k N(x; \mu_k, \sigma_k)$, where $\alpha_k$ is the probability coefficient of the $k$-th sub-distribution, $N(x; \mu_k, \sigma_k)$ is a Gaussian distribution with expectation $\mu_k$ and standard deviation $\sigma_k$, and $\Theta = (\alpha_1, \alpha_2, \ldots, \alpha_K; \mu_1, \mu_2, \ldots, \mu_K; \sigma_1, \sigma_2, \ldots, \sigma_K)$. Density estimation is the process of solving for the GMM parameters $\Theta$, which is usually done by alternating updates with the Expectation-Maximization (EM) algorithm. However, alternating solution separates the attribute network embedding step from the density estimation based on the node embedding vectors, which makes joint optimization during training difficult. To meet this challenge, the GMM can be solved with a single neural network, with better results. Therefore, in this embodiment a Multi-Layer Perceptron (MLP) is used to solve the GMM: the MLP takes the embedding vector of a node as input and outputs the probability that the node belongs to each sub-distribution of the GMM. The density estimation process for the network attribute information can be expressed as:

the density estimation process for the network structure information can be expressed as:

where $z_x$ and $z_a$ denote the final representation vectors of the network attribute information and structure information, respectively, and $z_{xr}$ and $z_{ar}$ denote the reconstruction errors of the attribute information and the structure information, respectively. Given $N$ samples and the probability that each belongs to sub-distribution $k$, $1 \le k \le K$,

the parameters of the GMM can be estimated by:

where

is the vector of probabilities that the final embedding vector $z_i$ of node $v_i$ belongs to each sub-distribution, and

are respectively the probability coefficient, expectation, and covariance of the $k$-th sub-distribution of the GMM, each sub-distribution being regarded as a multivariate Gaussian distribution. Based on the estimated GMM parameters, the sample energy can be calculated by:

where $|\cdot|$ denotes the determinant of a square matrix. The sample energy reflects how likely a sample is to occur: the greater the sample energy, the lower the probability of the sample.
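The parameter estimation and sample energy steps can be sketched as follows. The formulas themselves are not recoverable from the text, so this sketch assumes the weighted-moment estimates and negative log-likelihood energy commonly used when a neural network predicts GMM memberships: the probability coefficient is the mean membership, the expectation and covariance are membership-weighted moments, and the energy is the negative logarithm of the mixture density. All function names and sample values are hypothetical, and the membership probabilities are supplied directly rather than produced by an MLP.

```python
import math

def estimate_gmm(Z, gamma):
    """Estimate coefficients, means and covariances from soft memberships.

    Z: N samples of dimension d. gamma[i][k]: probability that sample i
    belongs to sub-distribution k (each row sums to 1).
    """
    n, d, K = len(Z), len(Z[0]), len(gamma[0])
    phi, mu, cov = [], [], []
    for k in range(K):
        w = [gamma[i][k] for i in range(n)]
        w_sum = sum(w)
        phi.append(w_sum / n)
        m = [sum(w[i] * Z[i][j] for i in range(n)) / w_sum for j in range(d)]
        c = [[sum(w[i] * (Z[i][a] - m[a]) * (Z[i][b] - m[b])
                  for i in range(n)) / w_sum
              for b in range(d)] for a in range(d)]
        mu.append(m)
        cov.append(c)
    return phi, mu, cov

def cholesky(c):
    """Lower-triangular L with L L^T = c (c must be positive definite)."""
    d = len(c)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = c[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def sample_energy(z, phi, mu, cov):
    """Assumed form: E(z) = -log sum_k phi_k * N(z; mu_k, cov_k)."""
    d = len(z)
    density = 0.0
    for p, m, c in zip(phi, mu, cov):
        L = cholesky(c)
        diff = [z[j] - m[j] for j in range(d)]
        # Forward substitution solves L y = diff, so diff^T c^{-1} diff = y . y
        y = []
        for i in range(d):
            y.append((diff[i] - sum(L[i][t] * y[t] for t in range(i))) / L[i][i])
        quad = sum(v * v for v in y)
        log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))  # log|c|
        log_norm = 0.5 * (d * math.log(2.0 * math.pi) + log_det)  # log sqrt|2 pi c|
        density += p * math.exp(-0.5 * quad - log_norm)
    return -math.log(density)

# Two loose clusters with hypothetical soft memberships.
Z = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
gamma = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
phi, mu, cov = estimate_gmm(Z, gamma)
e_typical = sample_energy([0.3, 0.3], phi, mu, cov)
e_unusual = sample_energy([3.0, -3.0], phi, mu, cov)
```

In this toy setting the point lying away from both clusters receives the larger energy, which matches the statement that higher energy corresponds to lower probability.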

Given $N$ samples $x_1, x_2, x_3, \ldots, x_N$, anomaly detection from the perspective of attribute information and from the perspective of structure information is carried out separately, but the loss function has the same form in both cases and can therefore be expressed as:

where $\theta_e$ denotes the parameters of the GCN in the attribute network embedding step, $\theta_m$ denotes the parameters of the MLP used to solve for the GMM parameters, and $\theta_d$ denotes the parameters of the information reconstruction network in the reconstruction error calculation step: when the attribute information is reconstructed, $\theta_d$ refers to the parameters of the GCN used for attribute information reconstruction; when the structure information is reconstructed, $\theta_d$ refers to the decision threshold $p_{th}$ in the tanh(·) function used for structure information reconstruction. Equation (14) consists of three parts. In the first part, $R_{A/X}$ denotes the structure information reconstruction error $R_A$ or the attribute information reconstruction error $R_X$: when anomalies are detected with respect to attribute information, $R_{A/X}$ takes the value $R_X$; when anomalies are detected with respect to structure information, it takes the value $R_A$. Intuitively, the smaller the reconstruction error, the more of the node's key information the embedding vector retains and the more accurately it characterizes the node. In the second part, $E(z_i)$ denotes the energy of a sample; minimizing the sample energy maximizes the probability of observing the samples, which makes the MLP's estimate of the GMM parameters more accurate. The third part

is introduced to address the singularity problem of the GMM: when the diagonal entries of a covariance matrix approach 0, a trivial solution may result, so that the parameters of the GMM are no longer updated. Here, setting

avoids the problems caused by diagonal entries of the covariance matrix that are too small. The parameters $\lambda_1$ and $\lambda_2$ in equation (14) are hyperparameters specified manually; in the experiments, $\lambda_1 = 0.1$ and $\lambda_2 = 0.001$ can be used.
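The way the three parts combine can be illustrated with a small sketch. The exact form of equation (14) is not recoverable from the text, so the sketch assumes a weighted sum: the reconstruction error, plus $\lambda_1$ times the mean sample energy, plus $\lambda_2$ times a penalty given by the sum of reciprocals of the covariance diagonal entries. That penalty form is an assumption chosen because it grows without bound as a diagonal entry approaches 0; the function names and input values are hypothetical.

```python
def covariance_penalty(covs):
    """Assumed penalty: sum of reciprocals of all covariance diagonal entries."""
    return sum(1.0 / c[j][j] for c in covs for j in range(len(c)))

def objective(recon_error, energies, covs, lambda1=0.1, lambda2=0.001):
    """Assumed three-part objective: reconstruction + energy + penalty."""
    n = len(energies)
    return (recon_error
            + lambda1 / n * sum(energies)
            + lambda2 * covariance_penalty(covs))

# Hypothetical values: one sub-distribution with a 2x2 covariance matrix.
loss = objective(2.0, [1.0, 3.0], [[[0.5, 0.0], [0.0, 0.25]]])
```

Shrinking a diagonal entry from 0.25 to 0.0025 raises the penalty term from 0.006 to 0.402 in this example, which is the behavior the third part is meant to discourage.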

To verify the effectiveness of the scheme, it is evaluated below on public datasets:

Three datasets widely used in existing research in this field are selected, as shown in Table 1. Because these datasets contain no anomalous data, anomalies are injected into them for experimental verification. A commonly used anomaly injection method is selected: a method of changing the topology of part of the network nodes

TABLE 1 Dataset statistics

	BlogCatalog	Flickr	ACM
Number of nodes	5196	7575	16484
Number of edges	171743	239738	71980
Total number of abnormal nodes	270	390	810
Proportion of abnormal nodes	5.2%	5.1%	4.9%

The experiments use precision P, recall R, and the F1 value, which are commonly used in data classification, as evaluation metrics. Taking the anomaly class as the positive class, they are computed from the confusion matrix of the classification results, with TP, FP, and FN denoting the numbers of true positives, false positives, and false negatives: $P = \frac{TP}{TP + FP}$, $R = \frac{TP}{TP + FN}$, $F1 = \frac{2PR}{P + R}$.
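As a minimal sketch, the three metrics can be computed from ground-truth and predicted labels, with label 1 standing for the anomaly (positive) class. The function name and the label vectors are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 with label 1 as the positive (anomaly) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Three true anomalies; the detector finds two of them and raises one false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
```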

In the experiments, the learning rate of the DADDE model is set to 0.0001. The GCN used in the attribute network embedding step has a three-layer structure, with the numbers of neurons in its layers being

The GCN used to predict node attribute information in the reconstruction error calculation also has a three-layer structure, with the numbers of neurons in its layers being

which satisfy

Node reconstruction errors are measured with Manhattan distance and cosine similarity: given a sample $x$ and its predicted value $x'$, the Manhattan distance is $\sum_i |x_i - x'_i|$ and the cosine similarity is $\frac{\sum_i x_i x'_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i {x'_i}^2}}$.

In the density estimation step, the number of GMM sub-distributions is denoted K, and the MLP is a three-layer network whose numbers of neurons per layer are denoted

Dropout is set to 0.2, and the Adam algorithm is used for model optimization during training. For DADDE, the top 3% of nodes by anomaly degree are marked as abnormal nodes; for the DOMINANT model, the top 6% are marked as abnormal. Because the DOMINANT model detects attribute-abnormal nodes and structure-abnormal nodes at the same time, and the two kinds of abnormal nodes are equal in number in the datasets, its threshold should be twice that of the DADDE model. The numbers of network-layer neurons of DADDE and the values of K are shown in Table 2:

TABLE 2 Numbers of network-layer neurons of DADDE

In the experiments, K can be selected according to the Akaike information criterion (AIC), a standard that balances the complexity of a statistical model against its goodness of fit; the smaller the AIC value, the better the model. Fig. 4 shows how the AIC value varies with the GMM parameter K on the structure information data and attribute information data of BlogCatalog, Flickr, and ACM. As Fig. 4 shows, the structure information data of BlogCatalog are fitted best at $K = 3$, and the remaining data are fitted best at $K = 2$.
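Selecting K by AIC can be sketched as follows, using the standard definition AIC = 2p - 2 ln L, where p is the number of free parameters and L is the maximized likelihood. The parameter count assumes a full-covariance GMM in d dimensions. The log-likelihood values are hypothetical placeholders, not results from the experiments, and how the likelihood was obtained in the experiments is not stated in the text.

```python
def gmm_num_params(k, d):
    """Free parameters of a k-component, d-dimensional full-covariance GMM."""
    weights = k - 1                      # coefficients sum to 1
    means = k * d
    covariances = k * d * (d + 1) // 2   # symmetric matrices
    return weights + means + covariances

def aic(log_likelihood, num_params):
    """Akaike information criterion: lower is better."""
    return 2 * num_params - 2 * log_likelihood

def select_k(log_likelihoods, d):
    """log_likelihoods maps each candidate K to its maximized log-likelihood."""
    scores = {k: aic(ll, gmm_num_params(k, d))
              for k, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores

# Hypothetical log-likelihoods for K = 1..4 on two-dimensional data.
best_k, scores = select_k({1: -520.0, 2: -450.0, 3: -447.0, 4: -446.0}, d=2)
```

In this toy example the gain in likelihood beyond K = 2 is too small to offset the added parameters, so K = 2 is selected.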

Structural anomalies and attribute anomalies in the datasets were evaluated separately; the results are shown in Table 3:

TABLE 3 precision P, recall R and F1 values corresponding to different anomaly detection methods

A dash "-" in the table indicates that the corresponding anomaly detection method cannot be applied to the corresponding data type; for example, the SCAN method detects anomalies from network structure information and cannot be applied to network attribute data. As Table 3 shows, the DADDE model achieves the best result on 4 of the 6 data categories, while the DOMINANT model and OC-SVM each achieve the best result on 1 of the 6. The precision P, recall R, and F1 values of SCAN and LOF are unsatisfactory, and neither achieves the best result on any dataset. The kernel-based OC-SVM outperforms LOF, a conventional density-based anomaly detection method, but because of data sparsity it has difficulty capturing the key features of network node attribute information, which limits its anomaly detection performance. The DADDE and DOMINANT models essentially rank the network nodes from high to low by anomaly degree and identify abnormal nodes with a manually specified threshold. In application scenarios that require higher recall, the recall of the model can therefore be raised by lowering the decision threshold; conversely, its precision can be raised by raising the decision threshold.
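The threshold trade-off can be illustrated with a short sketch that ranks nodes by an anomaly score and flags a chosen fraction of them. Flagging a larger fraction corresponds to lowering the score cutoff. The scores, the set of true anomalies, and the function names are hypothetical.

```python
def flag_top_fraction(scores, fraction):
    """Return the indices of the highest-scoring nodes (at least one node)."""
    n_flag = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_flag])

def precision_recall(flagged, true_anomalies):
    """Precision and recall of a flagged set against known anomalies."""
    tp = len(flagged & true_anomalies)
    return tp / len(flagged), tp / len(true_anomalies)

# Hypothetical anomaly scores for 10 nodes; nodes 0, 3 and 7 are anomalous.
scores = [9.1, 1.2, 0.8, 7.5, 1.0, 3.9, 0.5, 3.1, 1.1, 0.9]
true_anomalies = {0, 3, 7}

strict = flag_top_fraction(scores, 0.2)   # higher score cutoff
loose = flag_top_fraction(scores, 0.4)    # lower score cutoff
pr_strict = precision_recall(strict, true_anomalies)
pr_loose = precision_recall(loose, true_anomalies)
```

Here the stricter setting flags only true anomalies but misses one, while the looser setting finds all three at the cost of one false alarm.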

Unless specifically stated otherwise, the relative steps, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the present invention.

Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall all fall within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A network node depth anomaly detection method based on density estimation is characterized by comprising the following contents:

2. The method of claim 1, wherein the attribute network is represented as $G = \langle V, E, X \rangle$, where V represents the set of nodes in the attribute network G, E represents the set of edges in the attribute network, X represents the node attribute matrix of the attribute network, and A represents the structure matrix of the attribute network.

3. The method according to claim 1 or 2, wherein when obtaining the joint embedding vector, a node embedding vector containing neighbor information is obtained by iterative computation using a convolutional neural network formed by stacking k neural network layers.

4. The method for detecting the depth anomaly of the network node based on the density estimation according to the claim 3, characterized in that a graph convolution neural network with three neural network layers stacked is adopted, and the iterative computation process is expressed as follows:

wherein,

5. The method for detecting the depth anomaly of the network node based on the density estimation according to claim 1, wherein in the acquisition of the reconstruction error of the node, an activation function tanh(·) or sigmoid(·) is selected as a prediction function of a structural matrix, and the prediction of the attribute matrix is performed by means of a graph convolution neural network with three neural network layers stacked; and acquiring reconstruction errors of the structure information and the attribute information according to the structure matrix and the attribute matrix and the prediction results of the structure matrix and the attribute matrix.

6. The method for detecting the depth anomaly of the network node based on the density estimation according to the claim 1 or 5, characterized in that the similarity between the node vectors is adopted to measure the reconstruction error between the node structure matrix and the attribute matrix and the corresponding prediction results.

7. The method for detecting the depth anomaly of the network node based on the density estimation according to the claim 1, characterized in that a Gaussian mixture model is adopted as a probability distribution model for detecting the anomaly of the node in the attribute network, and the density estimation of the attribute information and the structure information in the attribute network is completed by solving the Gaussian mixture model; and acquiring abnormal nodes according to the probability distribution in the density estimation result.

8. The method as claimed in claim 7, wherein in the solving of the Gaussian mixture model, the joint embedded vectors of the nodes are used as input, the probability that the nodes belong to each sub-distribution of the Gaussian mixture model is used as output, and the model solving is performed by using a multilayer perceptron.

9. The method for detecting the depth anomaly of the network node based on the density estimation according to claim 8, characterized in that the density estimation process of the node attribute information is expressed as:

the density estimation process of the node structure information is expressed as:

where $z_x$ and $z_a$ respectively denote the representation vectors of the attribute information and the structure information of the attribute network nodes, $z_{xr}$ and $z_{ar}$ respectively denote the reconstruction errors of the attribute information and the structure information, and MLP(·) denotes solving with the multi-layer perceptron.

10. The method for detecting the depth anomaly of the network node based on the density estimation as claimed in claim 1, wherein a depth anomaly detection model framework is constructed, and the depth anomaly detection model framework is utilized to obtain a joint embedding vector and a node reconstruction error and carry out network node anomaly detection; wherein the depth anomaly detection model framework comprises: a graph convolution neural network for acquiring the joint embedded vector, a probability distribution model for detecting node abnormity, and an information reconstruction network for acquiring node reconstruction errors; the objective loss function of the model framework is expressed as:

where $N$ represents the number of samples; $\theta_e$, $\theta_m$, and $\theta_d$ respectively represent the network parameters of the graph convolutional neural network, the probability distribution model, and the information reconstruction network; $R_{A/X}$ denotes the structure information reconstruction error $R_A$ or the attribute information reconstruction error $R_X$, taking the value $R_A$ or $R_X$ accordingly; $\lambda_1$ and $\lambda_2$ are preset hyperparameters; and $E(z_i)$ represents the energy of sample $z_i$,